
Ted Dunning is Chief Application Architect at MapR and has years of experience with machine learning and other big data solutions across a range of sectors. Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems. He built fraud detection systems for ID Analytics (later purchased by LifeLock), and he has 24 patents issued to date plus a dozen pending. Ted has a PhD in computing science from the University of Sheffield and is active in open source projects as a committer, PMC member, and mentor; he currently serves as a board member of the Apache Software Foundation. When he’s not doing data science, he plays guitar and mandolin. He also bought the beer at the first Hadoop user group meeting.
Blog Posts by Ted Dunning
May 17, 2017 | By Ted Dunning
Anomaly Detection in Telecommunications Using Complex Streaming Data | Whiteboard Walkthrough
The telecommunications industry is on the verge of a major transformation through the use of advanced analytics and big data technologies like the MapR Converged Data Platform. The MapR Guide to Big Data in Telecommunications is designed to help you understand...
March 08, 2017 | By Ted Dunning
Keeping Big Data Containers Lightweight
In this week’s Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, explains how to keep big data Docker containers light and agile by moving state into the MapR Converged Data Platform for large scale data persistence that goes beyond...
January 31, 2017 | By Ted Dunning
Anomaly Detection Using Metrics and Exception Logs | Whiteboard Walkthrough
In this week’s Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, talks about how you can use logs containing metrics and exceptions to detect anomalies in the behavior of a micro-service. For related material on this topic...
January 04, 2017 | By Ted Dunning
A Better Way to Build a Fraud Detector: Streaming Data and Microservices Architecture | Whiteboard Walkthrough
In this week’s Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, provides some pointers for building better machine learning models, including the advantages of data streams and microservices-style design in the example of a credit...
December 06, 2016 | By Ted Dunning
MapR: Converged Advantages in the Cloud | Whiteboard Walkthrough
In this week’s Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, describes advantages of MapR Converged Data Platform and how they work in the cloud. With files, tables and streams engineered into the same technology, MapR has...
December 06, 2016 | By Ted Dunning
MapR: Big Data in the Cloud | Whiteboard Walkthrough
Editor’s Note: Extend to the edge with MapR Orbit Cloud Suite. Go inside MapR Orbit now. In this Whiteboard Walkthrough, MapR Chief Application Architect, Ted Dunning, explains how special capabilities such as mirroring, bi-directional stream and table...
October 05, 2016 | By Ted Dunning
State vs. Flow Data Architecture in the Financial Sector | Whiteboard Walkthrough
In this Whiteboard Walkthrough, MapR’s Chief Application Architect, Ted Dunning, explains the move from state to flow and shows how it works in a financial services example. Ted describes the revolution underway in moving from a traditional system with...
July 27, 2016 | By Ted Dunning
Streaming Data: How to Move from State to Flow - Whiteboard Walkthrough (Part 2)
In this week’s Whiteboard Walkthrough Part II, Ted Dunning, Chief Application Architect at MapR, talks about the design freedom gained by adopting a micro-services architecture based on streaming data. When you move, one step at a time, from an old...
July 27, 2016 | By Ted Dunning
Key Requirements for Streaming Platforms: A Micro-Services Advantage - Whiteboard Walkthrough (Part 1)
In this week’s Whiteboard Walkthrough Part I, Ted Dunning, Chief Application Architect at MapR, explains the key capabilities required of a streaming platform in the context of micro-services and the advantages they offer. Note: This video describes...
August 26, 2015 | By Ted Dunning
Catching Long Tail Distribution – Whiteboard Walkthrough
Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, explains how long tail distribution distorts the appearance of data and how to detect it. Here's the unedited transcript: I'd like...
August 17, 2015 | By Ted Dunning
Some Important Streaming Algorithms You Should Know About
Ted Dunning, Chief Application Architect at MapR, presented a session titled “Some Important Streaming Algorithms You Should Know About” at the Spark Summit 2015 conference held in San Francisco. During the session, he highlighted some newer streaming...
August 12, 2015 | By Ted Dunning
Ahead of the Trend: What's Hot? What's Not? – Whiteboard Walkthrough
Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, shows you how to find out what will be popular tomorrow by using Zipf's Law. Here's the unedited transcription: Hi, I want to talk about...
July 29, 2015 | By Ted Dunning
How to Show a Mathematician That He Is Wrong – Whiteboard Walkthrough
Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, explains how to show a mathematician that he is wrong. Here's the unedited transcription: Hi - I'd like to talk to you about...
July 15, 2015 | By Ted Dunning
HDFS vs. MapR FS – 3 Numbers for a Superior Architecture – Whiteboard Walkthrough
Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, talks about the architectural differences between HDFS and MapR-FS that boil down to three numbers. Here's the transcript: Hi! I'd...
April 29, 2015 | By Ted Dunning
Better Anomaly Detection with the T-Digest – Whiteboard Walkthrough
Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, gets you up to speed on the t-digest, an algorithm you can add to any anomaly detector to set the number of alarms that you get as a percentage...
April 27, 2015 | By Ted Dunning
Revealed: Why Hadoop Uses Three Replicas
There is some real math behind the idea that you need 3x replication in Hadoop. The basic idea is that when a disk goes bad, you lose an entire stripe of storage. Assuming that the contents are spread around the cluster fairly evenly, the contents of...
April 01, 2015 | By Ted Dunning
Anomaly Detection with Poisson Distribution – Whiteboard Walkthrough
Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, gets you up to speed on anomaly detection, a simple and easy to implement technique to "figure out stuff that just happened but shouldn...
March 04, 2015 | By Ted Dunning
Time Series Databases in the Upside-down Internet – Whiteboard Walkthrough
Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, talks about how current trends are turning the internet upside down. He also talks about how this is leading to the requirements for very very...
January 05, 2015 | By Ted Dunning
Real-Time Processing: Why Interactive Queries Aren’t Good Enough
A number of people have been claiming lately that interactive responses to queries constitute real-time processing. For instance, Mike Olson has been quoted saying that interactive queries are what is needed for real-time processing. I like to start...
December 18, 2014 | By Ted Dunning
How Many Disks? Big or Small?
I commonly hear lots of questions about how many drives to use per node in a cluster. For a long time, the norm was to have 4-6 drives per node, but lately, I have been hearing more people suggest 12 drives. At MapR, we have been recommending 12 or 24...
November 08, 2013 | By Ted Dunning
Apache Drill Achieves 1st Milestone Release
The open source incubator project Apache Drill has just made its first release, a significant milestone on the road to graduating to a top-level Apache Software Foundation project. This is a big step that represents a lot of work by the Drill engineering...
October 17, 2013 | By Ted Dunning
Recommender Algorithms: How to Appropriately Use Offline Testing to Prequalify Real-Time Testing
I was recently asked if I was aware of any papers that talk about systematic exploration of parameters for Matrix Factorization (MF). Automatic grid search is pretty standard in this sort of area. In general, however, offline evaluation of recommender...
September 18, 2013 | By Ted Dunning
Sorting Daily Files with Larger Reference Files: It’s Easier than You Think
A person I know recently asked for advice about an interesting but very common scenario. He wanted to join a large reference file that didn’t get updated very often with a somewhat smaller file that was updated daily, but he wasn’t quite sure how to go...
August 29, 2013 | By Ted Dunning
Improvements in Machine Learning: Apache Mahout 0.8 Release
Machine learning with the open source project Apache Mahout just got better with the much anticipated new Mahout version 0.8, released on July 25, 2013. It’s leaner, with less-used features removed and some powerful new ones added, including improved...
August 21, 2013 | By Ted Dunning
Mahout and MapR: The Perfect Couple
We are often asked by potential customers if Apache Mahout™ integrates well with the MapR M7 Edition. The quick answer is, “Yes!” Mahout itself is extremely portable, and it easily connects with M7 where appropriate. The advantage of running Mahout...
October 19, 2012 | By Ted Dunning
Quick Start to Apache Drill Project
Apache Drill is coming together rapidly. Lots of progress is being made on multiple fronts as different groups start digging in and as the Apache infrastructure is fleshed out. The progress falls into several categories including community building, coding...
October 19, 2012 | By Ted Dunning
30,000 Times Faster – What Does That Really Mean?
I had the chance to co-present with my friend Chao Yuan of American Express at the New York City Open Analytics Meet Up last month. We discussed lightning-fast nearest neighbor algorithms and associated clustering capabilities. The clustering...
June 01, 2011 | By Ted Dunning
MapR Signs Apache Corporate Contributor License Agreement for Apache Foundation
We just announced our signing of the corporate contributor license agreement. This move is part of our exit from stealth status and means that we will be able to contribute more openly to Apache, other open source communities, and the general ecosystem...