Contributor: Ted Dunning

Ted Dunning is Chief Application Architect at MapR and has years of experience with machine learning and other big data solutions across a range of sectors. Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems. He built fraud detection systems for ID Analytics (later purchased by LifeLock), and he has 24 patents issued to date plus a dozen pending. Ted has a PhD in computing science from the University of Sheffield and is active in open source projects as a committer, PMC member, and mentor; he currently serves as a board member of the Apache Software Foundation. When he’s not doing data science, he plays guitar and mandolin. He also bought the beer at the first Hadoop user group meeting.

Blog Posts by Ted Dunning

May 17, 2017 | By Ted Dunning

Anomaly Detection in Telecommunications Using Complex Streaming Data | Whiteboard Walkthrough

The telecommunications industry is on the verge of a major transformation through the use of advanced analytics and big data technologies like the MapR Converged Data Platform. The MapR Guide to Big Data in Telecommunications is designed to help you understand...

March 08, 2017 | By Ted Dunning

Keeping Big Data Containers Lightweight

In this week’s Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, explains how to keep big data Docker containers light and agile by moving state into the MapR Converged Data Platform for large scale data persistence that goes beyond...

January 31, 2017 | By Ted Dunning

Anomaly Detection Using Metrics and Exception Logs | Whiteboard Walkthrough

In this week’s Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, will talk about how you can use logs containing metrics and exceptions to detect anomalies in the behavior of a micro-service. For related material on this topic...

January 04, 2017 | By Ted Dunning

A Better Way to Build a Fraud Detector: Streaming Data and Microservices Architecture | Whiteboard Walkthrough

In this week’s Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, provides some pointers for building better machine learning models, including the advantages of data streams and microservices-style design in the example of a credit...

December 06, 2016 | By Ted Dunning

MapR: Big Data in the Cloud | Whiteboard Walkthrough

Editor’s Note: Extend to the edge with MapR Orbit Cloud Suite. Go inside MapR Orbit now. In this Whiteboard Walkthrough, MapR Chief Application Architect, Ted Dunning, explains how special capabilities such as mirroring, bi-directional stream and table...

December 06, 2016 | By Ted Dunning

MapR: Converged Advantages in the Cloud | Whiteboard Walkthrough

In this week’s Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, describes advantages of MapR Converged Data Platform and how they work in the cloud. With files, tables and streams engineered into the same technology, MapR has...

October 05, 2016 | By Ted Dunning

State vs. Flow Data Architecture in the Financial Sector | Whiteboard Walkthrough

In this Whiteboard Walkthrough, MapR’s Chief Application Architect, Ted Dunning, explains the move from state to flow and shows how it works in a financial services example. Ted describes the revolution underway in moving from a traditional system with...

July 27, 2016 | By Ted Dunning

Key Requirements for Streaming Platforms: A Micro-Services Advantage - Whiteboard Walkthrough (Part 1)

In this week’s Whiteboard Walkthrough Part I, Ted Dunning, Chief Application Architect at MapR, explains the key capabilities required of a streaming platform in the context of micro-services and the advantages they offer. Note: This video describes...

July 27, 2016 | By Ted Dunning

Streaming Data: How to Move from State to Flow - Whiteboard Walkthrough (Part 2)

In this week’s Whiteboard Walkthrough Part II, Ted Dunning, Chief Application Architect at MapR, talks about the design freedom gained by adopting a micro-services architecture based on streaming data. When you move – one step at a time – from an old...

August 26, 2015 | By Ted Dunning

Catching Long Tail Distribution – Whiteboard Walkthrough

Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, explains how long tail distribution distorts the appearance of data and how to detect it. Here's the unedited transcript: I'd like...

August 17, 2015 | By Ted Dunning

Some Important Streaming Algorithms You Should Know About

Ted Dunning, Chief Application Architect at MapR, presented a session titled “Some Important Streaming Algorithms You Should Know About” at the Spark Summit 2015 conference held in San Francisco. During the session, he highlighted some newer streaming...

August 12, 2015 | By Ted Dunning

Ahead of the Trend: What's Hot? What's Not? – Whiteboard Walkthrough

Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, shows you how to find out what's popular tomorrow by using Zipf's Law. Here's the unedited transcription: Hi, I want to talk about...

July 29, 2015 | By Ted Dunning

How to Show a Mathematician That He Is Wrong – Whiteboard Walkthrough

Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, explains how to show a mathematician that he is wrong. Here's the unedited transcription: Hi - I'd like to talk to you about...

July 15, 2015 | By Ted Dunning

HDFS vs. MapR FS – 3 Numbers for a Superior Architecture – Whiteboard Walkthrough

Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, talks about the architectural differences between HDFS and MapR-FS that boil down to three numbers. Here's the transcript: Hi! I'd...

April 29, 2015 | By Ted Dunning

Better Anomaly Detection with the T-Digest – Whiteboard Walkthrough

Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, gets you up to speed on the t-digest, an algorithm you can add to any anomaly detector to set the number of alarms that you get as a percentage...

April 27, 2015 | By Ted Dunning

Revealed: Why Hadoop Uses Three Replicas

There is some real math behind the idea that you need 3x replication in Hadoop. The basic idea is that when a disk goes bad, you lose an entire stripe of storage. Assuming that the contents are spread around the cluster fairly evenly, the contents of...

April 01, 2015 | By Ted Dunning

Anomaly Detection with Poisson Distribution – Whiteboard Walkthrough

Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, gets you up to speed on anomaly detection, a simple and easy to implement technique to "figure out stuff that just happened but shouldn...

March 04, 2015 | By Ted Dunning

Time Series Databases in the Upside-down Internet – Whiteboard Walkthrough

Editor's Note: In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, talks about how current trends are turning the internet upside down. He also talks about how this is leading to the requirements for very very...

January 05, 2015 | By Ted Dunning

Real-Time Processing: Why Interactive Queries Aren’t Good Enough

A number of people have been claiming lately that interactive responses to queries constitute real-time processing. For instance, Mike Olson has been quoted as saying that interactive queries are what is needed for real-time processing. I like to start...

December 18, 2014 | By Ted Dunning

How Many Disks? Big or Small?

I commonly hear lots of questions about how many drives to use per node in a cluster. For a long time, the norm was to have 4-6 drives per node, but lately, I have been hearing more people suggest 12 drives. At MapR, we have been recommending 12 or 24...

November 08, 2013 | By Ted Dunning

Apache Drill Achieves 1st Milestone Release

The open source incubator project Apache Drill has just made its first release, a significant milestone on the road to graduating to a top-level Apache Software Foundation project. This is a big step that represents a lot of work by the Drill engineering...

October 17, 2013 | By Ted Dunning

Recommender Algorithms: How to Appropriately Use Offline Testing to Prequalify Real-Time Testing

I was recently asked if I was aware of any papers that talk about systematic exploration of parameters for Matrix Factorization (MF). Automatic grid search is pretty standard in this sort of area. In general, however, offline evaluation of recommender...

September 18, 2013 | By Ted Dunning

Sorting Daily Files with Larger Reference Files: It’s Easier than You Think

A person I know recently asked for advice about an interesting but very common scenario. He wanted to join a large reference file that didn’t get updated very often with a somewhat smaller file that was updated daily, but he wasn’t quite sure how to go...

August 29, 2013 | By Ted Dunning

Improvements in Machine Learning: Apache Mahout 0.8 Release

Machine learning with the open source project Apache Mahout just got better with the much anticipated new Mahout version 0.8, released on July 25, 2013. It’s leaner, with less-used features removed and some powerful new ones added, including improved...

August 21, 2013 | By Ted Dunning

Mahout and MapR: The Perfect Couple

We are often asked by potential customers if Apache Mahout™ integrates well with the MapR M7 Edition. The quick answer is, “Yes!” Mahout itself is extremely portable, and it easily connects with M7 where appropriate. The advantage of running Mahout...

October 19, 2012 | By Ted Dunning

30,000 Times Faster – What Does That Really Mean?

I had the chance to co-present with my friend Chao Yuan of American Express at the New York City Open Analytics Meet Up last month. We discussed lightning-fast nearest neighbor algorithms and associated clustering capabilities. The clustering...

October 19, 2012 | By Ted Dunning

Quick Start to Apache Drill Project

Apache Drill is coming together rapidly. Lots of progress is being made on multiple fronts as different groups start digging in and as the Apache infrastructure is fleshed out. The progress falls into several categories including community building, coding...

June 01, 2011 | By Ted Dunning

MapR Signs Apache Corporate Contributor License Agreement for Apache Foundation

We just announced our signing of the corporate contributor license agreement. This move is part of our exiting from stealth status and means that we will be able to contribute more openly to the Apache, other open source communities and the general ecosystem...
