Live Tutorial: Streaming Real-Time Events Using Apache APIs

In this talk we will explore the power of streaming real-time events in the context of the IoT and smart cities.

We will look at a solution that combines real-time data streams with iterative machine learning to analyze and visualize popular Uber trip locations in New York City. We will cover ingesting the real-time data (location, date, time), analyzing it to find location clusters, and presenting the results in real-time dashboards. You will see the end-to-end process required to build this application using Apache APIs for Kafka, Spark, HBase, and other technologies.

According to Gartner, by 2020, smart cities will be using about 1.39 billion connected cars, IoT sensors, and devices. The analysis of behavior patterns within cities will allow optimization of traffic, better planning decisions, and smarter advertising. You may be excited about the possibilities of exploiting data streams to gain actionable insights from continuously produced data in real time, but you may find it difficult to conceptualize how to implement such a solution and how it can fit into your business. In this presentation, we will walk you through an architecture that combines data streaming with machine learning to enhance an Uber service with the ability to analyze and visualize the most popular taxi pick-up/drop-off locations by date and time, so that drivers' locations can be optimized and trips priced according to demand.

  • Part 1: Spark machine learning
  • Part 2: Kafka and Spark Streaming
  • Part 3: Real-time dashboard using Vert.x
  • Part 4: Spark Streaming, DataFrames, and HBase


Carol McDonald
Solutions Architect, MapR