IoT Spotlight: Why did we bring an Arduino to Hadoop Summit?


I was at the annual Hadoop Summit in San Jose last week. As usual, the MapR booth was buzzing with big data enthusiasts and experts alike. We showcased demos spanning multiple topics, including multi-cluster Hadoop monitoring using Grafana and Kibana (as part of our new Spyglass Initiative), IoT stream analysis using MapR Event Store and Spark Streaming, and self-service big data analytics using Apache Drill. It was a privilege to be part of insightful conversations around these open source community projects, and I look forward to being more involved in the coming days.

MapR at Hadoop Summit

Hadoop shows signs of maturity

Hadoop has become a mainstay in enterprise IT, and that was very evident from our interactions with booth visitors. The conversation starters have gone from “What is Hadoop?” to “How do we monitor Hadoop deployments?”, “What are the differentiators between MapR and Hortonworks?”, “How do we measure ROI on Hadoop?”, and “Can you share Hadoop customer success stories?” The signs are encouraging for big data, and yet we stand at a crossroads again with a new unknown – IoT.

Internet of Things – Can I see it working?

This is why we brought an Arduino Uno to Hadoop Summit. The Internet of Things is a simple concept to understand: a lot of sensors and devices connected to the internet, transmitting and receiving data. At the same time, it is a little too abstract to visualize when we talk about IoT in the context of big data or stream processing. How does the sensor data get to the stream processing engines? What do the end results look like? What architectural components are required? We decided to tackle these fundamental questions head-on.

The result was simple and elegant:

MapR at Hadoop Summit

What you are looking at is an end-to-end IoT stream processing demo – from sensor to dashboard. It emulates a real-world scenario where a temperature sensor sends readings for stream processing in real time. The same scenario applies to oil rigs, manufacturing equipment, connected cars, and a multitude of other settings where temperature provides valuable input about the overall operation. Sure, a real deployment would involve a lot more sensors, but the architectural principles would remain the same.

We have a temperature sensor (dipped in a glass of hot water in the picture above) connected to an Arduino. The Arduino board runs an MQTT client that transmits temperature data to an MQTT server, using an ESP8266 chip for wireless transmission. The MQTT server then publishes the data into a MapR Event Store topic. A Spark Streaming application runs as a MapR Event Store consumer: it reads the data, performs the necessary transformations, and persists the results in MapR Database. Grafana renders the time-series visualizations, which is what you see on the big screen in the picture above. Dip the temperature sensor in a glass of ice-cold water, and the time-series graph reflects the sudden drop in temperature.
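To make the "necessary transformations" step a little more concrete, here is a minimal Python sketch of the kind of per-message shaping the streaming consumer might do before persisting a reading to a time-series store. The payload format, field names, and row-key convention are all assumptions for illustration – the actual MapR Event Store and MapR Database APIs are not shown here.

```python
import json
from datetime import datetime, timezone

def transform_reading(raw_message: str) -> dict:
    """Parse a raw sensor payload (hypothetical JSON format) and shape it
    into a record suitable for persisting to a time-series document store."""
    payload = json.loads(raw_message)
    celsius = float(payload["celsius"])
    return {
        # Hypothetical row key: sensor id plus epoch seconds, so readings
        # from the same sensor sort chronologically.
        "_id": f'{payload["sensor_id"]}_{payload["ts"]}',
        "sensor_id": payload["sensor_id"],
        "celsius": round(celsius, 2),
        "fahrenheit": round(celsius * 9 / 5 + 32, 2),
        "timestamp": datetime.fromtimestamp(
            payload["ts"], tz=timezone.utc
        ).isoformat(),
    }

# Example: one reading from the glass of hot water in the demo
record = transform_reading(
    '{"sensor_id": "temp-01", "celsius": 71.5, "ts": 1467936000}'
)
print(record)
```

In the real pipeline this logic would sit inside the Spark Streaming job's per-batch processing, applied to each message consumed from the MapR Event Store topic.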

I had many interesting conversations about the architecture and demo setup with excited booth visitors. To those who visited us, thank you for stopping by our booth. I believe there is a genuine interest in the big data circles to understand architectural components of IoT. I also understand that to many of you the above paragraph reads like a lot of mumbo-jumbo about a variety of components, which include both hardware and software.

In an upcoming blog post, I will discuss the architectural components in detail and explain the role each one plays. Stay tuned!

This blog post was published July 08, 2016.
