In this series of blog posts on the Internet of Things (IoT), we've established why IoT naturally lends itself to big data, reviewed the current IoT landscape, and looked at some IoT use cases (smart cities, smart phones, and smart homes). In this post, we'll discuss the requirements for an IoT data processing platform and introduce a high-level architecture that can meet them.
A data platform that needs to process data from IoT devices reliably and at scale should meet the following requirements:
To help architect and reason about concrete IoT applications, let's now discuss a polyglot processing architecture: the Internet of Things Architecture (iot-a). Note that the iot-a is in a sense a meta-architecture, operating at a higher level of abstraction than, for example, the Lambda or Kappa architecture. The iot-a assumes that input data (typically time series data) from, say, a sensor arrives as a stream and that there are (up to) three major query modes in use:
1. output is generated as it happens, that is, in a continuous fashion.
2. output is generated based on an interactive query by an end user or another system.
3. output is generated in batches.
To serve these three query modes, three main building blocks can be used:
1. a Message Queue/Stream Processing (MQ/SP) block
2. a Database (DB) block and
3. a Distributed File System (DFS) block
The MQ/SP block buffers data, applies arbitrary business logic to it, and ingests it into the downstream blocks. The DB block provides fine-grained, low-latency access to the data; due to the nature of the data, it usually uses a NoSQL solution. Finally, the DFS block runs batch jobs (aggregations, etc.) over the entire dataset, including integration with unstructured data sources (such as images or PDF docs), and offers long-term storage (archiving). We expect to see approximation algorithms especially in the MQ/SP block, with their results eventually corrected by the DFS layer logic.
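To make the interplay of the three blocks more concrete, here is a minimal in-memory sketch in Python. It is only an illustration under stated assumptions, not a reference implementation of the iot-a: the class names (`StreamBlock`, `DatabaseBlock`, `FileSystemBlock`), the sliding-window mean used as the "approximation", and the sample sensor readings are all hypothetical. In a real deployment, each class would be backed by an actual message queue, NoSQL store, and distributed file system.

```python
from collections import defaultdict, deque
from statistics import mean


class DatabaseBlock:
    """DB block (toy): low-latency lookups keyed by sensor and timestamp."""

    def __init__(self):
        self._rows = {}

    def put(self, sensor_id, ts, value):
        self._rows[(sensor_id, ts)] = value

    def get(self, sensor_id, ts):
        return self._rows.get((sensor_id, ts))


class FileSystemBlock:
    """DFS block (toy): append-only archive plus batch jobs over all data."""

    def __init__(self):
        self._archive = []

    def append(self, sensor_id, ts, value):
        self._archive.append((sensor_id, ts, value))

    def batch_mean(self, sensor_id):
        # Exact aggregate over the entire archived dataset for one sensor.
        values = [v for s, _, v in self._archive if s == sensor_id]
        return mean(values) if values else None


class StreamBlock:
    """MQ/SP block (toy): buffers readings, applies logic, ingests downstream."""

    def __init__(self, db, dfs, window=3):
        self._db = db
        self._dfs = dfs
        self._queue = deque()  # buffer of not-yet-processed readings
        # Assumption: a bounded window of recent values stands in for an
        # approximation algorithm running in the stream layer.
        self._recent = defaultdict(lambda: deque(maxlen=window))

    def publish(self, sensor_id, ts, value):
        self._queue.append((sensor_id, ts, value))

    def process(self):
        """Drain the buffer, yielding an approximate mean per reading."""
        while self._queue:
            sensor_id, ts, value = self._queue.popleft()
            self._db.put(sensor_id, ts, value)      # ingest into DB block
            self._dfs.append(sensor_id, ts, value)  # ingest into DFS block
            recent = self._recent[sensor_id]
            recent.append(value)
            yield sensor_id, ts, mean(recent)


db, dfs = DatabaseBlock(), FileSystemBlock()
stream = StreamBlock(db, dfs, window=2)
for ts, value in enumerate([20.0, 22.0, 27.0]):  # made-up sample readings
    stream.publish("sensor-1", ts, value)

continuous = list(stream.process())  # query mode 1: as-it-happens output
interactive = db.get("sensor-1", 1)  # query mode 2: interactive lookup
batch = dfs.batch_mean("sensor-1")   # query mode 3: batch output
```

With these sample readings, the last continuous result is a windowed mean of 24.5, while the batch job over the full archive yields 23.0. This mirrors, in miniature, the idea that approximate results from the stream layer are eventually corrected by batch logic over the complete dataset.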
There are many more aspects of the iot-a that are worthy of discussion, so we've set up a community advocacy site dedicated to this topic.
The iot-a.info site elaborates on the three processing blocks mentioned above, with examples for each building block and a summary of the time series databases that are available.