MapR Edge for the Internet of Things

From smart thermostats to motion-sensing lights, the Internet of Things (IoT) has already woven itself into many aspects of our daily lives. Estimates at the low end call for nearly 25 billion IoT devices by 2020, more than 3 devices for every man, woman, and child on the planet. While consumers will continue to benefit from this trend (in the form of sensor-enabled home energy savings, for example), so too will organizations. From automobile manufacturers to oil and gas companies, businesses across the globe are placing big bets on the “industrial IoT.” They seek to derive real business value from outcomes like predicting equipment failures, avoiding accidents, improving diagnostics, and more.

One growing requirement of the industrial IoT is to have computing power available close to the data sources. Unlike consumer IoT, where the volume of data generated by each device is typically low, industrial IoT sources create a significant amount of data. Unfortunately, many of these deployments are hamstrung by technical constraints and other limitations that prevent businesses from maximizing their IoT investment.

One such limitation is that of constrained physical space. In many situations, it is better for data from IoT systems to be processed locally, since sending data to the cloud or another remote facility for analysis would introduce unacceptable delays. Yet space constraints make it impossible to house a full, or even partial, rack of servers alongside IoT sensors to perform this local processing. For example, consider vehicles equipped with advanced driver assistance systems (ADAS) to avoid collisions. Given that public safety is on the line, data from ADAS must be analyzed quickly and locally. The problem is that cars can hardly allocate more than a cubic foot or two of trunk space to store the necessary computing equipment. Of course, it’s possible to use smaller form-factor computers to address part of this problem, but historically the enterprise-grade software required to run the necessary analytics has not been able to run on this smaller hardware.

Another challenge in some industrial IoT use cases is limited bandwidth, either because the network “pipes” are too small for the volume of data generated or because network connections are intermittent. In either case, limited bandwidth makes it harder for sensor data to reach the cloud (or on-premises data center), where it can be combined with other data for deeper processing. Oil rigs, for example, are often in remote locations where internet connectivity is a luxury, made possible only by Wi-Fi-equipped trucks that periodically drive around the facility. Sensors on these rigs are used to better predict imminent equipment failure to help reduce non-productive time and to optimize oilfield production, but these outcomes could be vastly improved if sensor data from multiple rigs were analyzed in aggregate, a task made more difficult or even impossible by bandwidth limitations. One way to address this problem is simply to down-sample or summarize the data before sending it. In many situations this solution works well, because the subset of data is sufficient for analysis, but it is only practical when there is full processing power at the edge.
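
To make the idea of summarizing data before sending it more concrete, here is a minimal Python sketch. It is not taken from MapR Edge; the `WindowSummary` type, the `summarize` function, and the 60-second window are assumptions chosen purely for illustration. It collapses raw (timestamp, value) sensor readings into one small summary per fixed time window, so that only the summaries need to cross the constrained link.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class WindowSummary:
    """Hypothetical summary record sent upstream instead of raw readings."""
    start: float    # window start time, in seconds
    count: int      # number of raw readings in the window
    minimum: float
    maximum: float
    average: float


def summarize(readings: Iterable[Tuple[float, float]],
              window_s: float) -> List[WindowSummary]:
    """Collapse (timestamp, value) readings into one summary per window."""
    buckets: Dict[float, List[float]] = {}
    for ts, value in readings:
        # Align each reading to the start of its fixed-size window.
        start = (ts // window_s) * window_s
        buckets.setdefault(start, []).append(value)
    return [
        WindowSummary(start, len(vals), min(vals), max(vals), mean(vals))
        for start, vals in sorted(buckets.items())
    ]


# Five raw readings shrink to two summaries with a 60-second window.
readings = [(0.0, 10.0), (1.0, 12.0), (2.0, 14.0), (60.0, 20.0), (61.0, 22.0)]
summaries = summarize(readings, window_s=60.0)
```

Keeping the minimum and maximum alongside the average is a deliberate choice in this sketch: short spikes that matter for failure prediction would otherwise be averaged away.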

Finally, privacy or compliance requirements, such as those driven by data residency regulations in the EU, might dictate that some data needs to stay at the IoT edge and not be copied to other locations for further processing. Enforcement of these policies is problematic, since there is often no good way of differentiating between data that needs to stay put and data that can be moved around. In other cases, data can and should be processed at a more central cluster with bigger compute resources, where deeper analytics can be performed on data coming in from many different edge devices. Consider this situation in the context of telemedicine, for example. While the output of medical devices at one edge can be used to achieve some basic diagnosis, more accurate diagnoses can be achieved by analyzing output from many such medical devices, potentially spread throughout the world. So it would be important to aggregate and analyze this output in a central, more powerful cluster and return the results immediately to the medical center.
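
One way to picture the distinction between data that must stay put and data that can move is to tag each record explicitly. The Python sketch below is an illustration only, not a description of how MapR Edge enforces residency; the `Record` type, the `residency` field, and the tag values `"local-only"` and `"replicable"` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical edge record carrying an explicit residency tag."""
    device_id: str
    payload: dict
    residency: str  # assumed values: "local-only" or "replicable"


def split_for_replication(
    records: Iterable[Record],
) -> Tuple[List[Record], List[Record]]:
    """Return (stay_at_edge, may_replicate).

    Any record whose tag is not exactly "replicable" stays at the edge,
    so a missing or misspelled tag fails safe.
    """
    stay: List[Record] = []
    move: List[Record] = []
    for record in records:
        if record.residency == "replicable":
            move.append(record)
        else:
            stay.append(record)
    return stay, move


records = [
    Record("ecg-01", {"patient": "A", "bpm": 72}, "local-only"),
    Record("ecg-01", {"bpm_avg": 71.5}, "replicable"),
    Record("ecg-02", {"bpm_avg": 64.0}, "unknown-tag"),
]
stay, move = split_for_replication(records)
```

In this sketch only the de-identified aggregate is eligible to leave the edge; the raw reading and the record with an unrecognized tag both remain local.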


MapR Edge helps organizations realize the full potential of their IoT investment by addressing these challenges and more. It allows customers to deploy an architecture in which they “act locally, learn globally.” Addressing the need to capture, process, and analyze data generated by IoT devices close to the source, MapR Edge provides secure local processing, quick aggregation of insights on a global basis, and the ability to push intelligence back to the edge for a faster and more significant business impact.

MapR Edge is a fully functional MapR cluster that can be run on small form-factor commodity hardware (such as Intel NUCs). Edge clusters are supported in three- to five-node configurations, each offering converged enterprise data services (e.g., files, tables, streams, Drill, Spark), along with related data management and protection capabilities (e.g., security, snapshots, mirroring, replication, and compression).


Figure 1. MapR Edge Architecture Diagram

MapR Edge can be applied to many different use cases but is especially well-suited to IoT environments where:

  • Space is limited, preventing deployment of a full rack or cluster of nodes to store and analyze the data at the IoT source
  • Bandwidth is constrained, meaning network connectivity is not always available or is limited
  • Data needs to be processed for real-time action at the IoT source, and/or raw data needs to be kept at the IoT source

Nearly every industry has (or can make use of) IoT deployments meeting these criteria. For example:

  • Retailers seeking to enhance in-store customer experience through personalized digital coupons [1]
  • Agriculture, where drones are being used to help farmers increase their ROI through better crop yields
  • Defense, where drones are being used to improve awareness of the battlefield
  • Renewable Energy firms wanting to optimize wind turbines and maximize power generation
  • Smart cities looking to increase energy savings through smarter light fixtures and other sensor-equipped devices

[1] The Internet of Things: Revolutionizing the Retail Industry, Accenture

We illustrate two important use cases in the oil and gas and automotive industries below.


With crude oil prices at historic lows, oil and gas companies are actively looking for ways to cut production costs and streamline their operations. These companies are investing heavily in technology, including IoT, to improve their bottom line. For example, sensors attached to various parts of oil rigs are being used to better predict oil production and to position drills for maximum output.

Oil and Gas IoT Challenges

  • Oil rig sensors generate large quantities of data, with some estimating close to 1.5 TB [2] (from 19 million readings [3]) per day
  • Oil rigs are often in remote locations with limited bandwidth, preventing global aggregation and analysis of the data
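
A quick back-of-the-envelope calculation shows why these two challenges compound. The Python sketch below takes the 1.5 TB per day estimate cited above and assumes decimal terabytes (10^12 bytes); the 10 Mbit/s link speed is a hypothetical figure chosen only for illustration.

```python
# Assumption: 1 TB = 10**12 bytes (decimal terabytes).
BYTES_PER_DAY = 1.5e12
SECONDS_PER_DAY = 24 * 60 * 60

# Sustained link speed needed to ship a full day of raw data within one day.
required_mbps = BYTES_PER_DAY * 8 / SECONDS_PER_DAY / 1e6


def upload_hours(link_mbps: float) -> float:
    """Hours needed to upload one day of raw data over a link of this speed."""
    return BYTES_PER_DAY * 8 / (link_mbps * 1e6) / 3600


print(f"Required sustained bandwidth: {required_mbps:.1f} Mbit/s")
print(f"Upload time at 10 Mbit/s: {upload_hours(10):.1f} hours")
```

Under these assumptions, keeping up with the raw data would take a sustained link of roughly 139 Mbit/s, while a 10 Mbit/s connection would need about 333 hours (close to two weeks) to upload a single day of readings.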

MapR Edge addresses these challenges as shown in Figure 2.

Figure 2. MapR Edge Helps Oil and Gas Companies Better Predict Oil Production, Even When Rigs are in Remote Locations Where Bandwidth is Constrained

[2] IoT Use Cases in the Oil and Gas Industry, RCR Wireless News, Phillip Tracy

[3] Internet of Things Examples from 3 Industries, SAS, Alison Bolen


As Advanced Driver Assistance Systems (ADAS) gain momentum, automobile manufacturers are equipping vehicles with sensors aimed at avoiding collisions, assisting with automatic parking, and more. Beyond ADAS use cases, IoT sensors are also being installed in vehicles to help predict maintenance needs and avoid breakdowns.

Automotive IoT Challenges

  • Vehicles are space-constrained. The large form-factor servers typically found in data centers will not fit in cars.
  • Vehicles have limited network connectivity, with some data “uploads” occurring only when the vehicle is brought in for service

MapR Edge addresses these challenges as shown in Figure 3.

Figure 3. MapR Edge Helps Automobile Manufacturers Build Better ADAS and Better Predict Maintenance Needs, Even When Space and Bandwidth are Constrained.


  • Distributed data aggregation: Provides high-speed local processing, especially useful for location-restricted or sensitive data, such as personally identifiable information (PII), and consolidates IoT data from edge sites
  • Bandwidth-awareness: Adjusts throughput from the edge to the cloud and/or data center, even in occasionally connected environments
  • Global data plane: Provides a global view of all distributed clusters in a single namespace, simplifying application development and deployment
  • Converged analytics: Combines operational decision-making with real-time analysis of data at the edge

Figure 4. Convergence at the Edge: Files, Tables, Streams, and Analytics. All close to Your IoT Source.

  • Unified security: End-to-end IoT security provides authentication, authorization, and access control from the edge to the central clusters. MapR Edge also delivers secure encryption on the wire for data communicated between the edge and the main data center.
  • Standards-based: MapR Edge adheres to standards, including the POSIX and HDFS APIs for file access, ANSI SQL for querying, the Apache Kafka™ API for event streams, and the HBase and OJAI APIs for the NoSQL database.

Figure 5. MapR Edge Clusters Support the Same Standards-Based APIs as MapR Converged Enterprise Edition

  • Enterprise-grade reliability: Delivers a reliable computing environment to tolerate multiple hardware failures that can occur in remote, isolated deployments
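
To illustrate what adjusting throughput over a constrained link can look like in general, here is a minimal token-bucket sketch in Python. It is a generic rate-limiting technique, not MapR Edge's actual mechanism; the `TokenBucket` class, its parameters, and the explicit `now` argument are assumptions made to keep the example self-contained and deterministic.

```python
class TokenBucket:
    """Generic token-bucket limiter for bytes sent over a constrained link."""

    def __init__(self, rate_bytes_per_s: float, capacity_bytes: float) -> None:
        self.rate = rate_bytes_per_s      # refill rate
        self.capacity = capacity_bytes    # maximum burst size
        self.tokens = capacity_bytes      # start with a full bucket
        self.last = 0.0                   # time of the previous call, in seconds

    def try_send(self, size_bytes: float, now: float) -> bool:
        """Return True and consume tokens if the payload may be sent now."""
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False


bucket = TokenBucket(rate_bytes_per_s=100, capacity_bytes=200)
first = bucket.try_send(150, now=0.0)    # fits within the initial burst
second = bucket.try_send(100, now=0.0)   # only 50 tokens remain, so it waits
third = bucket.try_send(100, now=1.0)    # one second of refill makes room
```

A payload that is refused would be held at the edge and retried later, which is also how such a scheme would ride out periods with no connectivity at all.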


IoT promises to bring great value to a diverse set of industries and use cases, leading organizations across the board to ramp up their investments in this important technology. As IoT deployments accelerate, many will notice that challenges at the edge, including limited bandwidth and space constraints, are limiting the technology’s full potential. MapR Edge works within these constraints, allowing organizations to “act locally, learn globally.”