Enterprises can no longer wait weeks or even days to generate analytics and uncover new business insights. To compete, they must continuously acquire data on their customers, suppliers, partners, and competitors.
Digital transformation is now delivering systems, smart devices at the edge, and new applications that capture and analyze massive data sets in real time. This enables businesses to leverage data from traditional data warehouses as well as new data streams coming in from the edge.
This trend is not unique to a few industries. It spans a wide swath of industry use cases, including Financial Services, Healthcare, Manufacturing, Retail, and Telecom. The driving objective is to exploit new ingest streams and make real-time decisions on this data, achieving new operational efficiencies, identifying new revenue streams, and ultimately improving customer satisfaction. Some of the key challenges customers face in attempting to make this transition include:
Historically the data lake has been a prominent unified platform used to store enterprise data, including raw copies of source system data and transformed data used for tasks such as reporting, visualization and batch analytics.
More recently, the industry has needed to integrate these data lakes with real-time data sources such as smart car sensors, financial transactions, machine log data, traffic sensors, geo-spatial data, social media, and web clickstreams. Many of these data sources trigger events and in turn create event streams. The data is generated at very high volume and needs to be processed as soon as possible to provide real-time results with extremely low latency.
Ingesting and analyzing such data demands a robust data pipeline with purpose-built infrastructure. Most of the legacy infrastructure in use today is inadequate for this need. It also requires several distributed applications to be linked together in real time as newer, larger data sets arrive.
Stream processing is growing at a CAGR of 32%, making it the fastest-growing software category in Big Data Analytics. Legacy infrastructure used to support Hadoop data lakes will be inadequate for real-time data lake enablement.
HPE's Elastic Platform for Analytics (EPA) provides a comprehensive data pipeline for Edge-to-Core-to-Data Lake infrastructure for Real Time Analytics. Some critical features offered by this platform include:
HPE's EPA platform is an infrastructure optimized for the MapR Data Platform and provides end-to-end capabilities for a robust data pipeline. Key features of the MapR Data Platform include:
Figure 1: Real time analytics building blocks
MapR Streams provides a robust data pipeline with topics that help organize events into categories. Topics are logical collections of messages managed by MapR-Event Store. Producers deliver data to topics, and consumers subscribe to a topic to consume its messages and thereby analyze them further.
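The relationship between topics, producers, and consumers can be illustrated with a minimal in-memory sketch. This is not the MapR-Event Store API. The class `InMemoryStream`, its methods, and the topic and consumer names are hypothetical and chosen only to show the idea that each topic is an ordered collection of messages and each consumer reads from its own position in that collection.

```python
from collections import defaultdict


class InMemoryStream:
    """Toy stand-in for an event stream (hypothetical, not a MapR API).

    Each topic is an append-only list of messages.
    """

    def __init__(self):
        self._topics = defaultdict(list)   # topic name -> ordered messages
        self._offsets = defaultdict(int)   # (consumer, topic) -> next index to read

    def produce(self, topic, message):
        """A producer appends a message to the named topic."""
        self._topics[topic].append(message)

    def consume(self, consumer, topic):
        """Return the messages this consumer has not yet read from the topic."""
        start = self._offsets[(consumer, topic)]
        messages = self._topics[topic][start:]
        self._offsets[(consumer, topic)] = start + len(messages)
        return messages


stream = InMemoryStream()

# Producers deliver events to topics, which group events by category.
stream.produce("traffic-sensors", {"sensor": "A1", "speed_kmh": 62})
stream.produce("traffic-sensors", {"sensor": "B7", "speed_kmh": 18})
stream.produce("clickstream", {"page": "/home"})

# A consumer reads one topic and analyzes its messages further,
# here by keeping only the slow-traffic readings.
slow = [
    m for m in stream.consume("analytics", "traffic-sensors")
    if m["speed_kmh"] < 30
]
```

Because offsets are tracked per consumer, a second consumer reading the same topic would still receive every message, which is what allows several downstream applications to analyze one event stream independently.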
HPE Apollo systems provide purpose-built infrastructure: the HPE Apollo 2000 is a high-density compute platform in a 2U chassis designed for Streaming Analytics, and the HPE Apollo 4200 is a dense storage platform optimized for the Data Lake and tiered storage. Finally, the HPE Apollo 6500 delivers a purpose-built, highly dense GPU platform optimized for AI workloads.
HPE's Elastic Platform for Analytics (EPA) provides a comprehensive data pipeline for Edge-to-Core-to-Data Lake infrastructure for Real Time Analytics.
MapR enables a cohesive data pipeline that integrates messaging, micro-batching, and data storage technology through MapR-ES, MapR-FS, and MapR-DB.
MapR-ES provides the critical functionality to enable real-time streaming, built around HPE's EPA architecture.
For more information on this solution:
MapR enables a cohesive data pipeline that integrates messaging, micro-batching, and data storage technology through MapR-Event Store, MapR XD, and MapR-Database. The advantages of adopting the combined MapR and HPE solution are summarized below:
Learn more at hpe.com