MapR-ES streams contain topics, which are logical collections of messages.

Topics in MapR-ES are grouped into streams, to which administrators can apply security, retention, and replication policies. Together with MapR-FS and MapR-DB in the MapR Converged Data Platform, streams enable organizations to create a centralized, secure data lake that unifies files, database tables, and message topics.

Messages (topic data) are published to topics by producer applications and read by consumer applications. All messages published to MapR-ES are persisted, which allows future consumers to catch up on processing and allows analytics applications to process historical data. Within a topic, messages are written to topic partitions.
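The persistence behavior described above can be illustrated with a small in-memory sketch. This is not the MapR-ES API: the class names Partition and Topic, the hash-based choice of partition, and the offset-based read method are all assumptions made for illustration. The sketch only shows that messages remain in a partition after they are read, so a consumer that starts late can still read the full history.

```python
import zlib


class Partition:
    """Append-only log; messages are kept after they are read."""

    def __init__(self):
        self.log = []

    def append(self, message):
        self.log.append(message)
        return len(self.log) - 1  # offset of the new message

    def read(self, offset):
        # Return every message at or after the given offset.
        return self.log[offset:]


class Topic:
    """Hypothetical topic model: a fixed set of partitions."""

    def __init__(self, name, num_partitions=2):
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]

    def publish(self, key, message):
        # Assumption: pick a partition by hashing the key (illustrative only).
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        offset = self.partitions[index].append(message)
        return index, offset


topic = Topic("clicks")
for i in range(4):
    topic.publish("user-a", f"event-{i}")

index, _ = topic.publish("user-a", "event-4")
# A consumer that starts late can still read the full history.
history = topic.partitions[index].read(0)
# A consumer that already processed three messages resumes at offset 3.
recent = topic.partitions[index].read(3)
```

Because reading never removes anything from the log, the late consumer and the resuming consumer can both be served from the same stored data.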

Note: Topic partitions are stored in containers within volumes. Containers are written to storage pools, which are made up of disks on the nodes in the cluster. See Containers and the CLDB for more information about containers.

Why Use MapR-ES?

MapR-ES is ideal for a variety of use cases, including the following:
Application event pipelines
Many types of applications generate event or log data that must be centrally stored and analyzed to gain insights about user activity or application performance. MapR-ES simplifies these pipelines by transporting events to a central location, from which they can undergo event-by-event transformation and analysis.
Database change capture
Most modern databases enable users to generate an event each time an entry is added or modified. These events can be published to MapR-ES to keep systems like search indexes and caches synchronized, as well as to feed security or notification applications.
Internet of Things
The explosion in the number of smart devices and sensors has created many situations in which billions of data points are created by millions of geographically dispersed sensors. MapR-ES provides a reliable, global transport for these messages, enabling you to perform analytics both at the source and at a central location.


In addition to reliably delivering messages to applications within a single data center, MapR-ES can continuously replicate data between multiple clusters, delivering messages globally. Like other MapR services, MapR-ES has a distributed, scale-out design, allowing it to scale to billions of messages per second, millions of topics, and millions of producer and consumer applications.

Server and Client Libraries

Figure 1: The relationship of the MapR-ES server to producers, consumers, and client libraries
The server manages streams, topics, and partitions and handles requests from the producer client library and the consumer client library.
Producer client library
This client-side library, which is part of the producer process, receives the messages that producers send, buffers them, and sends them to the server. The server then publishes the messages and sends acknowledgements to the client.
Consumer client library
This client-side library, which is part of the consumer process, receives requests from consumers to poll subscriptions for unread messages, reads messages from topic partitions, and sends the messages to consumers.
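The division of work between the two client libraries and the server can be sketched as a toy model. Everything here is hypothetical and is not the MapR-ES client API: the Server, ProducerClient, and ConsumerClient classes, the batch_size parameter, and the use of integer offsets as acknowledgements are assumptions chosen to keep the example small. It shows a producer-side buffer that is flushed in batches and a consumer-side poll that returns only unread messages.

```python
class Server:
    """Hypothetical stand-in for the server: stores messages, returns acks."""

    def __init__(self):
        self.messages = []

    def publish_batch(self, batch):
        start = len(self.messages)
        self.messages.extend(batch)
        # One acknowledgement (the assigned offset) per message.
        return list(range(start, start + len(batch)))

    def fetch(self, offset, max_messages):
        return self.messages[offset:offset + max_messages]


class ProducerClient:
    """Buffers sends and flushes them to the server in batches."""

    def __init__(self, server, batch_size=3):
        self.server = server
        self.batch_size = batch_size
        self.buffer = []
        self.acks = []

    def send(self, message):
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.acks.extend(self.server.publish_batch(self.buffer))
            self.buffer = []


class ConsumerClient:
    """Tracks a read position and returns only unread messages on poll."""

    def __init__(self, server):
        self.server = server
        self.position = 0

    def poll(self, max_messages=10):
        batch = self.server.fetch(self.position, max_messages)
        self.position += len(batch)
        return batch


server = Server()
producer = ProducerClient(server, batch_size=3)
consumer = ConsumerClient(server)

for i in range(4):
    producer.send(f"m{i}")

first_poll = consumer.poll()   # only the flushed batch is visible
producer.flush()               # push the remaining buffered message
second_poll = consumer.poll()  # returns only what is still unread
```

In this model a message becomes visible to consumers only after the producer-side buffer is flushed, which is why the fourth message appears in the second poll rather than the first.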