MapR 6.1 Simplifies the Development of AI and Analytics Applications

MapR has just announced the 6.1 release of our data platform. This release comes with many new features and functionalities in MapR Database and MapR Event Store, focused on improving the developer experience, and provides new tools to build innovative applications. MapR 6.1 comes with the following enhancements:

MapR Database (an HBase Binary and Document Database):

  • Advanced Querying and Indexing of JSON arrays of scalar or sub-documents (aka Complex Types)
  • New programming language clients for Python and Node.js developers, based on open protocol to ease the development of additional clients
  • Fine-grained monitoring for tables that capture and expose detailed metrics into MCS
  • JSON Change Data Capture format to ease the consumption of database events in your application

MapR Event Store (a Kafka API-Based Pub/Sub System):

  • Support of Apache Kafka 1.1 API
  • Stream Processing with KStreams
  • Stream SQL/Analytics with KSQL

Let's dive into these features and see how they can help you to modernize your applications.

Querying and Indexing JSON Arrays (Complex Types)

MapR Database is the highly scalable NoSQL database built into the MapR Data Platform. Developers can build applications using either the HBase API, based on a column-family model, or a JSON-based API (OJAI). In 6.0, we introduced support for secondary indexes to offer efficient queries on any JSON attribute; in 6.1, we added support for indexing JSON arrays, along with new query operators.

With MapR 6.1, developers can now harness the full power of JSON to design complex data models with nested structures and arrays, and can use these structures in OJAI queries with the new array operator. For example, let's say you have a table that contains artists represented as JSON documents. Each artist has an "albums" attribute that contains the list of that artist's albums, stored as JSON documents. The following OJAI query searches the table for artists that have an album named "Abbey Road":

find /apps/artists --c '{"$eq":{"albums[].name":"Abbey Road"}}'
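
To make the semantics of the array condition concrete, here is a minimal plain-Python sketch of what "match any element of the array" means. It does not use MapR Database or OJAI at all: the function name `matches_array_field` and the sample documents are made up for illustration, and the real query is evaluated by the database, typically with the help of an index.

```python
def matches_array_field(doc, array_field, sub_field, value):
    """Return True if any element of doc[array_field] has sub_field == value."""
    elements = doc.get(array_field, [])
    return any(
        isinstance(element, dict) and element.get(sub_field) == value
        for element in elements
    )


# Illustrative sample documents, shaped like the artists described above.
artists = [
    {"_id": "a1", "name": "The Beatles",
     "albums": [{"name": "Abbey Road"}, {"name": "Revolver"}]},
    {"_id": "a2", "name": "Pink Floyd",
     "albums": [{"name": "Animals"}]},
]

# Keep the artists that have at least one album named "Abbey Road".
hits = [artist["name"] for artist in artists
        if matches_array_field(artist, "albums", "name", "Abbey Road")]
print(hits)
```

The condition is satisfied as soon as one sub-document in the array matches, which is why an artist with many albums is returned only once.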

Python and Node.js Client and New Client Architecture

The MapR team is always listening to its community, and we received feedback about the need to provide access to MapR Database JSON using "your favorite" programming language, starting with Python and Node.js/JavaScript. We began in the previous 6.0.1 release by exposing the database through a REST endpoint; now, in 6.1, we have added native support for Python and Node.js. These new idiomatic MapR Database JSON clients are lightweight and can be added to your project using pip or npm, offering the best developer experience and allowing applications to be deployed in any environment.

The following code snippet shows how you can query MapR Database JSON directly with Python. This sample runs a find query against /my_app/user_profiles, searching for all the users with an age between 26 and 35, using the OJAI condition syntax.

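from mapr.ojai.storage.ConnectionFactory import ConnectionFactory

# connection_str is a placeholder: set it to the connection string for your own cluster
connection = ConnectionFactory.get_connection(connection_str)

# Open the JSON document store to query
document_store = connection.get_store(store_path='/my_app/user_profiles')

# Select every field of the users whose age is between 26 and 35, inclusive
query_result = document_store.find(
    {'$select': ['*'], '$where': {'$and': [{'$ge': {u'age': 26}}, {'$le': {u'age': 35}}]}})

# Print each matching document
for d in query_result:
    print(d)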
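
Because the query is just a dictionary, it can be built programmatically. The sketch below is one possible way to do that; the helper name `range_query` is hypothetical and not part of the MapR client, and it only assembles the same `$select`/`$where` structure used in the sample above.

```python
import json


def range_query(field, low, high, select=("*",)):
    """Build an OJAI-style query dict matching low <= field <= high (inclusive)."""
    return {
        "$select": list(select),
        "$where": {"$and": [{"$ge": {field: low}}, {"$le": {field: high}}]},
    }


# Same condition as the sample above: users aged 26 to 35.
query = range_query("age", 26, 35)
print(json.dumps(query, indent=2))
```

The resulting dictionary has the same shape as the one passed to `document_store.find` in the sample, so the helper only changes how the query is assembled, not what it asks for.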

To create these new clients for MapR Database JSON, a new architecture has been created to expose MapR Database JSON operations, using a simple and open communication protocol based on gRPC. This approach will let the MapR team and its community members easily add new programming languages like Go, C#, and a lightweight Java library.

The following diagram explains the new architecture at a high level:

[Diagram: Data Access Gateway]

Support of Apache Kafka 1.1

Apache Kafka has become the de facto standard for event streaming. MapR Event Store not only supports this API but also provides a better streaming architecture, with better storage, replication, and security than Kafka. In the 6.1 release of the MapR Data Platform, the API has been updated to match Apache Kafka version 1.1. The main reason for this update is to ease the development and migration of applications from vanilla Apache Kafka to MapR Event Store. It is also worth mentioning that Apache Kafka 1.1 support brings "exactly once" semantics with the idempotent producer, and better control of message retention with log compaction.
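
The two ideas mentioned above, the idempotent producer and log compaction, can be illustrated with a small conceptual sketch. This is plain Python, not Kafka or MapR Event Store code: the names `IdempotentLog` and `compact` are invented for the example, and details such as partitions and delete markers are deliberately left out.

```python
class IdempotentLog:
    """Append-only log that drops retries carrying an already-seen sequence number."""

    def __init__(self):
        self.records = []
        self.last_seq = {}  # producer_id -> highest sequence number accepted

    def append(self, producer_id, seq, record):
        if seq <= self.last_seq.get(producer_id, -1):
            return False  # duplicate of an earlier send, ignore it
        self.last_seq[producer_id] = seq
        self.records.append(record)
        return True


def compact(log):
    """Keep only the most recent value for each key, in offset order."""
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)
    ordered = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_offset, value) in ordered]


# A retried send with the same sequence number is written only once.
log = IdempotentLog()
log.append("producer-1", 0, ("user-1", "created"))
log.append("producer-1", 0, ("user-1", "created"))  # network retry
log.append("producer-1", 1, ("user-2", "created"))
log.append("producer-1", 2, ("user-1", "updated"))

compacted = compact(log.records)
print(compacted)
```

In this sketch, deduplication happens on write and compaction happens afterwards, so consumers that only need the latest state per key read a shorter log.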

KStreams and KSQL

MapR 6.1 and its associated MapR Expansion Pack (MEP) come with KStreams and KSQL, which simplify the development of stream processing applications.

  • KStreams is a set of APIs that lets you easily process messages from a topic. In the 6.1 release, you can choose to implement your stream processing logic using either the lightweight KStreams API or the more complete Apache Spark Streaming, depending on your proficiency with streaming, the complexity of the code you would have to write, and where in the dataflow the processing should happen.
  • KSQL provides an easy way to do real-time processing of Kafka messages using a simpler SQL syntax; developers can read or write messages and also take advantage of advanced query capabilities, such as joins and event-time windowing, all without writing complex code.
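
To give a feel for what event-time windowing computes, here is a minimal plain-Python sketch of a tumbling-window count. It is not KStreams or KSQL code: the function name `tumbling_window_counts` and the sample events are assumptions made for illustration, and a real stream processor would also deal with late-arriving events and state storage.

```python
from collections import Counter


def tumbling_window_counts(events, window_ms):
    """Count events per (window_start, key), using each event's own timestamp."""
    counts = Counter()
    for timestamp_ms, key in events:
        # Align the timestamp to the start of its fixed-size window.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(window_start, key)] += 1
    return dict(counts)


# Illustrative events as (event time in milliseconds, message key).
events = [(1000, "click"), (1500, "click"), (61000, "click"), (2000, "view")]

result = tumbling_window_counts(events, window_ms=60000)
for (window_start, key), count in sorted(result.items()):
    print(window_start, key, count)
```

Because the window is derived from the timestamp carried by each event rather than from the arrival time, the event at 2000 ms lands in the first window even though it arrives after the event at 61000 ms.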

But Wait, There's More!

This quick summary of the new features of MapR Database and MapR Event Store shows only some of the enhancements made for developers. There are also enhancements in other parts of the platform, such as easier access to files (NFSv4, S3, and so on), object tiering, encryption at rest, and many more, that will help you with real-time applications, analytics, or artificial intelligence/machine learning use cases. Read Ted Dunning's blog for more information.

There has never been a better time to try out MapR.

This blog post was published July 06, 2018.