Faster Application Development with the Open JSON Application Interface (OJAI)


MapR developed OJAI (the Open JSON Application Interface), which provides native integration of JSON-like document processing in Hadoop-style scale-out clusters. Some of you might know that Ojai, pronounced OH-hy, is a Native American term for “moon” and is also a charming small town in California. The API is designed as a general-purpose JSON API for all Hadoop systems and frameworks. We currently have implementations for MapR Database and for JSON files, and will add more in the near future.

JSON is a popular data format that is critical in the big data world because it can model a wide variety of data shapes and supports a flexible schema. Although it is based on JavaScript syntax, its use goes well beyond its origin as a data structure in JavaScript. Whether your data is hierarchical, nested, or evolving, JSON can capture it in a self-describing, human-readable way.

This blog is aimed at application developers and IT architects, to help you explore, embrace, and adopt document databases using the OJAI APIs. I’ll show you how simple and efficient OJAI is and why you should take a closer look. Let’s walk through a few quick examples of how OJAI works with MapR Database.

OJAI provides APIs to create an in-memory document structure. A sample JSON document could be info on a bridge:


{
   "name" : "Golden Gate Bridge",
   "size" : {"height" : 746, "length" : 8981},
   "built" : "May 27, 1937",
   "images" : [
      {"on" : "Dec 12, 2008", "link" : "http://short.lnk/1.jpg"},
      {"on" : "Sep 19, 2010", "link" : "http://short.lnk/3.jpg"},
      {"on" : "Dec 14, 2014", "link" : "http://short.lnk/44.jpg"}
   ]
}



OJAI includes a backend document store interface, referred to as a table, which is used to insert and retrieve documents and to perform other CRUD operations. Each user document is stored as a row in the table and is accessed using a unique row key. If the bridge document above is stored in a table of all the bridges in the world, its unique ID could be the value of the “name” field or an admin-chosen value. Let’s pick “GGB” as the unique ID.

The JSON document above could be created and inserted as follows using the Table API:

  Document document = MapRDB.newDocument()
            .set("name", "Golden Gate Bridge")
            .set("size.height", 746)
            .set("size.length", 8981)
            .set("built", Values.parseDate("1937-05-27"))
            .set("images", list);

  // NOTE: 'list' is an array of document objects - left as an
  // exercise for the reader!

  Table table = MapRDB.createTable("worldBridges");

  table.insertOrReplace("GGB", document);

All the examples in this post use the MapR Database implementation of the API.

When additional information needs to be added to the same document, an update can be performed via a mutation. Each mutation can add new fields, or modify or delete existing ones. Suppose the bridge color, type, and width need to be added, and the image at index 2 is found to be incorrect and needs to be removed from the document. The mutation will look like this:

  Mutation mut = MapRDB.newMutation()
             .setOrReplace("size.width", 90)
             .setOrReplace("color", "International Orange")
             .setOrReplace("type", "truss")
             .delete("images[2]");  // drop the incorrect image at index 2

  table.update("GGB", mut);

Ideally, the mutation is applied entirely on the server, so the document itself never has to travel over the network. Where possible, the server should not need a read-modify-write cycle either, which is the case for the implementation on MapR Database.

The same document can be retrieved in its entirety using the table findById API:

  Document fullDoc = table.findById("GGB");

which will be


{
   "name" : "Golden Gate Bridge",
   "size" : {"height" : 746, "length" : 8981, "width" : 90},
   "built" : "May 27, 1937",
   "images" : [
      {"on" : "Dec 12, 2008", "link" : "http://short.lnk/1.jpg"},
      {"on" : "Sep 19, 2010", "link" : "http://short.lnk/3.jpg"}
   ],
   "color" : "International Orange",
   "type" : "truss"
}


Existing container-type fields (like “size”) are merged, and fields that do not yet exist (like “color”) are created. Note that insertion order is maintained, so the newly added fields appear at the end of the document in the MapR Database implementation of the API.
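To make the merge behavior concrete, here is a minimal sketch in plain Java. It is not the OJAI or MapR Database implementation: the class `MergeSketch` and its helper are hypothetical, a document is modeled as nested `LinkedHashMap`s, and only dotted paths (no array indexes) are handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a document: nested maps that keep insertion order. */
final class MergeSketch {

    /** Sets the value at a dotted path, creating intermediate containers as needed. */
    @SuppressWarnings("unchecked")
    static void setOrReplace(Map<String, Object> doc, String path, Object value) {
        String[] parts = path.split("\\.");
        Map<String, Object> current = doc;
        for (int i = 0; i < parts.length - 1; i++) {
            Object child = current.get(parts[i]);
            if (!(child instanceof Map)) {
                // Missing (or non-container) field: put a new container in its place.
                child = new LinkedHashMap<String, Object>();
                current.put(parts[i], child);
            }
            current = (Map<String, Object>) child;
        }
        // Existing siblings in the container are left untouched (the "merge").
        current.put(parts[parts.length - 1], value);
    }

    /** Builds part of the bridge document and applies the three field updates. */
    static Map<String, Object> example() {
        Map<String, Object> size = new LinkedHashMap<>();
        size.put("height", 746);
        size.put("length", 8981);

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", "Golden Gate Bridge");
        doc.put("size", size);

        setOrReplace(doc, "size.width", 90);
        setOrReplace(doc, "color", "International Orange");
        setOrReplace(doc, "type", "truss");
        return doc;
    }

    public static void main(String[] args) {
        // Prints: {name=Golden Gate Bridge, size={height=746, length=8981, width=90}, color=International Orange, type=truss}
        System.out.println(example());
    }
}
```

The `LinkedHashMap` is what keeps “color” and “type” after the older fields here; how the real store preserves order is not something this sketch claims to show.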

User applications can also ask for specific fields of interest from only those documents that satisfy a given condition, so that just a subset of the data comes back from the server. Note that conditions are type-aware. An example condition to get all bridges that are taller than 500 feet (numerical ordering) and are of truss type (lexicographic ordering) would be:

  Condition cond = MapRDB.newCondition()
             .and()                              // all of the following must hold
             .is("type", EQUAL, "truss")
             .is("size.height", GREATER, 500)
             .close()
             .build();

  table.find(cond, "name");

This returns an iterable list of the matching documents, each containing only the “name” field. Ideally, conditions are evaluated solely on the server of the backend store and only the relevant portion of the data is returned to the application; this is true of the MapR Database implementation. That is more efficient than databases that ship all the data to the client, apply the condition there, and then prune the result.
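The filter-then-project idea can be sketched locally in plain Java. This is only an illustration of the semantics, run in memory on the client side: `FindSketch`, its helpers, and the sample bridges are all hypothetical, and nothing here is the OJAI or MapR Database API. The two predicates show what “type-aware” means in practice: one compares numbers, the other compares strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical sketch of "evaluate a condition, return only some fields". */
final class FindSketch {

    /** Resolves a dotted path such as "size.height" against nested maps. */
    static Object get(Map<String, Object> doc, String path) {
        Object current = doc;
        for (String part : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null; // path runs through a non-container field
            }
            current = ((Map<?, ?>) current).get(part);
        }
        return current;
    }

    /** Numeric comparison: matches only when the field holds a number above the bound. */
    static Predicate<Map<String, Object>> greater(String path, double bound) {
        return doc -> {
            Object v = get(doc, path);
            return v instanceof Number && ((Number) v).doubleValue() > bound;
        };
    }

    /** String comparison: matches only when the field holds an equal string. */
    static Predicate<Map<String, Object>> equalTo(String path, String expected) {
        return doc -> expected.equals(get(doc, path));
    }

    /** Returns, for each matching document, only the requested top-level fields. */
    static List<Map<String, Object>> find(List<Map<String, Object>> docs,
                                          Predicate<Map<String, Object>> cond,
                                          String... fields) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            if (!cond.test(doc)) {
                continue;
            }
            Map<String, Object> projected = new LinkedHashMap<>();
            for (String field : fields) {
                if (doc.containsKey(field)) {
                    projected.put(field, doc.get(field));
                }
            }
            out.add(projected);
        }
        return out;
    }

    /** Builds a small made-up bridge document for the usage example. */
    static Map<String, Object> bridge(String name, String type, int height) {
        Map<String, Object> size = new LinkedHashMap<>();
        size.put("height", height);
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", name);
        doc.put("type", type);
        doc.put("size", size);
        return doc;
    }
}
```

With three made-up documents, `bridge("Bridge A", "truss", 746)`, `bridge("Bridge B", "truss", 120)`, and `bridge("Bridge C", "arch", 900)`, the call `FindSketch.find(docs, FindSketch.equalTo("type", "truss").and(FindSketch.greater("size.height", 500)), "name")` returns a single projected document holding only the name of Bridge A.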

There are also Table APIs to delete rows in the table using the unique ID.
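Taken together, the table operations in this post amount to a store keyed by row ID. A rough way to picture that contract is the in-memory stand-in below. It is a hypothetical sketch, not the MapR Database `Table`: `insertOrReplace` and `findById` borrow the names used earlier in this post, while the `delete` method and its boolean return value are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a document table keyed by row ID. */
final class TableSketch {
    private final Map<String, Map<String, Object>> rows = new HashMap<>();

    /** Stores the document under the given ID, overwriting any existing row. */
    void insertOrReplace(String id, Map<String, Object> doc) {
        rows.put(id, new HashMap<>(doc)); // shallow copy of the caller's map
    }

    /** Returns the document stored under the ID, or null if there is none. */
    Map<String, Object> findById(String id) {
        return rows.get(id);
    }

    /** Removes the row for the ID; returns true if a row was actually removed. */
    boolean delete(String id) {
        return rows.remove(id) != null;
    }
}
```

After `insertOrReplace("GGB", doc)`, a `findById("GGB")` returns the stored document, and after `delete("GGB")` the same lookup returns `null`.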

This overview should give you a feel for what OJAI provides. Please refer to the OJAI GitHub repository for more information.

This blog post was published September 29, 2015.
