Consuming CDC Records

Changed data records (propagated by the Change Data Capture feature) are consumed by using OJAI changelog interfaces.

The following topics provide information you need to understand the CDC feature, to setup and use CDC, the maprcli commands used performing tasks, and to consume the data via your application.

The following diagram shows the big picture flow of setting up CDC and consuming CDC changed data records. This diagram provides hotspot links.

Learning about CDC Administering Change Data Capture Building a consumer app for CDC Using dbshell to perform CRUD operations on MapR-DB JSON tables Developing client applications for MapR-DB JSON tables. Using hbshell to perform CRUD operations on MapR-DB binary tables. Developing client applications for MapR-DB binary tables.



Deserializer for consuming CDC records

The MapR ChangeData deserializer must be registered when creating a CDC consumer. The deserializer converts the stream messages into individual change data records. It is a MapR custom Kafka consumer that polls for Map-ES changed data records from a MapR-ES stream topic. The deserializer is registered by setting the value.deserializer configuration parameter. to com.mapr.db.cdc.ChangeDataRecordDeserializer.
Note: When consuming from a CDC change topic, the record's key retrieved from poll() is not deserialized. In another words, the record key is not equal to the _id field of the document. If you want to retrieve the exact _id of the document, call the ChangeDataRecord.getId() method.

Interfaces for working with CDC records

The following OJAI interfaces and enumerations are used to create a consumer for CDC changed data.

  • ChangeNode - This interface contains the change to a single field in a document.
  • ChangeEvent - This interface identifies the change event associated with the current change node. The values of ChangeEvent can be one of the following:
    • NULL (no event)
    • NODE (a change with real value)
    • START_MAP (a node representing the beginning of a map)
    • END_MAP (a node representing the end of a map)
    • START_ARRAY (a node representing the beginning of an array)
    • END_ARRAY(a node representing the end of an array)
  • ChangeOp - This interface identifies the type of the operation made on the current field. The values of ChangeOp can be one of the following:
    • NULL (no operation)
    • SET (replace the current field with the given value)
    • PUT(add an extra version of the value)
    • MERGE (combine the given value with the existing values in the table)
    • DELETE (delete all values older than or equal to the delete operation timestamp were deleted)
    • DELETE_EXACT (delete the version of the value with the given timestamp)
  • ChangeDataRecord - This interface contains all the changes that were performed on a single document/row in the source table.
  • ChangeDataRecordType - This interface specifies the mode of change for the change data record. The following values are specified:
  • ChangeDataReader - This interface is a parser that traverses over the individual change tree nodes on a change data record. It provides a cursor-like semantics which can be moved, one tree node at a time by invoking the next method.

    The properties of individual change node (for example: data type, field name, field value, and so on) can be retrieved using various methods of this interface.