MapR 5.0 Documentation : Using AsyncHBase with MapR-DB Tables

You can use the AsyncHBase libraries to provide asynchronous access to MapR-DB tables. MapR provides a version of AsyncHBase modified to work with MapR-DB tables. Once your cluster is ready to use MapR-DB tables, it is also ready to use AsyncHBase with MapR-DB tables.

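After you install the mapr-asynchbase package, the AsyncHBase JAR file asynchbase-1.6.0-mapr-1501-SNAPSHOT.jar is located in the directory /opt/mapr/asynchbase/asynchbase-1.6.0/. Add the JAR file to your Java CLASSPATH. A bare directory entry on the classpath does not pick up JAR files, so either list the JAR file itself or use the wildcard form /opt/mapr/asynchbase/asynchbase-1.6.0/*.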

The Scanner.setMaxNumKeyValues method does not behave as documented when it is run against MapR-DB tables. According to the AsyncHBase documentation, this method sets “the maximum number of KeyValues the server is allowed to return in a single RPC response.” The AsyncHBase documentation continues:

If you're dealing with wide rows, in which you have many cells, you may want to limit the number of cells (KeyValues) that the server returns in a single RPC response.

The default is DEFAULT_MAX_NUM_KVS, unlike in HBase's client where the default is -1. If you set this to a negative value, the server will always return full rows, no matter how wide they are. If you request really wide rows, this may cause increased memory consumption on the server side as the server has to build a large RPC response, even if it tries to avoid copying data. On the client side, the consequences on memory usage are worse due to the lack of framing in RPC responses. The client will have to buffer a large RPC response and will have to do several memory copies to dynamically grow the size of the buffer as more and more data comes in.

When you use this method with MapR-DB tables, the value that you set for the maximum number of KeyValues is ignored, and the full set of KeyValues is always returned. This issue applies to both AsyncHBase 1.5.0 and AsyncHBase 1.6.0.
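
Because the limit is ignored, an application that needs bounded batches has to split wide rows itself after they arrive. The following is a minimal sketch of one way to do that, not a MapR-provided feature: RowChunker and chunk are hypothetical names, and the helper is written generically so that it could be applied to each row (an ArrayList of KeyValue objects) returned by Scanner.nextRows. It only limits how many cells downstream code handles at once. It does not reduce the size of the RPC response or the memory needed to buffer it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of AsyncHBase or MapR-DB.
final class RowChunker {
    private RowChunker() {}

    /**
     * Splits the cells of one row into consecutive batches that each hold
     * at most maxPerBatch cells. The order of the cells is preserved.
     */
    static <T> List<List<T>> chunk(List<T> row, int maxPerBatch) {
        if (maxPerBatch <= 0) {
            throw new IllegalArgumentException("maxPerBatch must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < row.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, row.size());
            // Copy the slice so that each batch is independent of the source row.
            batches.add(new ArrayList<>(row.subList(start, end)));
        }
        return batches;
    }
}
```

For example, calling RowChunker.chunk(row, 100) on a row of 250 cells would produce three batches of 100, 100, and 50 cells.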
