MapR-DB Binary Connector for Apache Spark

Apache Spark is a framework for distributed, in-memory data processing. It is replacing MapReduce in many use cases.

This section describes the four main interaction points between Spark and HBase, with examples for each. The interaction points are:

Basic Spark: You can have an HBase Connection at any point in your Spark DAG.
Spark Streaming: You can have an HBase Connection at any point in your Spark Streaming application.
Spark Bulk Load: You can write directly to HBase HFiles for bulk insertion into HBase.
SparkSQL/DataFrames: You can write SparkSQL that draws on tables that are represented in HBase.

The following pages provide examples of each of these interaction points.