Integrate Spark-SQL (Spark 2.0.1 and later) with Avro

Integrate Spark-SQL with Avro when you want to read and write Avro data. This information applies to Spark 2.0.1 and later.

Use the following steps to perform the integration. Earlier versions of Spark do not require these steps.

  1. Download the Avro 1.7.7 JAR file to the Spark jars directory (/opt/mapr/spark/spark-<version>/jars).
    You can download the file from the Maven repository.
  2. Add the following properties to spark-defaults.conf, adjusting the path if your Spark version is not 2.0.1:
    spark.driver.extraClassPath /opt/mapr/spark/spark-2.0.1/jars/avro-1.7.7.jar
    spark.executor.extraClassPath /opt/mapr/spark/spark-2.0.1/jars/avro-1.7.7.jar
  3. Launch the Spark shell with the --packages argument that matches your Spark version:
    For Spark 2.0.1:
    /opt/mapr/spark/spark-<version>/bin/spark-shell \
    --packages com.databricks:spark-avro_2.11:3.0.1 \
    --master <master-url>
    For Spark 2.1.0:
    /opt/mapr/spark/spark-<version>/bin/spark-shell \
    --packages com.databricks:spark-avro_2.11:3.2.0 \
    --master <master-url>
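
Step 2 can also be scripted. The following is a minimal sketch, not part of the product: set_spark_property is a hypothetical helper that appends a property to a file only when no line for that key exists yet, so it is safe to rerun. The SPARK_HOME, AVRO_JAR, and CONF_FILE defaults are assumptions (including the conf/ location of spark-defaults.conf); override them to match your installation.

```shell
#!/bin/sh
# Hypothetical helper: append "<key> <value>" to a Spark properties file
# unless a line whose first field equals <key> is already present.
set_spark_property() {
    conf_file=$1
    key=$2
    value=$3
    # Create the file if it does not exist yet.
    touch "$conf_file"
    # awk exits 0 when some line has the key as its first field.
    if awk -v k="$key" '$1 == k { found = 1 } END { exit !found }' "$conf_file"; then
        echo "already set: $key"
    else
        printf '%s %s\n' "$key" "$value" >> "$conf_file"
        echo "added: $key"
    fi
}

# Assumed defaults; override them in the environment if your layout differs.
SPARK_HOME=${SPARK_HOME:-/opt/mapr/spark/spark-2.0.1}
AVRO_JAR=${AVRO_JAR:-$SPARK_HOME/jars/avro-1.7.7.jar}
CONF_FILE=${CONF_FILE:-$SPARK_HOME/conf/spark-defaults.conf}

# Only touch the configuration when the JAR from step 1 is in place.
if [ -f "$AVRO_JAR" ]; then
    set_spark_property "$CONF_FILE" spark.driver.extraClassPath "$AVRO_JAR"
    set_spark_property "$CONF_FILE" spark.executor.extraClassPath "$AVRO_JAR"
else
    echo "Avro JAR not found at $AVRO_JAR; complete step 1 first." >&2
fi
```

The helper leaves an existing entry untouched rather than overwriting it, so a value you set by hand is preserved. If the key is already present with a different path, edit spark-defaults.conf manually.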