Integrate Spark-SQL (Spark 2.3.1 and later) with Avro

Integrate Spark-SQL with Avro when you want to read and write Avro data. This information applies to Spark 2.3.1 and later.

Note: For Spark 2.2.1 and 2.3.1, use version 4.0.0 of the com.databricks:spark-avro_2.11 package.

Complete the following steps to perform the integration. Earlier versions of Spark do not require these steps.

  1. Download the Avro 1.7.7 JAR file to the Spark jars directory (/opt/mapr/spark/spark-<version>/jars).
    You can download the file from the Maven repository: http://mvnrepository.com/artifact/org.apache.avro/avro/1.7.7
  2. Add the following properties to spark-defaults.conf, adjusting the path to match your Spark version:
    spark.driver.extraClassPath /opt/mapr/spark/spark-2.3.1/jars/avro-1.7.7.jar
    spark.executor.extraClassPath /opt/mapr/spark/spark-2.3.1/jars/avro-1.7.7.jar
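
Step 2 can also be scripted. The following is a minimal Python sketch, not part of the product: the helper name add_avro_classpath is hypothetical, and it assumes that each line in spark-defaults.conf has the form "key<whitespace>value". It appends a classpath property only when that key is not already set, and it never modifies an existing value.

```python
from pathlib import Path

# JAR path taken from step 2; adjust it for your Spark version.
AVRO_JAR = "/opt/mapr/spark/spark-2.3.1/jars/avro-1.7.7.jar"
CLASSPATH_KEYS = ("spark.driver.extraClassPath", "spark.executor.extraClassPath")


def add_avro_classpath(conf_path, jar_path=AVRO_JAR):
    """Append any missing classpath property to spark-defaults.conf.

    Hypothetical helper for illustration. Keys that are already set are
    left untouched. Returns the list of keys that were appended.
    """
    conf = Path(conf_path)
    text = conf.read_text() if conf.exists() else ""
    # Assumption: the first whitespace-separated token on a line is the key.
    present = {line.split()[0] for line in text.splitlines() if line.split()}
    missing = [key for key in CLASSPATH_KEYS if key not in present]
    if missing:
        with conf.open("a") as handle:
            # Start on a fresh line if the file lacks a trailing newline.
            if text and not text.endswith("\n"):
                handle.write("\n")
            for key in missing:
                handle.write(f"{key} {jar_path}\n")
    return missing
```

For example, calling add_avro_classpath("/opt/mapr/spark/spark-2.3.1/conf/spark-defaults.conf") would add both properties on a fresh installation; the conf directory shown here is an assumed location. If a key is already set to a different value, the helper leaves it alone, so add the Avro JAR to that value manually.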