Spark 2.2.1-1808 (MEP 5.0.1) Release Notes

This section provides reference information, including new features, patches, and known issues for Spark 2.2.1-1808.

The notes below relate specifically to the MapR Distribution for Apache Hadoop. You may also be interested in the open-source Spark 2.2.1 Release Notes.

Spark Version 2.2.1
Release Date September 2018
MapR Version Interoperability See MEP Components and OS Support.
Source on GitHub https://github.com/mapr/spark/tree/2.2.1-mapr-1808
GitHub Release Tag 2.2.1-mapr-1808
Maven Artifacts http://repository.mapr.com/maven/
Package Names Navigate to https://package.mapr.com/releases/MEP/ and select your MEP and OS to view the list of package names.
Important:
  • Spark 2.2 can connect to Hive Metastore 2.1. However, Spark does not support Hive features added after Hive 1.2.
  • Starting with Spark 2.2.1 and MEP 5.0.0, the Kafka version is updated to 1.0.1.
  • MapR 6.0 and MEP 5.0 and later introduce security by default. If you are using these versions and enable security on your MapR cluster, MapR scripts automatically configure Spark security features.

Hive Support

This version of Spark supports integration with Hive. However, note the following exceptions:

Patches

This MapR release includes the following new patches since the latest MapR Spark 2.2.1 release. For details, refer to the commit log for this project in GitHub.

GitHub Commit Date (YYYY/MM/DD) Comment
6adbb83 2018/04/25 [MAPR-31202] Spark History Server bug fixed. Redundant 'jsr311-api' artifact excluded
bbfaa66 2018/05/03 [MAPR-31305] Spark History server NOT loading applications submitted by users other than 'mapr'
e2953fc 2018/05/10 MapR [SPARK-227] KafkaUtils.createDirectStream fails with kafka-09
f747793 2018/05/22 MapR [SPARK-244] Added impersonation for history server
1b45342 2018/05/26 MapR [SPARK-226] Spark - pySpark Security Vulnerability
ff78e8c 2018/05/30 MapR [SPARK-216] Spark thriftserver fails when work with hive-maprdb json table
9859ca9 2018/05/30 MapR [SPARK-214] Hive-2.1 properties can't be read from a hive-site.xml as Spark uses Hive-1.2
eb8710e 2018/05/31 MapR [SPARK-248] MapRDBTableScanRDD fails to convert to Scala Dataframe when using where clause
2dc24ef 2018/06/13 [MAPR-31632] RM UI showing broken page for Spark jobs
aa624b7 2018/07/18 [WEBUI] Avoid possibility of script in query param keys
c5af6d1 2018/08/02 MapR [SPARK-300] Update hive dependencies for spark 2.2.1
759cdaf 2018/08/02 MapR [SPARK-297] Empty values are loaded as non-null
eedbccc 2018/08/02 MapR [32014] Spark Consumer fails with java.lang.AssertionError
f697fdd 2018/08/03 MapR [SPARK-297] Added unit test for empty value conversion
90e2f7c 2018/08/07 MapR [SPARK-281] Spark configure.sh -R is ignoring custom security and overriding hive-site.xml
f1ca279 2018/08/08 [SPARK-302] Local privilege escalation
1d8577d 2018/08/19 [MAPR-32167] - SparkSQL queries fails with org.apache.spark.sql.catalyst.errors.package$TreeNodeException after upgrade
7c0017b 2018/08/28 [SPARK-16986][WEB-UI] Converter Started, Completed and Last Updated to client time zone in history page
fe84d4b 2018/08/30 MapR [SPARK-279] Can't connect to spark thrift server with new Spark and Hive packages

Known Issues

  • You cannot connect to a Spark Thrift Server on a Kerberos-secured cluster because Kerberos and SSL are not compatible.

    Workaround: Set the hive.server2.use.SSL property to false in the hive-site.xml file.

  • When you install a secure (MapR-SASL) cluster using the MapR Installer, the configure.sh script configures Hive after Spark. As a result, Spark copies the wrong hive-site.xml file, so the Spark and Hive integration may not work correctly and you may have problems connecting to Spark beeline.

    Workaround: Check the hive-site.xml file in the Spark home directory, and, if needed, rerun the configure.sh script or copy the hive-site.xml file from your Hive home directory and restart services.
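
    The check described in this workaround can be scripted. The following is a minimal sketch, not a MapR-provided tool: it assumes both hive-site.xml files use the standard Hadoop-style layout of <property> elements with <name> and <value> children, and the function names read_hive_site and diff_hive_sites are hypothetical. It lists the properties whose values differ between the Spark copy and the Hive copy, which also makes it easy to see what hive.server2.use.SSL is set to in each file.

```python
import xml.etree.ElementTree as ET


def read_hive_site(path):
    """Return {name: value} for every <property> in a Hadoop-style XML config file."""
    root = ET.parse(path).getroot()
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            # A missing or empty <value> is stored as an empty string.
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def diff_hive_sites(spark_copy, hive_copy):
    """Return {name: (spark_value, hive_value)} for properties that differ.

    A value of None means the property is absent from that file.
    """
    spark_props = read_hive_site(spark_copy)
    hive_props = read_hive_site(hive_copy)
    differences = {}
    for name in sorted(set(spark_props) | set(hive_props)):
        spark_value = spark_props.get(name)
        hive_value = hive_props.get(name)
        if spark_value != hive_value:
            differences[name] = (spark_value, hive_value)
    return differences
```

    Call diff_hive_sites with the path to the hive-site.xml file in your Spark home directory and the path to the one in your Hive home directory. An empty result means the two files define the same properties with the same values; any other result suggests that the Spark copy is out of date and that the workaround above applies.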

  • Spark versions up to and including 2.3.0 have the following security vulnerability:

Resolved Issues

None.