Spark and Spark on YARN 1.3.1-1505-r1 Release Notes

The notes below relate specifically to the MapR Distribution for Apache Hadoop. You may also be interested in the open source Spark 1.3.1 Release Notes.

Version: Spark 1.3.1-1505
Release Date: June 2, 2015
Source on GitHub: https://github.com/mapr/spark
GitHub Release Tag: 1.3.1-mapr-1505-r1
MapR Version Compatibility: See the Ecosystem Support Matrix (Pre-5.2 releases).

New in This Release

This is the initial MapR release of Spark 1.3.1.

For instructions on installing or upgrading to Spark 1.3.1, see MapR's Spark documentation.

Hive Support

This version of Spark supports Spark-SQL with Hive 0.13. Other versions of Hive, including Hive 1.0, are not supported with Spark-SQL.

Note: Spark-SQL is not fully compatible with Hive; see the Apache Spark documentation for details.

Event Logging in Spark 1.3.1

In both Spark 1.2.1 and Spark 1.3.1, event logging is enabled by default. However, Spark 1.3.1 also checks that the event log directory, where the logs are written, is present on the MapR file system:

maprfs:///apps/spark

Create this directory whether or not the History Server is enabled. Alternatively, disable event logging by setting spark.eventLog.enabled to false.
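
The directory check can be sketched as a small shell helper. This is a minimal sketch, not a MapR-supplied script: it assumes the hadoop command is on the PATH and that the current user may create directories under /apps, and the function name ensure_event_log_dir is hypothetical.

```shell
#!/bin/sh
# Sketch: make sure the Spark event log directory exists on the MapR file system.
# EVENT_LOG_DIR mirrors the path given in these notes; ensure_event_log_dir is
# a hypothetical helper name, not part of Spark or MapR.
EVENT_LOG_DIR="maprfs:///apps/spark"

ensure_event_log_dir() {
    # `hadoop fs -test -d` exits 0 when the path exists and is a directory.
    if hadoop fs -test -d "$EVENT_LOG_DIR"; then
        echo "exists: $EVENT_LOG_DIR"
    else
        # -p also creates any missing parent directories.
        hadoop fs -mkdir -p "$EVENT_LOG_DIR" && echo "created: $EVENT_LOG_DIR"
    fi
}
```

Call ensure_event_log_dir once before starting Spark applications. To disable event logging for a single application instead, pass --conf spark.eventLog.enabled=false to spark-submit.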

If you are using MapR Version 4.0.x, applications that are run by a non-mapr user may not be visible in the History Server UI. To work around this problem, manually update the file ownership under /apps/spark to mapr:mapr for those applications:

hadoop fs -chown -R mapr:mapr /apps/spark/app-XYZ

This workaround is not required if you are using MapR Version 4.1.
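
When several applications are affected on MapR Version 4.0.x, the ownership workaround above can be applied in a loop. This is a minimal sketch under the same assumptions as the single command (the hadoop command is on the PATH and the current user may change ownership under /apps/spark); the function name fix_history_ownership is hypothetical.

```shell
#!/bin/sh
# Sketch: apply the ownership workaround to several applications at once.
# fix_history_ownership is a hypothetical helper name; each argument is the
# name of an application directory under /apps/spark (for example app-XYZ).
fix_history_ownership() {
    for app in "$@"; do
        # Same command as in the workaround, run once per application.
        # Stop at the first failure so errors are not silently skipped.
        hadoop fs -chown -R mapr:mapr "/apps/spark/$app" || return 1
    done
}
```

For example, fix_history_ownership app-XYZ runs the same command shown in the workaround for the single application app-XYZ.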

Known Issues

MAPR-17271:
On secure clusters, the MapR Control System (MCS) does not display links for Spark-Master and Spark-HistoryServer.
MAPR-15970:
If ResourceManager (RM) HA is set up on the cluster and the ApplicationMaster (AM) for the YARN job runs on a node other than the RM node, the AM link in the RM UI returns an error and does not bring up the AM UI. See MapR's Spark documentation for a workaround.

Patches

This MapR release includes the following patches on top of the base Apache release. For complete details, refer to the commit log for this project on GitHub.

Commit    Date (YYYY-MM-DD)   Comment
53ad194   2015-05-27          MAPR-17554: Spark services can now be stopped during an uninstall.
2c2b216   2015-05-20          MAPR-18750: DFS shuffle-related properties are no longer placed in the spark-defaults.conf file by default.
2c2b216   2015-05-20          MAPR-18360: When Spark is uninstalled, only slaves running on localhost are stopped.
ed8b996   2015-05-27          MAPR-18793: Spark-SQL now works with the Hive Metastore on a secure MapR cluster.
feb4cce
7c71cf2   2015-05-27          MAPR-18794: Spark users can now access the YARN logs via https when MapR security is enabled.
979a73f   2015-06-01          MAPR-18912: Hadoop 2.5 did not contain a curator jar required by Spark Master HA.