Spark 2.4.4.0-1912 Release Notes

This section provides reference information, including new features, patches, and known issues for Spark 2.4.4.0.

The notes below relate specifically to the MapR Distribution for Apache Hadoop. For more information, you may also wish to consult the open-source Spark 2.4.4 Release Notes.

These release notes contain only MapR-specific information and are not necessarily cumulative in nature. For information about how to use the release notes, see Ecosystem Component Release Notes.

Spark Version 2.4.4.0
Release Date December 2019
MapR Version Interoperability See Component Versions for Released MEPs and MEP Components and OS Support.
Source on GitHub https://github.com/mapr/spark
GitHub Release Tag 2.4.4.0-mapr-630
Maven Artifacts http://repository.mapr.com/maven/
Package Names Navigate to https://package.mapr.com/releases/MEP/ and select your MEP and OS to view the list of package names.
Important:
  • Beginning with MEP 6.0.0, the keystore and truststore passwords can be removed from spark-defaults.conf and set in /opt/mapr/conf/ssl-client.xml.
  • Beginning with MEP 6.0.0, after an upgrade, the previous version's configuration files are saved in the /opt/mapr/spark directory.
  • MapR 6.1.0 with MEP 6.0.0 and later support simplified security. If you enable security on your MapR cluster, MapR scripts automatically configure Spark security features.
  • Beginning with MEP 6.3.0, the Spark MapRDB JSON connector supports secondary indexes.
  • Beginning with MEP 6.3.0, Spark supports configurable HTTP security headers.
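For the keystore and truststore passwords moved into /opt/mapr/conf/ssl-client.xml, the entries follow the standard Hadoop ssl-client.xml layout. A minimal sketch is shown below; the property names are the conventional Hadoop ones and the values are placeholders, so verify both against the file generated on your cluster:

```xml
<!-- Sketch of /opt/mapr/conf/ssl-client.xml entries (assumed names,
     placeholder values) for the keystore and truststore passwords. -->
<configuration>
  <property>
    <name>ssl.client.keystore.password</name>
    <value>changeit</value> <!-- placeholder password -->
  </property>
  <property>
    <name>ssl.client.truststore.password</name>
    <value>changeit</value> <!-- placeholder password -->
  </property>
</configuration>
```

With these set, the corresponding password entries can be dropped from spark-defaults.conf.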

Hive Support

This version of Spark supports integration with Hive. However, note the following exceptions:

New in This Release

Patches

This MapR release includes the following new patches since the previous MapR Spark release. For details, refer to the commit log for this project in GitHub.

GitHub Commit Date (YYYY-MM-DD) Comment
47147db 2019-09-20 MapR [SPARK-609] Port Apache Spark-2.4.4 changes to the MapR Spark-2.4.4 branch
b41bbe0 2019-09-25 MapR [SPARK-614] Error in SparkR while reading Avro and Parquet file formats
ca0c9c4 2019-10-07 MapR [SPARK-618] Update Hive dependencies for Spark-2.4.4 to version 2.3.6
2827bed 2019-10-07 MapR [SPARK-619] Move absent commits from the 2.4.3 branch to 2.4.4
118c6c5 2019-10-09 MapR [SPARK-620] Replace core dependency in Spark-2.4.4
c996bb0 2019-10-09 MapR [SPARK-595] Spark cannot access HiveServer2 through ZooKeeper
e758a24 2019-10-11 MapR [SPARK-621] Add custom HTTP header support; improve handling of security headers
4000048 2019-10-16 MapR [SPARK-617] Can't use SSL via Spark Beeline
d3a0ec5 2019-10-22 MapR [SPARK-340] Jetty web server version in Spark should be updated to v9.4.X
c7e076e 2019-10-22 MapR [SPARK-626] Update Kafka dependencies for Spark 2.4.4.0 in release MEP-6.3.0
c5cbbcc 2019-10-23 MapR [MS-925] After upgrade to MEP 6.2 (Spark 2.4.0) can no longer
5eaced8 2019-11-07 MapR [SPARK-629] Spark UI for a job loses CSS styles
b0d5ee9 2019-11-11 MapR [SPARK-639] Default headers are added twice
c99e9c9 2019-11-15 MapR [SPARK-627] SparkHistoryServer-2.4 returns a 403 Unauthorized home page for users (spark.ui.view.acls) via spark-submit

Known Issues

  • MapR [SPARK-593], MapR [SPARK-558] - A Spark job can hang, and its output can be redirected to the /opt/mapr/logs/pam.log file, if you use spark-shell while logging in to the Spark Driver UI or if you try to open the Spark Web UI before it is initialized.
  • MapR [SPARK-573] - A Spark job submitted from a standalone node through the mapr-client fails. This happens because core configure.sh does not invoke Spark configure.sh, so the spark-defaults.conf file is never configured. Workaround: Two workarounds are possible:
    • Copy the spark-defaults.conf file from /opt/mapr/spark/spark-<version>/conf/ on a cluster node into the same folder on the client node.
    • Run Spark configure.sh directly.
    The first workaround is more secure and stable, but both workarounds can be unreliable:
    • In some cases, copying the spark-defaults.conf file may not be enough.
    • Spark configure.sh is not documented for external use. It is normally run implicitly by core configure.sh, and running it directly with the wrong options can break the Spark configuration.
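The first workaround amounts to copying the already-configured file from a cluster node. A minimal sketch, assuming SSH access as the mapr user; the host name and version number below are placeholders for illustration, not values from this release:

```shell
# Sketch of the first workaround: copy spark-defaults.conf from a
# configured cluster node to the same path on the client node.
SPARK_VERSION=2.4.4.0        # placeholder: substitute your installed version
CLUSTER_NODE=cluster-node-1  # placeholder: any configured cluster node
CONF_DIR=/opt/mapr/spark/spark-${SPARK_VERSION}/conf
# Copies the file into the identical directory on this client node.
scp "mapr@${CLUSTER_NODE}:${CONF_DIR}/spark-defaults.conf" "${CONF_DIR}/spark-defaults.conf"
```

Run this on the client node after installing mapr-client and the Spark package, so the destination directory already exists.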

Resolved Issues

  • None.