MapR 5.0 Documentation : Upgrade Spark on YARN


If you installed Spark with the MapR Installer, use the latest version of the MapR Installer to perform the upgrade. 

The following instructions explain how to upgrade an existing installation of Spark 1.x. Spark will be installed in a new subdirectory under /opt/mapr/spark.

  1. Update repositories.
    MapR's rpm and deb repositories always contain the Spark version recommended for the MapR core release associated with that repository.  You can connect to an internet repository or prepare a local repository with any version of Spark you need. You can also manually download packages. 

    If you plan to install from a repository, complete the following steps on each node where Spark is installed:

    1. Verify that the repository is configured correctly. See Preparing Packages and Repositories for information about setting up your ecosystem repository. 
    2. Update the repository cache.

      On RedHat and CentOS...

      yum clean all

      On Ubuntu...

      apt-get update

  2. Back up any custom configuration files in your Spark environment. These cannot be upgraded automatically. For example, if Spark SQL is configured to work with Hive, copy the /opt/mapr/spark/spark-<version>/conf/hive-site.xml file to a backup directory.
  3. Shut down the spark-historyserver services (if the spark-historyserver is running):

    maprcli node services -nodes <node-ip> -name spark-historyserver -action stop
  4. Install the Spark packages.

    On Ubuntu...

      apt-get install mapr-spark mapr-spark-historyserver

    On RedHat / CentOS...

      yum update mapr-spark mapr-spark-historyserver

    Note: You only need to upgrade the mapr-spark-historyserver if your previous installation included this package.

  5. Run

    /opt/mapr/server/ -R
  6. Migrate any previous configurations.  
    When you upgrade to a newer Spark version, a new spark-<version> folder is created, and the old configuration files are not automatically migrated to it. Migrate your custom configuration settings manually to the configuration files in the new conf directory (/opt/mapr/spark/spark-<version>/conf).
    For example, if you previously configured Spark to use the Spark JAR file from a location on MapR-FS, you need to copy the latest JAR file to MapR-FS and update the path to the JAR file in the spark-defaults.conf file. See Configure Spark JAR Location.
  7. Start spark-historyserver services (if installed):

    maprcli node services -nodes <node-ip> -name spark-historyserver -action start
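
The backup in step 2 and the migration in step 6 can be sketched with two small shell helpers. This is only an illustration, not part of the MapR tooling: the function names backup_spark_conf and list_conf_changes are hypothetical, and the sketch assumes the standard cp and diff utilities are available on the node.

```shell
#!/bin/sh
# Hypothetical helper (not a MapR tool): copy a Spark conf directory,
# including hidden files, to a backup directory.
#   $1 = conf directory to back up
#   $2 = backup directory (created if it does not exist)
backup_spark_conf() {
    src="$1"
    dest="$2"
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # -R copies recursively, -p preserves modes and timestamps;
    # "$src/." copies the directory contents rather than the directory itself.
    cp -Rp "$src/." "$dest/"
}

# Hypothetical helper (not a MapR tool): list files that differ between
# the backup and the new conf directory, or that exist in only one of them.
#   $1 = backup directory
#   $2 = new conf directory
# Like diff itself, this returns a non-zero status when differences exist.
list_conf_changes() {
    diff -rq "$1" "$2"
}
```

For example, before the upgrade you might call backup_spark_conf on the old /opt/mapr/spark/spark-<version>/conf directory (with <version> replaced by the installed version and the path quoted), and after the upgrade call list_conf_changes on the backup and the new conf directory. The output only points out which files to review; the settings themselves still have to be merged by hand, because the new files may contain defaults that should not be overwritten.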