Integrate Hue with Spark

You can configure Hue to use the Spark Notebook UI. This allows users to submit Spark jobs from Hue.
Note: Spark Notebook is a beta feature that uses the Spark REST Job Server (Livy). The mapr-hue-livy package must be installed on a node where the mapr-spark package is installed; otherwise, the Livy service will not start.
Complete the following steps as the root user or by using sudo:
  1. In the file, configure the SPARK_SUBMIT_CLASSPATH environment variable to include the classpath to the servlet jar before the MAPR_SPARK_CLASSPATH.
  2. In the [spark] section of the hue.ini, set the livy_server_host parameter to the host where the Livy server is running.
    # IP or hostname of livy server.
    livy_server_host=<livy node>
    Note: If the Livy server runs on the same node as the Hue UI, you are not required to set this property as the value defaults to the local host.
  3. If Spark jobs run on YARN, perform the following steps:
    1. Set livy_server_session_kind to yarn on the node where the Livy server is running.
      livy_server_session_kind=yarn
    2. Set the HUE_HOME and the HADOOP_CONF_DIR environment variables in the file (/opt/mapr/hue/hue-<version>/bin/
      export HUE_HOME=${bin}/..
      export HADOOP_CONF_DIR=<path to the Hadoop configuration directory>
      Note: If you do not set these environment variables, the following error appears on the Check Configuration page:
      The app won't work without running Livy Spark Server
  4. Restart the Spark REST Job Server (Livy).
    maprcli node services -name livy -action restart -nodes <livy node>
  5. Restart Hue.
    maprcli node services -name hue -action restart -nodes <hue node>
  6. Restart Spark.
    maprcli node services -name spark-master -action restart -nodes <space delimited list of nodes>
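After the restarts, it can help to confirm which values the [spark] section of hue.ini actually contains. The following is a minimal sketch and not part of the product: the function name spark_setting is hypothetical, and the sketch assumes that each setting sits on its own key=value line and that comment lines start with # or ;.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Hue): print the value of one key
# from the [spark] section of a hue.ini-style file.
# Usage: spark_setting <file> <key>
spark_setting() {
  awk -v key="$2" '
    # Any section header switches the flag; only "[spark]" turns it on.
    /^[ \t]*\[/ { in_spark = ($0 ~ /^[ \t]*\[spark\]/); next }
    # Skip comment lines inside the section.
    in_spark && /^[ \t]*[#;]/ { next }
    in_spark {
      eq = index($0, "=")
      if (eq == 0) next
      k = substr($0, 1, eq - 1)
      v = substr($0, eq + 1)
      # Trim surrounding whitespace from the key and the value.
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      if (k == key) { print v; exit }
    }
  ' "$1"
}
```

Calling spark_setting with the path to your hue.ini and the key livy_server_host prints the configured host, or nothing if the key is not set, in which case the value defaults to the local host as described in step 2.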
Additional Information
  • To access the Notebook UI, select Spark from the Query Editor in the Hue interface.
  • If needed, you can use the MCS or maprcli to start, stop, or restart the Livy Server. For more information, see Starting, Stopping, and Restarting Services.
Note: Troubleshooting Tip
If you have more than one version of Python installed, you may see the following error when executing Python samples:
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe...


To resolve this error, set the following environment variables in /opt/mapr/spark/spark-<version>/conf/

export PYSPARK_PYTHON=/usr/bin/python2.7
export PYSPARK_DRIVER_PYTHON=/usr/bin/python2.7
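
Because the error comes from mixed Python versions, it may be worth checking that both variables resolve to interpreters of the same version. The following is a minimal sketch under stated assumptions: the function name same_python is hypothetical, and it assumes both interpreters accept the standard -c option.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark or Hue): compare the major.minor
# version reported by two Python interpreters.
# Returns 0 if they match, 1 if they differ, 2 if either cannot be run.
# Usage: same_python <worker python> <driver python>
same_python() {
  worker="$1"
  driver="$2"
  [ -x "$worker" ] && [ -x "$driver" ] || return 2
  # Ask each interpreter for its own major.minor version.
  wv=$("$worker" -c 'import sys; print("%d.%d" % sys.version_info[:2])') || return 2
  dv=$("$driver" -c 'import sys; print("%d.%d" % sys.version_info[:2])') || return 2
  [ "$wv" = "$dv" ]
}
```

For example, after the two export lines above are in effect, same_python "$PYSPARK_PYTHON" "$PYSPARK_DRIVER_PYTHON" returns a zero exit status when the worker and driver interpreters agree.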