Integrate Hue and JobTrackers

If you run MRv1 jobs, integrate Hue with the JobTracker to view the status of jobs in the Job Browser.

  1. If your cluster runs MRv1, each JobTracker node requires the Hue plug-in so that Hue can communicate with all JobTrackers. To copy the Hue plug-in (a .jar file) to the MapReduce lib directory on every node that runs JobTracker, enter:
    cp /opt/mapr/hue/hue-<version>/desktop/libs/hadoop/java-lib/hue-plugins-*.jar /opt/mapr/hadoop/hadoop-0.20.2/lib/
    Note: If JobTracker is running on a different host (not localhost), use the scp command to copy the hue-plugins-*.jar file to the JobTracker host.
  2. In the hadoop section of the hue.ini file, complete the following steps:
    Note: If you installed Hue with the MapR Installer, this step is not required as it is performed automatically.
    1. Set webhdfs_url to the WebHDFS URL of the node that runs HttpFS.
    2. Set jobtracker_port=<port>, where <port> is the port on which the JobTracker IPC listens (default is 9001).
    3. Set submit_to=True.
  3. In the ha section, set jobtracker_host to the host on which you are running the failover JobTracker.
    MRv1 example
    [hadoop]
     
    # Use WebHdfs/HttpFs as the communication mechanism.
    # This should be the web service root URL.
    # The ip_address corresponds to the node running httpfs.
    webhdfs_url=http://<ip_address>:14000/webhdfs/v1
     
    jobtracker_host=<ip_address_of_active_JobTracker_node>
    # The port on which the JobTracker IPC listens
    jobtracker_port=9001
    submit_to=True
     
     [[[ha]]]
          # Enter the host on which you are running the failover JobTracker
          jobtracker_host=localhost-ha
  4. For all nodes running JobTracker, provide the JobTracker Thrift port in mapred-site.xml as shown:
    <property>
      <name>jobtracker.thrift.address</name>
      <value>0.0.0.0:9290</value>
    </property>
      
    <property>
      <name>mapred.jobtracker.plugins</name>
      <value>org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin</value>
      <description>Comma-separated list of jobtracker plug-ins to be activated.</description>
    </property>
  5. Restart each JobTracker. To restart several at once, list their IP addresses separated by spaces:
    maprcli node services -jobtracker restart -nodes <ip_addresses>
  6. Confirm that the plug-ins are running correctly by issuing the tail command after you restart JobTracker. Sample output is shown below:
    $ tail --lines=500 /opt/mapr/hadoop/hadoop*/logs/*jobtracker*.log|grep ThriftPlugin
    2013-09-26 15:02:39,337 INFO org.apache.hadoop.thriftfs.ThriftPluginServer: Starting Thrift server
    2013-09-26 15:02:39,419 INFO org.apache.hadoop.thriftfs.ThriftPluginServer: Thrift server listening on 0.0.0.0:9290
  7. Verify that JobTracker started and is listening on the Thrift plug-in port by entering the following command:
    lsof -i:9290
    The output from this command should look similar to this:
    COMMAND   PID   USER   FD    TYPE   DEVICE SIZE/OFF NODE NAME
    java    10308   mapr   111u  IPv4   18538352    0t0  TCP *:9290 (LISTEN)
    You can also check JobTracker logs to verify that JobTracker started. An example log location is shown here:
    /opt/mapr/hadoop/hadoop-0.20.2/logs/hadoop-mapr-jobtracker-*.log 
  8. Perform any additional Hue configuration, then restart Hue so that the changes take effect. See Starting the Hue Webserver.
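
The log check in step 6 can also be scripted. The following is a minimal sketch, not part of the MapR or Hue tooling: the function name thrift_listen_address is hypothetical, and the pattern assumes that the log line has the same format as the sample output in step 6. It reads JobTracker log text on standard input and prints the address on which the Thrift plug-in reports it is listening, or nothing if no such line is found.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not shipped with MapR or Hue).
# Reads JobTracker log text on stdin and prints the address from the most
# recent "Thrift server listening on <address>" line, if there is one.
# Assumes the log format shown in the sample output of step 6.
thrift_listen_address() {
  sed -n 's/.*ThriftPluginServer: Thrift server listening on \([^ ]*\).*/\1/p' \
    | tail -n 1
}

# Example use, with the same log path pattern as step 6:
#   tail --lines=500 /opt/mapr/hadoop/hadoop*/logs/*jobtracker*.log | thrift_listen_address
```

If the function prints nothing, the plug-in did not log a listening address in the lines that were inspected. In that case, check the mapred-site.xml properties from step 4 and the JobTracker log itself.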