MapR 5.0 Documentation : Configure General Hue Settings

This section describes the general Hue configuration settings.

Establish Communication between Hue and JobTrackers or ResourceManagers 

If you run MRv1 jobs, integrate Hue with the JobTracker to view the status of jobs in the Job Browser. If you run YARN applications, integrate Hue with the ResourceManager to view the status of applications in the Job Browser.

Integrate Hue and JobTrackers

  1. If your cluster runs MRv1, each JobTracker node requires the Hue plug-in so that Hue can communicate with all JobTrackers.
    To copy the Hue plug-in (which is a .jar file) to your MapReduce lib directory on all the nodes running JobTracker, enter:

    cp /opt/mapr/hue/hue-<version>/desktop/libs/hadoop/java-lib/hue-plugins-*.jar /opt/mapr/hadoop/hadoop-0.20.2/lib/

    If JobTracker is running on a different host (not localhost), use the scp command to copy the hue-plugins-*.jar file to the JobTracker host.

  2. In the hadoop section of the hue.ini file, complete the following steps:

    If you installed Hue with the MapR Installer, this step is not required as it is performed automatically.

    1. Set the webhdfs_url to be the node that runs HttpFS.

    2. Set jobtracker_port=<port where the JobTracker IPC listens; the default is 9001>

    3. Set submit_to=True

       

  3. In the ha section, set jobtracker_host to the host on which you are running the failover JobTracker.

    MRv1 example
    [hadoop]
    
    # Use WebHdfs/HttpFs as the communication mechanism.
    # This should be the web service root URL.
    # The ip_address corresponds to the node running httpfs.
    webhdfs_url=http://<ip_address>:14000/webhdfs/v1
    
    jobtracker_host=<ip_address_of_active_JobTracker_node>
    # The port where the JobTracker IPC listens on
    jobtracker_port=9001
    submit_to=True
     
     [[[ha]]]
          # Enter the host on which you are running the failover JobTracker
          jobtracker_host=localhost-ha
  4. For all nodes running JobTracker, provide the JobTracker Thrift port in mapred-site.xml as shown:

    <property>
      <name>jobtracker.thrift.address</name>
      <value>0.0.0.0:9290</value>
    </property>
     
    <property>
      <name>mapred.jobtracker.plugins</name>
      <value>org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin</value>
      <description>Comma-separated list of jobtracker plug-ins to be activated.</description>
    </property>
  5. Restart each JobTracker. You can list multiple IP addresses as a space-separated list.

    maprcli node services -jobtracker restart -nodes <ip_addresses>
  6. Confirm that the plug-ins are running correctly by issuing the tail command after you restart JobTracker. Sample output is shown below:

     

    $ tail --lines=500 /opt/mapr/hadoop/hadoop*/logs/*jobtracker*.log|grep ThriftPlugin
    2013-09-26 15:02:39,337 INFO org.apache.hadoop.thriftfs.ThriftPluginServer: Starting Thrift server
    2013-09-26 15:02:39,419 INFO org.apache.hadoop.thriftfs.ThriftPluginServer: Thrift server listening on 0.0.0.0:9290
  7. Verify that JobTracker started and is listening on the Thrift plug-in port by entering the following command:

     

    lsof -i:9290

     

    The output from this command should look similar to this:

     

    COMMAND   PID   USER   FD    TYPE   DEVICE SIZE/OFF NODE NAME
    java    10308   mapr   111u  IPv4   18538352    0t0  TCP *:9290 (LISTEN)

     

    You can also check JobTracker logs to verify that JobTracker started. An example log location is shown here:
    /opt/mapr/hadoop/hadoop-0.20.2/logs/hadoop-mapr-jobtracker-*.log
  8. Perform any additional Hue configurations and then restart Hue so that the changes will take effect. See Starting the Hue Webserver.
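
The port check in step 7 can also be run remotely, for example from the Hue node. The following is a minimal Python sketch, not part of the MapR or Hue tooling: it only tests whether a TCP connection to the Thrift plug-in port succeeds. The function name port_is_open and the host name in the usage comment are hypothetical; 9290 is the port from the mapred-site.xml example in step 4.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example usage (hypothetical host name):
#   port_is_open("jobtracker-node.example.com", 9290)
```

A successful connection shows only that something is listening on the port. The JobTracker log from step 6 remains the way to confirm that the listener is the Thrift plug-in.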

Integrate Hue and ResourceManagers 

If you installed Hue with the MapR Installer, steps 2-4 are not required as they are performed automatically.

  1. If high availability for the ResourceManager is not configured, set default_jobtracker_host in the desktop section to the ResourceManager host and port.

  2. In the hadoop section, set webhdfs_url to be the node that runs HttpFS.
  3. In the yarn_clusters section within the hadoop section, set the submit_to value to True.

  4. In the mapred_clusters section, set the submit_to value to False.
  5. For versions prior to Hue 3.7-1505: If you run YARN applications, such as MRv2, you must also complete the following steps in the yarn_clusters section:
    • Set the hostname and port number for the ResourceManager.

    • Set the security_enabled value to False.

    • Supply the URL for the ResourceManager API.

    • Supply the URL for the proxy API.

    • Supply the URL for the HistoryServer API.

    Note: As of Hue 3.7-1505, Hue automatically determines these values.

The changes are summarized in the following example hue.ini file:

YARN Example
[desktop]

...
# Set the default JobTracker host to maprfs to enable HA for JobTracker.
# If there is a standby JobTracker, it will be found automatically.
# In the event of failover, Hue will submit queries to the standby JobTracker.
  default_jobtracker_host=maprfs:///
 
...
 
[hadoop]

# Use WebHdfs/HttpFs as the communication mechanism.
# This should be the web service root URL.
# The ip_address corresponds to the node running httpfs.
webhdfs_url=http://<ip_address>:14000/webhdfs/v1
 
[[yarn_clusters]]
  [[[default]]]
     # Enter the host on which you are running the ResourceManager
       resourcemanager_host=localhost

      # The port where the ResourceManager IPC listens on
       resourcemanager_port=8032

      # Whether to submit jobs to this cluster
      submit_to=True

     ...
 
[[mapred_clusters]]
    [[[default]]]
     
      # Whether to submit jobs to this cluster
      submit_to=False 
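
To double-check the submit_to values after editing, the nested sections of hue.ini can be read with a few lines of Python. This is a rough sketch with a hand-rolled reader written for illustration; it is not Hue's own parser and handles only section headers, comments, and key=value lines. The names parse_hue_ini, submit_targets, and SAMPLE are hypothetical, and SAMPLE is a trimmed copy of the YARN example above.

```python
import re

# Matches headers such as [hadoop], [[yarn_clusters]], [[[default]]].
SECTION_RE = re.compile(r"^(\[+)\s*([^\[\]]+?)\s*(\]+)$")


def parse_hue_ini(text):
    """Parse nested-bracket ini text into nested dicts (illustrative only)."""
    root = {}
    stack = [root]  # stack[d] is the dict of the open section at depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line == "...":
            continue
        match = SECTION_RE.match(line)
        if match:
            depth = len(match.group(1))
            if depth != len(match.group(3)) or depth > len(stack):
                raise ValueError(f"malformed section header: {line!r}")
            del stack[depth:]  # close any sections deeper than the parent
            stack.append(stack[-1].setdefault(match.group(2), {}))
        elif "=" in line:
            key, _, value = line.partition("=")
            stack[-1][key.strip()] = value.strip()
    return root


def submit_targets(conf):
    """Report which cluster types under [hadoop] have submit_to=True."""
    hadoop = conf.get("hadoop", {})
    return {
        name: hadoop.get(name, {}).get("default", {})
        .get("submit_to", "").lower() == "true"
        for name in ("yarn_clusters", "mapred_clusters")
    }


SAMPLE = """
[hadoop]
webhdfs_url=http://<ip_address>:14000/webhdfs/v1
[[yarn_clusters]]
  [[[default]]]
      submit_to=True
[[mapred_clusters]]
    [[[default]]]
      submit_to=False
"""
```

For a YARN cluster configured as described above, submit_targets(parse_hue_ini(SAMPLE)) should report True for yarn_clusters and False for mapred_clusters.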

Enable User Impersonation for Hue

To enable Hue to submit requests on behalf of any other user, complete the following steps:  

  1. On all nodes running JobTracker or ResourceManager, verify that the /opt/mapr/hadoop/hadoop-<version>/conf/core-site.xml file contains the following properties, and add them if they are missing:

    <property>
      <name>hadoop.proxyuser.<default_user>.hosts</name>
      <value>*</value>
    </property>
     
    <property>
      <name>hadoop.proxyuser.<default_user>.groups</name>
      <value>*</value>
    </property>
  2. To enable the Hue File Browser to view files in MapR-FS, add the following proxy user settings to the configuration block of the httpfs-site.xml file:

    <!-- Hue HttpFS proxy user setting -->
    <configuration>
      <property>
        <name>httpfs.proxyuser.<default_user>.hosts</name>
        <value>*</value>
      </property>
     
      <property>
        <name>httpfs.proxyuser.<default_user>.groups</name>
        <value>*</value>
      </property>
    </configuration>
  3. Perform any additional Hue configurations and then restart Hue so that the changes will take effect. See Starting the Hue Webserver.

In most cases, mapr is the <default_user>. The <default_user> that you specify must match the default_user configured in the [desktop] section of the hue.ini file.

Depending on the ecosystem components that you want to use, additional configuration may be required.
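
The "verify" part of the impersonation steps can be scripted. The sketch below is an illustration rather than a MapR utility: it parses the text of a Hadoop-style XML configuration file and lists the proxy user properties that are absent. The function names are hypothetical. The prefix argument is hadoop for core-site.xml and httpfs for httpfs-site.xml, following the property names shown above, and the check covers presence only, not the values.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Map each <property> name to its value in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


def missing_proxyuser_settings(xml_text, prefix, user):
    """Return the proxy user property names that the config does not define."""
    props = read_properties(xml_text)
    wanted = [f"{prefix}.proxyuser.{user}.{kind}" for kind in ("hosts", "groups")]
    return [name for name in wanted if name not in props]


# Example usage, assuming mapr is the default_user:
#   with open("/opt/mapr/hadoop/hadoop-<version>/conf/core-site.xml") as f:
#       print(missing_proxyuser_settings(f.read(), "hadoop", "mapr"))
```

An empty list means both the hosts and groups properties are defined for that user.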

Disable an Application (optional)

If you want to disable an application (such as Impala), follow these steps:

  1. In the [desktop] section of the hue.ini file, uncomment the # app_blacklist= statement and insert the name of the app you want to disable (impala in this example).

    # Comma-separated list of apps not to load at server startup.
    # Note that rdbms is the name used for dbquery.
    app_blacklist=spark,zookeeper,search,impala,sqoop,rdbms
    Do not remove search from the app_blacklist. The Hue UI will not work if the search application is enabled.
  2. Perform any additional Hue configurations and then restart Hue so that the changes will take effect. See Starting the Hue Webserver.

To re-enable a blacklisted application at any time, remove it from app_blacklist and then restart Hue.
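
Because app_blacklist is a single comma-separated value, it is easy to drop an entry by accident when editing it by hand. The following sketch shows one way to compute the new value while keeping search in the list, as required above. The function update_blacklist and its parameters are hypothetical and not part of Hue.

```python
def update_blacklist(current, disable=(), enable=()):
    """Return a new app_blacklist value.

    current: the existing comma-separated value from hue.ini
    disable: apps to add to the blacklist
    enable:  apps to remove from the blacklist
    """
    if "search" in enable:
        # The documentation above states that search must stay blacklisted.
        raise ValueError("search must remain in app_blacklist")
    apps = [app.strip() for app in current.split(",") if app.strip()]
    apps = [app for app in apps if app not in enable]
    for app in disable:
        if app not in apps:
            apps.append(app)
    return ",".join(apps)


# Example usage:
#   update_blacklist("spark,zookeeper,search,sqoop,rdbms", disable=["impala"])
```

The returned string is what goes after app_blacklist= in the [desktop] section; Hue still needs a restart to pick up the change.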

Change the File Size Restriction for the File Browser (optional)

The Hue File Browser will not open files that are 1.0 GB or greater.

If you want to change the file size restriction, follow these steps:

  1. In the [[hdfs_clusters]] section of the hue.ini, edit the value of the file_size property. 

    # File size restriction for viewing file (float)
    # '1.0' - default 1 GB file size restriction
    # '0' - no file size restrictions
    # >0  - set file size restriction in gigabytes, ex. 0.5, 1.0, 1.2...
    file_size=<value>
  2. Perform any additional Hue configurations and then restart Hue so that the changes will take effect. See Starting the Hue Webserver.
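
The meaning of the file_size values can be summarized in a short function. This is a sketch of the rule as described in the comments above, not Hue's actual implementation. It assumes that 1 GB means 1024^3 bytes, which the documentation does not specify, and the name can_open is hypothetical.

```python
BYTES_PER_GB = 1024 ** 3  # assumption: the documentation does not define "GB"


def can_open(file_bytes, file_size=1.0):
    """Return True if a file of file_bytes bytes is under the configured limit.

    file_size follows the hue.ini semantics described above:
    0 disables the restriction; a positive float is the limit in gigabytes.
    """
    if file_size < 0:
        # Negative values are not described in the documentation;
        # rejecting them is a choice made by this sketch.
        raise ValueError("file_size must be 0 or a positive number")
    if file_size == 0:
        return True
    # Files at or above the limit are not opened.
    return file_bytes < file_size * BYTES_PER_GB
```

With the default of 1.0, a file of exactly 1 GB is refused, which matches the statement that files of 1.0 GB or greater are not opened.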