Enable Impersonation for Hive

User impersonation enables Hive to submit jobs as a particular user. Without impersonation, Hive submits queries and Hadoop commands as the user that started HiveServer2 and the Hive Metastore. On a MapR cluster, this user is typically the mapr user or the user specified in the MAPR_USER environment variable.

Warning: The impersonated user must have write permissions to /user/hive/warehouse and /user/mapr-user/tmp/hive directories.

To Enable User Impersonation:

  1. Set the following properties in the /opt/mapr/hive/<version>/conf/hive-site.xml file on the nodes where HiveServer2 is installed:
    <property>
      <name>hive.server2.enable.doAs</name>
      <value>true</value>
      <description>Set this property to enable impersonation in HiveServer2</description>
    </property>
    <property>
      <name>hive.metastore.execute.setugi</name>
      <value>true</value>
      <description>
        Set this property to enable Hive Metastore service impersonation in unsecure mode. 
        In unsecure mode, setting this property to true will cause the metastore to execute DFS operations 
        using the client's reported user and group permissions. Note that this property must be set on both 
        the client and server sides. If the client sets it to true and the server sets it to false, the 
        client setting will be ignored.
      </description>
    </property>
  2. Set the following property in the /opt/mapr/hive/<version>/conf/hive-site.xml file on the nodes where Hive Metastore is installed:
    <property>
      <name>hive.metastore.execute.setugi</name>
      <value>true</value>
      <description>
        Set this property to enable Hive Metastore service impersonation in unsecure mode. In unsecure mode, 
        setting this property to true will cause the metastore to execute DFS operations using the client's 
        reported user and group permissions. Note that this property must be set on both the client and server 
        sides. If the client sets it to true and the server sets it to false, the client setting will be 
        ignored.
      </description>
    </property> 
  3. Set the following properties in the /opt/mapr/hadoop/hadoop-<version>/conf/core-site.xml file:
    <property>
      <name>hadoop.proxyuser.mapr.groups</name>
      <value>*</value>
      <description>Allow the superuser mapr to impersonate any member of any group</description>
    </property>
    <property>
      <name>hadoop.proxyuser.mapr.hosts</name>
      <value>*</value>
      <description>The superuser can connect from any host to impersonate a user</description>
    </property>
  4. Create a file at $MAPR_HOME/conf/proxy/<username> for each user to impersonate. For example, to enable HiveServer2 to submit jobs to the MapR cluster as the user juser, run the following commands as root on each node where HiveServer2 is installed:
    # mkdir $MAPR_HOME/conf/proxy
    # chmod 755 $MAPR_HOME/conf/proxy
    # touch $MAPR_HOME/conf/proxy/juser 
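
When several users need to be impersonated, the commands in step 4 can be wrapped in a small helper. The following is a minimal sketch rather than a MapR-supplied tool: the function name create_proxy_files is hypothetical, and it assumes that an empty file named after each user is sufficient, as in the juser example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of MapR): create one empty proxy file per user.
# Usage: create_proxy_files <proxy-dir> <user> [<user> ...]
create_proxy_files() {
    proxy_dir=$1
    shift
    # Create the directory if it is missing and set mode 755, as in step 4.
    mkdir -p "$proxy_dir" || return 1
    chmod 755 "$proxy_dir" || return 1
    for user in "$@"; do
        # Create an empty file named after the user to impersonate.
        touch "$proxy_dir/$user" || return 1
    done
}
```

For example, running the following as root on each HiveServer2 node would create entries for juser and userfoo:

    # create_proxy_files "$MAPR_HOME/conf/proxy" juser userfoo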

To Verify that User Impersonation is Enabled:

  • To verify that Hive queries do not run as the mapr user, connect to HiveServer2 as a user other than mapr. Then run queries and verify that they ran as the user that connected to HiveServer2.
  • To verify that Hadoop commands submitted by Hive do not run as the mapr user, start the Hive shell or HiveServer2 as a user other than mapr. Then create some tables and verify that the tables in /user/hive/warehouse are owned by the user that started the Hive shell or HiveServer2.
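
Before running these checks, it can help to confirm that the properties from the configuration steps are actually present in the files. The following is a rough sketch, not a supported tool: the function name get_property is hypothetical, and it assumes that each <name> and <value> element sits on its own line, as in the snippets above. It is not a general XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> that follows a given <name> in a
# Hadoop-style XML configuration file.
# Assumes <name> and <value> each sit on their own line.
# Usage: get_property <file> <property-name>
get_property() {
    file=$1
    name=$2
    awk -v name="$name" '
        # Literal match on the property name (dots are not treated as wildcards).
        index($0, "<name>" name "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
        # Stop looking once the enclosing property ends without a value.
        /<\/property>/ { found = 0 }
    ' "$file"
}
```

With HIVE_SITE set to the path of the hive-site.xml file for your Hive version (HIVE_SITE is an illustrative variable, not one that MapR defines), the following should print true once step 1 is complete:

    get_property "$HIVE_SITE" hive.server2.enable.doAs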

Example: Hive Impersonation

beeline> !connect jdbc:hive2://hostname:10000/default
scan complete in 2ms
Connecting to jdbc:hive2://hostname:10000/default
Enter username for jdbc:hive2://hostname:10000/default: userfoo
Enter password for jdbc:hive2://hostname:10000/default: ****
Connected to: Hive (version 0.11-mapr)
Driver: Hive (version 0.11-mapr)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hostname:10000/default> create table voter_table1(voternum INT,name string,age tinyint,registration string,contributions float,voterzone smallint);
No rows affected (0.22 seconds)
0: jdbc:hive2://hostname:10000/default> load data local inpath '/root/userfoo/hive/voter' into table voter_table1;
No rows affected (0.463 seconds)

To verify the ownership of the example table, run the following command:

[user@host ~]# hadoop fs -ls /user/hive/warehouse/voter_table1
Found 1 items
-rwxr-xr-x   3 userfoo users       8576 2013-10-18 14:48 /user/hive/warehouse/voter_table1/voter
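
For tables with many files, the ownership check can be scripted. The following is a minimal sketch under stated assumptions: the function name check_owner is hypothetical, and it assumes the listing layout shown above, in which the owner is the third whitespace-separated field of each entry.

```shell
#!/bin/sh
# Hypothetical helper: read `hadoop fs -ls` output on stdin and print every
# entry whose owner differs from the expected user.
# Exits with status 1 if any such entry is found, 0 otherwise.
# Usage: hadoop fs -ls <path> | check_owner <expected-user>
check_owner() {
    expected=$1
    awk -v expected="$expected" '
        BEGIN { bad = 0 }
        # Entry lines start with a permission string; this skips "Found N items".
        /^[-d]/ && $3 != expected { print; bad = 1 }
        END { exit bad }
    '
}
```

For the example table, the following command prints nothing and exits with status 0 when every file is owned by userfoo:

    hadoop fs -ls /user/hive/warehouse/voter_table1 | check_owner userfoo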