In this blog I will show you how to set up authentication for HiveServer2 (HS2) using a pluggable authentication module (PAM). Once configured, all HS2 clients (JDBC and ODBC) will require a valid username and password to connect. An invalid username or password produces a validation error. This authentication doesn't apply to the hive CLI (command line interface), as it doesn't go through HS2. Please remember that HS2 authentication only controls connections to Hive, not access to the actual data. Data stored on the Hadoop cluster is still authorized using file system permissions. The identity used depends on whether impersonation is enabled.
1. If your organization relies heavily on LDAP, you can also use LDAP authentication to control access to HS2. LDAP authentication configuration is not covered in this blog.
2. This blog is based on Hive-0.11 released by MapR. The default location for hive-site.xml is /opt/mapr/hive/hive-0.11/conf.
3. This blog is based on the MapR 3.0.2 release with an M5 license installed. In the MapR 3.1.0 security release, authentication is enabled by default and you do not need to configure it.
I highly recommend installing HS2 and the Hive METASTORE either on a separate control node where the mapr-core package is installed or on one of the data nodes. This brings HS2 and METASTORE under mapr-warden's control, so both are managed by warden whenever mapr-warden is stopped, started, or restarted. Also, maprcli, which lets you manage individual services running on a host, is part of the mapr-core package. Running HS2 and METASTORE on a mapr client node doesn't give you flexibility of
I also recommend enabling user impersonation for HS2. User impersonation enables HS2 to submit jobs as the connecting user. Without impersonation, HS2 submits jobs as the user that started the HiveServer2 process. On a MapR cluster, this is typically the mapr user or the user specified in the MAPR_USER environment variable. To enable impersonation, please follow the MapR installation documentation.
To set up HS2 authentication, perform the following steps:
HS2 authentication is configured using three parameters defined in hive-site.xml. These parameters are:
The first parameter, hive.server2.authentication, defines the authentication mode that HS2 uses to authenticate the username and password. Four options, NONE, KERBEROS, LDAP and CUSTOM, are supported. For the purpose of this blog we are going to use CUSTOM.
When hive.server2.authentication is set to CUSTOM, you must specify the authentication class explicitly. For a MapR installation, this value is org.apache.hive.service.auth.PamAuthenticationProvider.
This parameter defines a comma-separated list of PAM modules that will be used to verify the username and password. In my configuration I am using
sudo as these are the defaults and the PAM module that we use for password authentication. The values for this parameter are
sshd, sudo.
To make these changes, open /opt/mapr/hive/hive-0.11/conf/hive-site.xml and add these parameters.
Below is the sample configuration:
Optionally, you can also enable SSL to protect user id and password. Please follow MapR installation documentation to enable SSL for Hive.
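After editing hive-site.xml, it can help to confirm that the values HS2 will read are the ones you intended. The sketch below is a small shell helper for that. get_hive_prop is a hypothetical name, not part of Hive or MapR, and the helper assumes each <name> and <value> element fits on a single line, as in a typical hive-site.xml.

```shell
#!/bin/sh
# get_hive_prop FILE NAME
# Hypothetical helper: print the <value> of property NAME in a Hadoop-style
# XML config. Assumes each <name>...</name> and <value>...</value> element
# fits on one line.
get_hive_prop() {
    awk -v want="$2" '
        /<name>/ {
            # Strip everything around the property name and compare it.
            name = $0
            sub(/.*<name>[ \t]*/, "", name)
            sub(/[ \t]*<\/name>.*/, "", name)
            found = (name == want)
        }
        found && /<value>/ {
            # The first <value> after the matching <name> is the one we want.
            value = $0
            sub(/.*<value>[ \t]*/, "", value)
            sub(/[ \t]*<\/value>.*/, "", value)
            print value
            exit
        }
    ' "$1"
}
```

For example, running get_hive_prop /opt/mapr/hive/hive-0.11/conf/hive-site.xml hive.server2.authentication should print CUSTOM once the configuration described here is in place. The helper prints nothing if the property is absent.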
Verify that libjpam.so is installed in the correct location
The most important piece for enabling HS2 authentication is the libjpam.so library file. The MapR installation automatically installs libjpam.so in the correct location. If you are running HS2 on a node that only has the mapr-client package installed and the library file is missing, you can copy it from one of the data nodes to the default location. I highly recommend contacting MapR support if you are not able to get this file yourself.
Note: The default location for libjpam.so is /opt/mapr/hadoop/hadoop-0.20.2/lib/native/Linux-amd64-64/ for a 64-bit installation and /opt/mapr/hadoop/hadoop-0.20.2/lib/native/Linux-i386-32/ for a 32-bit installation.
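To check for the library before restarting HS2, a short shell helper can look in both default directories. This is a minimal sketch: find_libjpam is a hypothetical name, and the two directories are the defaults from the note above, so adjust them if your Hadoop version differs.

```shell
#!/bin/sh
# find_libjpam DIR...
# Hypothetical helper: print the full path of the first libjpam.so found in
# the given directories; return non-zero if none of them contains it.
find_libjpam() {
    for dir in "$@"; do
        # Remove a trailing slash so the printed path has no double slash.
        dir=${dir%/}
        if [ -f "$dir/libjpam.so" ]; then
            printf '%s\n' "$dir/libjpam.so"
            return 0
        fi
    done
    return 1
}

# Default locations from the note above (64-bit first, then 32-bit).
NATIVE=/opt/mapr/hadoop/hadoop-0.20.2/lib/native
find_libjpam "$NATIVE/Linux-amd64-64" "$NATIVE/Linux-i386-32" \
    || echo "libjpam.so not found in the default locations" >&2
```

If the helper reports that the file is missing, copy it from a data node as described above before restarting HS2.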
How you restart HS2 depends on how it is installed. HS2 can be installed in one of the following two modes:
Warden Managed: In this mode the HS2 process is completely managed by the mapr-warden service. This is the recommended method and requires HS2 to be installed either on one of the data nodes in your cluster or on a separate control node that has the mapr-core package installed. Use the following command to restart HS2:
maprcli node services -nodes <node list> -name hiveserver2 -action restart
maprcli node services -nodes host01 -name hiveserver2 -action restart
Unmanaged: In unmanaged mode HS2 is not managed by the mapr-warden service, and stopping, starting, or restarting HS2 requires manual intervention. The following two methods can be used in unmanaged mode.
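Start HS2 manually:
/opt/mapr/hive/hive-0.11/bin/hive --service hiveserver2
To restart it, find the HS2 process ID with the “jps -m” command or “ps -ef | grep hiveserver2”, kill that process, and start HS2 again:
kill -9 2343
/opt/mapr/hive/hive-0.11/bin/hive --service hiveserver2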
If you have set up init scripts for the HS2 process, you can use the service command to restart it.
# service <init script name> restart
# service hive-server2 restart
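For the manual restart, the process ID lookup can be scripted. The sketch below filters ps -ef output for HiveServer2 and prints the PID column. hs2_pids and stop_hs2 are hypothetical helper names. The sketch assumes the standard ps -ef layout, with the PID in the second column, and matches the name case-insensitively, since the Java command line usually shows the class name HiveServer2 rather than the lowercase service name. It sends a plain kill (SIGTERM) so the process gets a chance to shut down cleanly; kill -9 as shown above remains the fallback if the process does not exit.

```shell
#!/bin/sh
# hs2_pids
# Hypothetical helper: read `ps -ef` output on stdin and print the PID (second
# column) of every line that mentions HiveServer2, skipping the grep/awk
# lookup processes themselves.
hs2_pids() {
    awk 'tolower($0) ~ /hiveserver2/ && $0 !~ /grep|awk/ { print $2 }'
}

# stop_hs2
# Hypothetical helper: send SIGTERM to every HiveServer2 process found.
# It is defined here but not called; run it only on the HS2 node.
stop_hs2() {
    for pid in $(ps -ef | hs2_pids); do
        echo "stopping HiveServer2 (pid $pid)"
        kill "$pid"
    done
}
```

After stop_hs2 returns and the process has exited, start HS2 again with /opt/mapr/hive/hive-0.11/bin/hive --service hiveserver2.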
Test HS2 Authentication
JDBC Client BEELINE
Start the beeline command line
Issue the connect command:
You will be prompted for a username and password. If you enter invalid credentials, you will see an error similar to the one below:
If you enter valid credentials, you will be connected successfully.
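The connect step can also be prepared from the shell. The sketch below builds the JDBC URL for HS2 and prints the matching beeline commands. It assumes HS2 listens on its default port 10000 and that you connect to the default database. hs2_jdbc_url is a hypothetical helper and host01 is a placeholder host name. The -u, -n and -p options are beeline's flags for the URL, username and password.

```shell
#!/bin/sh
# hs2_jdbc_url HOST [PORT] [DATABASE]
# Hypothetical helper: print the HiveServer2 JDBC URL. PORT defaults to 10000
# (the HS2 default) and DATABASE to "default".
hs2_jdbc_url() {
    printf 'jdbc:hive2://%s:%s/%s\n' "$1" "${2:-10000}" "${3:-default}"
}

# host01 is a placeholder; substitute the node that runs HS2.
url=$(hs2_jdbc_url host01)
echo "Inside beeline:  !connect $url"
echo "From the shell:  beeline -u $url -n <username> -p <password>"
```

Either form should be rejected with invalid credentials and accepted with valid ones once PAM authentication is active.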
ODBC Client using MapR ODBC Connector for Hive
To test HS2 authentication using the ODBC driver, perform the following steps:
Note: I assume that you have already downloaded and installed the MapR Hive ODBC Connector. If not, please follow the MapR installation documentation on how to get the Hive ODBC Connector and install it.
Go to Start -> All Programs -> MapR ODBC Hive Connector 2.0 (32 Bit) -> 32 Bit ODBC Driver Manager
Note: If you have installed the 64 Bit driver, you will see 64 Bit instead of 32 Bit.
Click the “Add…” button, select MapR Hive ODBC Connector from the list of available drivers, and then click “Finish”.
Please enter all the information:
If you were able to connect through BEELINE, you should not have any problem connecting through ODBC either.