In a typical Hive installation that stores its metadata in MySQL, the metastore password is kept in clear text in a configuration file. This presents a few risks: 1) Anyone who gains unauthorized access could destroy or modify Hive metadata and disrupt workflows; a malicious user could alter Hive permissions or damage metadata. 2) This password is what permits HiveServer2-Thrift-MySQL communication, so exposing it exposes that whole path. To avoid this problem, you should use a Java keystore file instead of a cleartext password.
Minimum action needed: restrict read access on hive-site.xml to the mapr user.
However, this means that users other than mapr cannot run jobs local to the machine, as they can't read hive-site.xml.
By setting up a CredentialProvider to handle storing/retrieval of passwords, you can remove the need to keep the Hive metastore password in cleartext in the Hive configuration by pointing the configuration to use an encrypted keystore file. In this blog post, I’ll explain how to accomplish this.
Create a CredentialProvider to encrypt the password and store it in a Java keystore file.
[root@ip-10-0-10-219 ~]# hadoop credential create javax.jdo.option.ConnectionPassword -provider jceks://maprfs/user/mapr/hive.jceks
Enter password:
Enter password again:
javax.jdo.option.ConnectionPassword has been successfully created.
org.apache.hadoop.security.alias.JavaKeyStoreProvider has been updated.
When prompted for the password, enter the password of the Hive metastore database. NOTE: In the MapR Distribution, the provider path must include maprfs. A direct path like jceks://user/mapr/filename.jceks will not work. Instead, you should enter something like: jceks://maprfs/user/mapr/filename.jceks
Make sure your password alias was created properly by listing the aliases stored in the JCEKS provider file. The following command can be used for confirmation:
[root@ip-10-0-10-219 ~]# hadoop credential list -provider jceks://maprfs/user/mapr/hive.jceks
Listing aliases for CredentialProvider: jceks://maprfs/user/mapr/hive.jceks
javax.jdo.option.connectionpassword
The output should list the alias javax.jdo.option.connectionpassword (shown in lowercase) from the provider file.
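If you want to script this confirmation, a minimal sketch follows. The helper name has_alias is hypothetical, and it assumes the listing prints one alias per line as in the output above. It reads the listing on standard input and matches the alias case-insensitively, since the listing shows it in lowercase.

```shell
# Hypothetical helper: succeeds if the given alias appears in
# `hadoop credential list` output supplied on standard input.
# -i: ignore case (the listing prints aliases in lowercase)
# -F: treat the alias as a fixed string, so dots are not regex wildcards
# -w: require a whole-word match, so a truncated alias does not match
has_alias() {
  grep -q -i -F -w -- "$1"
}
```

For example: hadoop credential list -provider jceks://maprfs/user/mapr/hive.jceks | has_alias javax.jdo.option.ConnectionPassword && echo "alias present"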
The Hive configuration file must be updated with the CredentialProvider location so that Hive can connect to the metastore database.
Edit hive-site.xml and make the following changes:
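As a minimal sketch of such a change, assuming the keystore created earlier and Hadoop's hadoop.security.credential.provider.path property, the cleartext javax.jdo.option.ConnectionPassword property is removed from hive-site.xml and replaced with a pointer to the keystore:

```xml
<!-- Assumption: the existing javax.jdo.option.ConnectionPassword property
     (the cleartext password) has been deleted from this file. -->
<property>
  <!-- Tells Hadoop/Hive where to look up credential aliases. -->
  <name>hadoop.security.credential.provider.path</name>
  <!-- Keystore path used in the `hadoop credential create` step. -->
  <value>jceks://maprfs/user/mapr/hive.jceks</value>
</property>
```

With this in place, Hive should resolve the password through the alias stored in the keystore rather than from the configuration file.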
Restart the Hive Metastore Server and HiveServer2 for the configuration to take effect.
maprcli node services -name hivemeta -action stop -nodes hivemetastorehostname
maprcli node services -name hs2 -action stop -nodes hiveserver2hostname
maprcli node services -name hivemeta -action start -nodes hivemetastorehostname
maprcli node services -name hs2 -action start -nodes hiveserver2hostname
Check the logs to make sure that the Hive Metastore Server and HiveServer2 started without errors.
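One rough way to do this check is to scan the log for lines that look like errors. The sketch below is only an illustration: the helper name check_log and the pattern (ERROR, FATAL, Exception) are assumptions, and the location of the log files depends on your installation.

```shell
# Hypothetical helper: print suspicious lines from a log file.
# Returns 0 if none are found, 1 if any are found, 2 if the file is unreadable.
check_log() {
  log_file="$1"
  if [ ! -r "$log_file" ]; then
    echo "cannot read $log_file" >&2
    return 2
  fi
  # grep prints matching lines and exits 0 when at least one line matches.
  if grep -E 'ERROR|FATAL|Exception' "$log_file"; then
    return 1
  fi
  echo "no ERROR, FATAL, or Exception lines in $log_file"
}
```

Call it with the path to the metastore or HiveServer2 log file. A clean result is a good sign but not proof of a healthy service, so the next step of actually querying Hive still matters.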
Log on to Hive and list your databases and tables to make sure they are accessible.
NOTE: All Hive users need read permission (644) on the Java keystore file; without read access they cannot use the Hive shell.
Granting read access to all users is not a fully secure option, because anyone with read access can use the keystore file to get access to the database. If opening it up to end users is necessary for Hive shell access, you should at least limit the number of end users who can access it: create a separate group for Hive users and restrict keystore file access (640) accordingly.
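The group restriction can be sketched as follows, using the standard hadoop fs -chgrp and hadoop fs -chmod commands. The helper name restrict_keystore and the group name hiveusers are hypothetical, the group is assumed to exist already, and the commands are assumed to be run by the owner of the keystore file. HADOOP_CMD is only a convenience variable so the commands can be previewed without touching the cluster.

```shell
# Command used to talk to the file system; override for a dry run,
# e.g. HADOOP_CMD=echo to print the commands instead of running them.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Hypothetical helper: hand the keystore to a dedicated group and
# set mode 640 (owner read/write, group read, no access for others).
restrict_keystore() {
  group="$1"     # e.g. hiveusers (hypothetical group name)
  keystore="$2"  # e.g. /user/mapr/hive.jceks
  "$HADOOP_CMD" fs -chgrp "$group" "$keystore" &&
    "$HADOOP_CMD" fs -chmod 640 "$keystore"
}
```

For example, restrict_keystore hiveusers /user/mapr/hive.jceks would limit read access to the owner and members of hiveusers. Users outside that group would then no longer be able to use the Hive shell.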
For more information, please read the Hadoop Commands Guide.
In this blog post, you learned how to remove the need to keep the Hive metastore password in cleartext in the Hive configuration by pointing the configuration to use an encrypted keystore file. If you have any further questions, please ask them in the comments section below.