The metadata for Hive tables and partitions is stored in the Hive Metastore (for more information, see the Hive project documentation). By default, the Hive Metastore stores all Hive metadata in an embedded Apache Derby database in MapR-FS. Derby allows only one connection at a time; if you need multiple concurrent Hive sessions, you can use MySQL for the Hive Metastore instead. You can run the Hive Metastore on any machine that is accessible from Hive.
Configuring Hive for MySQL
Create the file hive-site.xml in the Hive configuration directory (/opt/mapr/hive/hive-<version>/conf) with the following contents:
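As a rough illustration of what such a file might contain, the sketch below writes a minimal hive-site.xml for a MySQL-backed metastore. The property names (javax.jdo.option.* and hive.metastore.uris) are standard Hive settings, but every value here is an assumed placeholder: the database host, port, name, user, password, and the metastore host are hypothetical, and the driver class shown is the one used by MySQL Connector/J 5.x. The script writes to a scratch directory by default so it can be tried safely.

```shell
#!/bin/sh
# Sketch: generate a minimal hive-site.xml for a MySQL-backed Hive Metastore.
# HIVE_CONF_DIR defaults to a scratch directory; on a real node, set it to
# /opt/mapr/hive/hive-<version>/conf.
HIVE_CONF_DIR="${HIVE_CONF_DIR:-./hive-conf-example}"

# Placeholder values (assumptions, not MapR defaults). Replace with your own.
DB_HOST="${DB_HOST:-localhost}"
DB_PORT="${DB_PORT:-3306}"
DB_NAME="${DB_NAME:-hive}"
DB_USER="${DB_USER:-hiveuser}"
DB_PASS="${DB_PASS:-changeme}"
METASTORE_HOST="${METASTORE_HOST:-localhost}"
METASTORE_PORT="${METASTORE_PORT:-9083}"

mkdir -p "$HIVE_CONF_DIR"

# The heredoc delimiter is unquoted so the variables above are expanded.
cat > "$HIVE_CONF_DIR/hive-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <!-- JDBC URL of the MySQL database that holds the metastore tables -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://${DB_HOST}:${DB_PORT}/${DB_NAME}?createDatabaseIfNotExist=true</value>
  </property>
  <!-- Driver class for MySQL Connector/J 5.x; newer connectors use a different class -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>${DB_USER}</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>${DB_PASS}</value>
  </property>
  <!-- Thrift URI that Hive clients use to reach the metastore service -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://${METASTORE_HOST}:${METASTORE_PORT}</value>
  </property>
</configuration>
EOF

echo "Wrote $HIVE_CONF_DIR/hive-site.xml"
```

The MySQL JDBC driver JAR must also be available on Hive's classpath for these settings to work; where it goes depends on your installation.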
- To connect to an existing MySQL metastore, make sure the ConnectionURL parameter and the Thrift URIs parameter in hive-site.xml point to the metastore's host and port.
To set a specific port for Thrift URIs, add the command export METASTORE_PORT=<port> to the file hive-env.sh (if hive-env.sh does not exist, create it in the Hive configuration directory). Example:
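One way to apply this step is sketched below. It creates hive-env.sh if it is missing and appends the export line only once, so the script can be re-run safely. The port 9083 and the scratch directory are assumed placeholder values; on a real node, HIVE_CONF_DIR would be the Hive configuration directory.

```shell
#!/bin/sh
# Sketch: set the metastore Thrift port in hive-env.sh.
# HIVE_CONF_DIR defaults to a scratch directory; PORT is a placeholder value.
HIVE_CONF_DIR="${HIVE_CONF_DIR:-./hive-conf-example}"
PORT="${PORT:-9083}"
ENV_FILE="$HIVE_CONF_DIR/hive-env.sh"

mkdir -p "$HIVE_CONF_DIR"

# Create hive-env.sh if it does not exist yet.
[ -f "$ENV_FILE" ] || touch "$ENV_FILE"

# Append the export only when no METASTORE_PORT line is present,
# so running the script twice does not duplicate the setting.
if ! grep -q '^export METASTORE_PORT=' "$ENV_FILE"; then
    echo "export METASTORE_PORT=$PORT" >> "$ENV_FILE"
fi

cat "$ENV_FILE"
```

If a METASTORE_PORT line already exists with a different value, this sketch leaves it untouched; edit the file by hand in that case.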
Start the Hive Metastore service using one of the following commands:
If you want the Hive Metastore to be managed by Warden, the maprcli, and the MCS:
If you want the Hive Metastore to be managed with standard hive commands:
You can also use nohup hive --service metastore to run the metastore in the background.
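A minimal sketch of the background invocation follows. Note that nohup by itself only keeps the process alive after the shell exits; the trailing & is what actually places it in the background. The log file path is a hypothetical choice, and the check for hive on the PATH is added here only so the script fails gracefully on a machine without Hive.

```shell
#!/bin/sh
# Sketch: run the Hive Metastore in the background and capture its output.
# LOG_FILE is a hypothetical location; choose any writable path.
LOG_FILE="${LOG_FILE:-./hive-metastore.log}"

if command -v hive >/dev/null 2>&1; then
    # nohup: ignore hangups so the service survives logout.
    # &: run in the background; stdout and stderr both go to the log file.
    nohup hive --service metastore >"$LOG_FILE" 2>&1 &
    METASTORE_STATUS="started with PID $!"
else
    METASTORE_STATUS="hive not found on PATH"
fi

echo "$METASTORE_STATUS"
```

Redirecting output explicitly avoids the default nohup.out file being created in whatever directory the command happens to be run from.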
If you have not configured a MySQL Metastore, do not run the Hive shell from a MapR NFS mount location. If you try to do this, Hive will fail. The same problem occurs if you use the hive-site.xml file to configure the Metastore on a MapR NFS mount location. Avoid both of these configurations.