Installing Hive

This topic includes instructions for using package managers to download and install Hive from the MEP repository.

For instructions on setting up the MEP repository, see Step 8: Install Ecosystem Components Manually.
Note: If you are installing Hive 2.1, you can use Tez instead of MapReduce to improve query performance. For more information, see Hive 2.1 and Tez 0.8. A link to the steps for configuring Hive on Tez is provided later in this procedure.
You can install Hive on a node in the MapR cluster or on a MapR client node. MapR does not support installing HiveServer2 (HS2) on a client node. If you install HS2 on a client node anyway, note that one or more required JAR files may not be installed during the installation of mapr-client. Copy the following JAR file from a resource manager node to the MapR client node:
/opt/mapr/hadoop/hadoop-<X.X.X>/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-<X.X.X>-mapr-<YYYY>.jar
Here:
X.X.X refers to the version (for example, 2.7.0 in hadoop-2.7.0)
YYYY refers to the release tag of the ecosystem component (for example, 1602)
Note: Copying the JAR file may allow you to work with Hive in non-secure mode.
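
As a minimal sketch of the copy step, the JAR path can be assembled from the two placeholders described above. The function name rm_jar_path, the host name rm-node.example.com, and the use of scp are illustrative assumptions and are not part of the MapR tooling; substitute the values that apply to your cluster.

```shell
#!/bin/sh
# Hypothetical helper: build the resource manager JAR path from the
# version (X.X.X) and the ecosystem release tag (YYYY).
rm_jar_path() {
    version="$1"   # for example, 2.7.0
    tag="$2"       # for example, 1602
    printf '/opt/mapr/hadoop/hadoop-%s/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-%s-mapr-%s.jar\n' \
        "$version" "$version" "$tag"
}

# Print the path for the sample values used in the text.
rm_jar_path 2.7.0 1602

# To copy the file, something like the following could be run on the
# client node (rm-node.example.com is a placeholder host name):
#   jar="$(rm_jar_path 2.7.0 1602)"
#   scp "rm-node.example.com:$jar" "$jar"
```

The copy command is left as a comment because the source host and the transfer method depend on your environment.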

See the Hive Release Notes for a list of fixes and new features.

Hive is distributed as the following packages:

Package Description
mapr-hive The core Hive package.
mapr-hiveserver2 The Hive package that enables HiveServer2 to be managed by the warden, allowing you to start and stop HiveServer2 using maprcli or the MapR Control System. The mapr-hive package is a dependency and will be installed if you install mapr-hiveserver2. At installation time, HiveServer2 is started automatically.
mapr-hivemetastore The Hive package that enables the Hive Metastore to be managed by the warden, allowing you to start and stop Hive Metastore using maprcli or the MapR Control System. The mapr-hive package is a dependency and will be installed if you install mapr-hivemetastore. At installation time, the Hive Metastore is started automatically.
mapr-hivewebhcat The Hive package that enables WebHCat to be managed by the warden, allowing you to start and stop WebHCat using maprcli or the MapR Control System. The mapr-hive package is a dependency and will be installed if you install mapr-hivewebhcat. At installation time, WebHCat is started automatically.

Make sure the environment variable JAVA_HOME is set correctly. Example:

export JAVA_HOME=/usr/lib/jvm/java-7-sun

You can set this environment variable by using the shell command line or by updating files such as /etc/profile or ~/.bash_profile. See the Linux documentation for more details about setting system environment variables.
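
One way to confirm that JAVA_HOME is set correctly is to check that it points to a directory containing an executable bin/java. The following is a minimal sketch; the function name check_java_home is a hypothetical helper and not a MapR or Hive command.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if JAVA_HOME is set and contains an
# executable bin/java.
check_java_home() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "No executable found at $JAVA_HOME/bin/java" >&2
        return 1
    fi
    echo "JAVA_HOME looks valid: $JAVA_HOME"
}

# Report the result without aborting the shell session.
check_java_home || echo "Fix JAVA_HOME before installing Hive" >&2
```

The check only verifies that the file exists and is executable; it does not verify the Java version.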

Note: The MapR cluster must be up and running before installing Hive.
Execute the following commands as root or using sudo.
  1. On each planned Hive node, install Hive packages.
    • To install Hive:
      On CentOS / RedHat
      yum install mapr-hive 
      On SUSE
      zypper install mapr-hive
      On Ubuntu
      apt-get install mapr-hive
    • To install Hive and HiveServer2:
      On CentOS / RedHat
      yum install mapr-hive mapr-hiveserver2
      On SUSE
      zypper install mapr-hive mapr-hiveserver2
      On Ubuntu
      apt-get install mapr-hive mapr-hiveserver2
    • To install Hive, HiveServer2, and HiveMetastore:
      On CentOS / RedHat
      yum install mapr-hive mapr-hiveserver2 mapr-hivemetastore
      On SUSE
      zypper install mapr-hive mapr-hiveserver2 mapr-hivemetastore
      On Ubuntu
      apt-get install mapr-hive mapr-hiveserver2 mapr-hivemetastore
    • To install Hive, HiveServer2, HiveMetastore, and WebHCat:
      On CentOS / RedHat
      yum install mapr-hive mapr-hiveserver2 mapr-hivemetastore mapr-hivewebhcat
      On SUSE
      zypper install mapr-hive mapr-hiveserver2 mapr-hivemetastore mapr-hivewebhcat
      On Ubuntu
      apt-get install mapr-hive mapr-hiveserver2 mapr-hivemetastore mapr-hivewebhcat
    Note: If you are using Derby as the underlying database for the metastore, do not install mapr-hiveserver2 and mapr-hivemetastore on the same node as mapr-hive. This configuration results in a Java runtime exception when you attempt to start the Hive CLI. To run Hive in embedded mode, do not start HiveServer2 and the Hive Metastore; instead, use the schemaTool to make sure the underlying database is consistent. In embedded mode, the Metastore is used as a library by the Hive CLI; the Hive CLI opens a connection and executes all queries against the Metastore database.
  2. Run configure.sh:
    /opt/mapr/server/configure.sh -R
  3. Set the following environment variables:
    • HIVE_HOME should be set to the Hive installation directory.
      export HIVE_HOME=/opt/mapr/hive/hive-<version>
    • PATH should include $HIVE_HOME/bin.
      export PATH=$PATH:$HIVE_HOME/bin

    You can set these system variables by using the shell command line or by updating files such as /etc/profile or ~/.bash_profile. See the Linux documentation for more details about setting system environment variables.
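
The install commands in step 1 differ only in the package manager and the package list. The following sketch prints the matching command rather than running it, so that it can be reviewed first. The function names hive_install_cmd and detect_pkg_manager are hypothetical helpers, and picking the first package manager found on PATH is an assumption about how a node might be probed.

```shell
#!/bin/sh
# Hypothetical helper: print the install command for a package manager
# and a list of Hive packages. It only prints; it does not install.
hive_install_cmd() {
    pm="$1"
    shift
    case "$pm" in
        yum)     echo "yum install $*" ;;
        zypper)  echo "zypper install $*" ;;
        apt-get) echo "apt-get install $*" ;;
        *)       echo "unsupported package manager: $pm" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print the first supported package manager on PATH.
detect_pkg_manager() {
    for candidate in yum zypper apt-get; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Show the commands for steps 1 and 2 on this node, if a supported
# package manager is present.
if detected="$(detect_pkg_manager)"; then
    hive_install_cmd "$detected" mapr-hive mapr-hiveserver2 mapr-hivemetastore
    echo "/opt/mapr/server/configure.sh -R"
fi
```

Run the printed commands as root or with sudo, as described at the start of the procedure.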

After Hive is installed, the executable is located at: /opt/mapr/hive/hive-<version>/bin/hive
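
Because the <version> part of the path depends on the installed release, the directory can be located instead of typed by hand. The following is a minimal sketch that assumes the /opt/mapr/hive base directory named above; the function name find_hive_home is a hypothetical helper. If several hive-<version> directories exist, it returns the last one in the shell's glob order, which is not necessarily the newest version.

```shell
#!/bin/sh
# Hypothetical helper: print the Hive installation directory found under
# a base directory (default: /opt/mapr/hive).
find_hive_home() {
    base="${1:-/opt/mapr/hive}"
    found=""
    for dir in "$base"/hive-*; do
        # Keep only directories that contain an executable bin/hive.
        if [ -x "$dir/bin/hive" ]; then
            found="$dir"
        fi
    done
    [ -n "$found" ] || return 1
    echo "$found"
}

# Export the variables from step 3 if an installation is found.
if HIVE_HOME="$(find_hive_home)"; then
    export HIVE_HOME
    export PATH="$PATH:$HIVE_HOME/bin"
    echo "Hive executable: $HIVE_HOME/bin/hive"
else
    echo "No Hive installation found under /opt/mapr/hive" >&2
fi
```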

Before running Hive queries with HiveServer2, you must perform one of the following tasks; otherwise, queries will fail:

To configure Hive on Tez, see Configuring Hive 2.1 and Tez 0.8.