MapR 5.0 Documentation : Installing Apache Drill 0.5 on MapR

Installation Overview

You can install and run Apache Drill on any number of nodes in your Hadoop cluster. Install the mapr-drill package on each node where you want to run Apache Drill. The mapr-drill package installs the Drillbit daemon and the Drill shell.

The Drillbit daemon is the core Drill service that runs on a node. Drill’s processing capacity increases with the number of Drillbit services running in a cluster. Each node running the Drillbit service can receive, plan, and execute queries sent from a client. Warden manages the Drillbit service, which simplifies installing and managing Apache Drill. For more information about Warden, refer to Warden in the MapR Architecture Guide.

The Drill shell is Drill's command-line interface: a pure-Java, console-based utility for connecting to relational databases and executing SQL commands.

Verify that your system meets all of the prerequisites before you install Apache Drill.

Prerequisites

To successfully install and run Drill, verify that the system meets all of the prerequisites in the following table: 

Prerequisite / Requirements

Oracle Java SE Development Kit

JDK 1.7

Operating System

MapR provides packages for the following 64-bit operating systems:

  • Red Hat 6.1-6.5, 7
  • CentOS 6.1-6.5, 7
  • Ubuntu 12.04, 14.04
  • SuSE 11 SP2

MapR Distribution for Hadoop

MapR distribution version 3.1.1 or 4.0.1. Verify that you have added the MapR repository to your system. On Red Hat and CentOS, the file maprtech.repo should exist in the /etc/yum.repos.d/ directory with the following content:

[maprtech]
name=MapR Core Components
baseurl=http://package.mapr.com/releases/<version>/<system>
enabled=1
gpgcheck=0
protect=1

[maprecosystem]
name=MapR Ecosystem Components
baseurl=http://package.mapr.com/releases/ecosystem-4.x/<system>
enabled=1
gpgcheck=0
protect=1

Note: The ecosystem URL differs for version 3.1.1 and 4.0.1.

  • http://package.mapr.com/releases/ecosystem-4.x/<system> (for MapR Version 4.x)
  • http://package.mapr.com/releases/ecosystem/<system> (for MapR Version 3.1.1)

For more information, refer to Installing MapR Software-Preparing Packages and Repositories.

Hive (optional)

0.12

HBase (optional)

You can run Drill against the following versions of HBase:

  • 0.94.17
  • 0.94.21

You cannot run Apache Drill 0.5.0 against HBase 0.98.x. If you install HBase packages from MapR's ecosystem-4.x or ecosystem-all repositories, you currently get HBase 0.98.x by default. In that case, uninstall HBase 0.98.x and then install HBase version 0.94.17 or 0.94.21.

Refer to the Apache Drill 0.5.0 Release Notes for a list of known issues.
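
The repository and HBase rules above can be expressed as a small shell sketch. It is only an illustration: the function names ecosystem_url and hbase_supported are hypothetical and not part of any MapR tool, and the second argument stands in for the <system> placeholder used in maprtech.repo.

```shell
#!/bin/sh
# Illustrative helpers only; these names are not part of MapR or Drill.

# ecosystem_url <mapr_version> <system>
# Prints the ecosystem repository URL that matches the MapR version.
# <system> is the same placeholder that appears in maprtech.repo.
ecosystem_url() {
    version=$1
    system=$2
    case "$version" in
        4.*)   echo "http://package.mapr.com/releases/ecosystem-4.x/$system" ;;
        3.1.1) echo "http://package.mapr.com/releases/ecosystem/$system" ;;
        *)     echo "unsupported MapR version: $version" >&2; return 1 ;;
    esac
}

# hbase_supported <hbase_version>
# Succeeds only for the HBase versions listed as usable with Drill 0.5.0.
hbase_supported() {
    case "$1" in
        0.94.17|0.94.21) return 0 ;;
        *)               return 1 ;;
    esac
}
```

For example, ecosystem_url 4.0.1 <system> prints the ecosystem-4.x URL, and hbase_supported 0.98.4 fails, which signals that HBase must be downgraded before running Drill against it.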

Installing Apache Drill

Complete the following steps to install Apache Drill: 

  1. Issue the following command to install the mapr-drill package on a node:

    Red Hat and CentOS
    $ sudo yum install mapr-drill
    Ubuntu
    $ sudo apt-get install mapr-drill
    SuSE
    $ sudo zypper install mapr-drill
  2. (Optional) To run Apache Drill against MapR-DB, add the hbase.scan.sizecalculator.enabled property to the drill.exec block of the /opt/mapr/drill/drill-<version>/conf/drill-override.conf file, and set the property to false.

    Example
    drill.exec: {
      cluster-id: "my.cluster.com-drillbits",
      zk.connect: "centos21:5181,centos22:5181,centos23:5181",
      hbase.scan.sizecalculator.enabled: "false"
    }

    Do not include the property in drill-override.conf if you run Apache Drill against HBase. 

    Apache Drill does not require working sets to fit in memory at query execution time, and Drill's default memory settings should suffice for most use cases. Some workloads, however, benefit significantly from more memory. To change Drill's memory settings, including the initial and maximum heap sizes, edit /opt/mapr/drill/drill-<version>/conf/drill-env.sh.

  3. Run configure.sh to refresh the node configuration.

    Example
    /opt/mapr/server/configure.sh -R 
  4. Verify that the Drillbit service is running on the node. 

    1. You can issue the following command to verify the status of the Drillbit service from the command line:

      jps

      You should see Drillbit as one of the services listed. If you do not see Drillbit in the list, you can issue the following maprcli command to start the Drillbit service on the node:

      maprcli node services -name drill-bits -action start -nodes <node_ip_address>
    2. Alternatively, log in to the MapR Control System (MCS) at https://<ip_address>:8443 and verify the status of the Drillbit service there.
  5. Repeat the installation process on every other node where you want to run Apache Drill.
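
Steps 1 and 4 lend themselves to a short shell sketch. The version below is a hedged illustration, not a MapR-supplied script: the function names are hypothetical, the platform keywords (redhat, centos, ubuntu, suse) are an assumed naming scheme, and the functions only print the commands from the steps above rather than running them. The Drillbit check assumes that jps prints one "<pid> <class name>" pair per line.

```shell
#!/bin/sh
# Illustrative helpers only; these names are not part of MapR or Drill.

# install_command <platform>
# Prints the package-manager command from step 1 for the given platform.
install_command() {
    case "$1" in
        redhat|centos) echo "sudo yum install mapr-drill" ;;
        ubuntu)        echo "sudo apt-get install mapr-drill" ;;
        suse)          echo "sudo zypper install mapr-drill" ;;
        *)             echo "unsupported platform: $1" >&2; return 1 ;;
    esac
}

# drillbit_listed
# Reads jps output on stdin and succeeds if a line ends in " Drillbit".
drillbit_listed() {
    grep -q ' Drillbit$'
}

# start_command <node_ip_address>
# Prints the maprcli command from step 4 that starts the Drillbit service.
start_command() {
    echo "maprcli node services -name drill-bits -action start -nodes $1"
}
```

On a node, piping jps into drillbit_listed reports whether the service is up. If the check fails, start_command <node_ip_address> prints the maprcli command to run.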
