Configuring Drill

Drill is highly configurable. This document focuses on MapR-related configurations and refers to the open source Apache Drill documentation for generic information. Key things to configure are:
Drill memory
Determine the amount of heap and direct memory allocated to a Drillbit for query processing in a Drill cluster. See Configuring Drill Memory.
Parquet block size
Change the Parquet block size to match the MapR file system chunk size. See Configuring the Parquet Block Size.
Resources for a shared drillbit
Configure queues and parallelization to support multiple users sharing a drillbit, and to support separate drillbits running on different nodes in the cluster. See Configuring Resources for a Shared Drillbit.
Multitenancy
Configure a multitenant cluster to account for resources required for Drill. See Configuring a Multitenant Cluster.
User Impersonation
Configure impersonation to allow a service to act on behalf of a client while performing the action requested by the client. See User Impersonation.
User authentication and encryption
Configure user authentication when you want the identity of a user proven before the user accesses a process running on a system. See MapR Security (Tickets).
SSL/TLS for Encryption
Enable and configure SSL/TLS for encryption when you need to use Plain authentication. See Using SSL/TLS for Encryption.
Drill impersonation with Hive authorization
Configure Drill impersonation to work with Hive impersonation to authorize access to metadata in the Hive metastore repository and data in the Hive warehouse. See User Impersonation.
Volumes to use for spooling
Use the drill.exec.sort.external.spill.directories option to set MapReduce volumes or local volumes for spooling to improve performance and stripe data across as many disks as possible.
Persistent configuration storage
See Persistent Configuration Storage and Configuring the ZooKeeper PStore Location.
Access rights
Configure access rights if you have 777 file-level permissions to a table and a query returns no results. See Configuring Access Rights below.
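
For the spooling entry in the list above, the following is a minimal sketch of how the spill directories might be set in drill-override.conf. The directory paths are hypothetical placeholders, and the companion drill.exec.sort.external.spill.fs setting is included on the assumption that the directories live in the MapR file system rather than on local disk; check both against the Drill version you run.

```
drill.exec: {
  # Hypothetical spill locations; replace with volumes that exist in your cluster.
  # Listing several directories lets Drill spread spill files across more disks.
  sort.external.spill.directories: ["/drill/spill/vol1", "/drill/spill/vol2"],
  # File system that hosts the directories above (assumption: MapR file system;
  # use "file:///" for local disks).
  sort.external.spill.fs: "maprfs:///"
}
```

Changes to drill-override.conf generally take effect only after the Drillbit on each node is restarted.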

Drill typically runs alongside other workloads, including the following:

  • MapReduce
  • YARN
  • Hive and Pig
  • Spark

You need to plan and configure these resources for use with Drill and other workloads:

  • Memory
  • CPU
  • Disk

Configuring Access Rights

If the security policy in your organization limits access to MapR Database tables, you might have trouble querying them. If you have 777 file-level permissions to a table, yet a query returns no results, you might need to add your user name to the maprcli access control list (ACL).
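
As a rough sketch of that step, the commands below first show the current cluster-level ACL and then add a user to it. The user name jsmith and the login permission code are illustrative placeholders, not values taken from this document; which permission actually resolves the empty query result depends on your security setup, so treat the second command as an assumption to adapt.

```
# Show the current cluster-level ACL.
maprcli acl show -type cluster

# Add a user to the cluster ACL.
# jsmith and login are placeholders; substitute your user name and the
# permission code that your security policy allows.
maprcli acl edit -type cluster -user jsmith:login
```

Running the show command again afterward is a simple way to confirm that the entry was added.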