Apache Drill is a distributed system for interactive analysis of large-scale datasets. Drill is similar to Google’s Dremel, with the additional flexibility needed to support a broader range of query languages, data sources and data formats, including nested, self-describing data.
Drill offers the following benefits:
MapR is recognized as the leading Hadoop innovator and is dedicated to providing the best big data processing capabilities. MapR is committed to a highly transparent, open source project so that the best architecture can be put in place to ensure a high-quality, flexible solution. This includes developing and defining open APIs to ensure a robust ecosystem. Apache Drill represents a huge leap forward for organizations looking to augment their big data processing with interactive queries across massive data sets, with a focus on schema-less and nested data, which is an unmet need in the SQL-on-Hadoop market today. Driving Drill as an open source project reduces the barriers to adopting a new set of big data APIs.
Drill provides a distributed execution engine for interactive queries. HBase is a supported data source for Drill.
Today these systems compile higher-level languages (e.g., HiveQL, Pig Latin) into MapReduce jobs. Once Drill is available, these systems may support Drill as an underlying low-latency execution engine, enabling interactive queries across billions of records. Chris Wensel, the author of Cascading, is collaborating with MapR on this project and is one of the initial committers.