Today, we are extremely excited and proud to announce the general availability (GA) of Apache Drill 1.0, as part of the MapR Distribution. Congratulations to the Drill community on this significant milestone and achievement!
Incubated in September 2012 as an Apache project, Drill started with an ambitious goal: to provide a low-latency SQL engine for the modern big data era by combining the familiarity of relational databases with the scale and agility of Hadoop/NoSQL systems. With nearly two and a half years of solid engineering effort building this next-generation SQL technology, and backed by a strong community of ~50 contributors and thousands of users and customers across various industries, Apache Drill has quickly evolved to become the most flexible SQL query engine in the big data ecosystem since the Beta release in September 2014. Built from the ground up to support interactive queries on a variety of complex/multi-structured datasets at TB/PB scale, Drill opens up new frontiers for the innovation required to make Hadoop and big data accessible to a broader set of users in a self-service fashion, leveraging the ANSI SQL skill sets and tools already abundant as part of an organization’s BI/analytics infrastructure.
A look back at the journey
Here is a quick snapshot of the momentum of the Apache Drill project, along with some notable milestones along the way. The project has been on the fast track in the nine months since the developer preview in August 2014, delivering seven significant iterative releases, each adding exciting new features and, most importantly, improving the stability, scale, and performance required for broader enterprise deployments. Overall, 2200+ JIRAs have been resolved in this effort. Thanks to this progress, numerous customers have used and experienced the value of Drill and have rolled it into production. Drill also graduated to an Apache Top-Level Project during the journey, and is recognized as the top-rated SQL-on-Hadoop technology by industry analysts.
Now what is Drill all about?
Drill is all about providing flexibility without compromising performance. There are two core aspects critical to achieving this:
Second, Drill is a scale-out, columnar execution engine designed for low-latency queries on petabytes of data. Depending on the number of users an organization must support, the amount of data it queries, and the SLAs it must meet, Drill can scale from a single node to thousands of nodes.
With Drill, users can now get to their data in minutes rather than endure weeks or months of data preparation/ETL cycles, and they can open up new, complex/multi-structured data that they couldn’t get to before, all by leveraging the ANSI SQL skill sets and BI tools already available in the organization.
Here’s a quote from an industry analyst that we believe precisely summarizes the value of Drill:
“Drill isn’t just about SQL-on-Hadoop. It’s about SQL-on-pretty-much-anything, immediately, and without formality.” - Andrew Brust, Gigaom Research, January 2015
Features at a glance
Here is a brief list of Apache Drill features available in the 1.0 release:
For the improvements specific to Drill 1.0, refer to the Apache release blog post here.
Drill expands the spectrum of BI use cases by providing the ability to get value from all of the raw datasets available in an organization, wherever they reside. The ability to explore and ask ad hoc questions on full-fidelity data, in its native format as it comes in, is what sets Drill apart from traditional SQL technologies, which solve only part of the puzzle by working only with centrally structured data. The BI/analytics use cases that Drill enables include self-service raw data exploration and complex IoT/JSON data analytics, as well as ad hoc queries on Hadoop-powered enterprise data hubs.
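To make the idea of querying raw data in place a little more concrete, here is a minimal sketch of submitting an ad hoc SQL query over a JSON file through Drill's REST interface. It assumes a Drillbit is reachable at `localhost:8047` and that the `dfs` storage plugin is enabled; the file path `/tmp/orders.json` and the helper names `build_query_request` and `run_query` are hypothetical and exist only for illustration.

```python
import json
import urllib.request

# Assumption: a local Drillbit exposing its REST interface on port 8047.
DRILL_URL = "http://localhost:8047/query.json"

# Hypothetical raw JSON file, queried in place with no schema definition or ETL.
SQL = "SELECT * FROM dfs.`/tmp/orders.json` LIMIT 5"


def build_query_request(sql, url=DRILL_URL):
    """Build (but do not send) the POST request that carries a SQL query."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL):
    """Send the query to Drill and return the parsed JSON response."""
    with urllib.request.urlopen(build_query_request(sql, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires a running Drillbit; prints whatever the server returns.
    print(json.dumps(run_query(SQL), indent=2))
```

The request construction is kept separate from the network call so the payload can be inspected without a running cluster. In practice, most BI tools would connect through Drill's ODBC/JDBC drivers rather than REST.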
The road ahead
The 1.0 GA release is just the beginning of the next phase of the journey. With the solid foundation laid by the GA release, the Drill community is planning to add new, exciting features in a variety of areas such as JSON, complex data functions, new file formats, and SQL. The project will also continue its momentum of iterative releases every 4-6 weeks. For a detailed roadmap that shows what’s coming in 2015, please refer to this blog post.
Getting started with Drill is extremely easy, and there are numerous resources available to help. Here are some useful links:
Congratulations again to the Drill community on this accomplishment. We’re looking forward to continued game-changing innovation that is shaping the future of scalable big data access in enterprises.