MapR Converges SQL and JSON with Apache Drill v1.6
April 06, 2016
Converged Data Platform unifies MapR Database document database with ANSI SQL engine to unlock insight via industry-standard BI tools
MapR Technologies, Inc., provider of the industry’s only Converged Data Platform, today announced the availability of Apache Drill 1.6 as the unified SQL layer for the MapR Converged Data Platform via tighter integration with MapR-DB. Customers and partners benefit from the flexibility of reporting and analytics on JSON data stored in MapR Database tables, realizing faster time-to-value with insights gleaned from operational data.
According to Hadoop Weekly*, “The Apache Drill project has one of the fastest release velocities in the Hadoop ecosystem with a new release nearly every month.” Version 1.6 of Apache Drill, which is now available on the MapR Converged Data Platform, offers a new MapR Database document database plugin, enhanced performance and scale, and an optimized experience with Tableau and other BI tools.
Interest in and adoption of Drill, which was recognized as one of the best in open source big data technologies, continue to grow. Thousands of users have downloaded Drill, and numerous organizations have it in production, interactively analyzing up to petabytes of data. Additionally, over 6,000 BI analysts and developers worldwide have completed Drill training courses provided by the free On-Demand Training program from MapR.
“Apache Drill is a game changer for us,” said Edmon Begoli, CTO of PYA Analytics. “Most recently, we have been able to query, in under 60 seconds, two years’ worth of flat PSV files of claims, billing, and clinical data from commercial and government entities, such as the Centers for Medicaid and Medicare Services. Drill has allowed us to bypass the traditional approach of ETL and data warehousing, convert flat files into efficient formats such as Parquet for improved performance, and use plain SQL against very large volumes of files.”
Highlights of Drill 1.6 include:
- Flexible and operational analytics on NoSQL – The new MapR Database document plugin allows analysts to perform SQL queries directly on JSON data stored in MapR Database tables. The plugin offers a variety of pushdown capabilities to provide an optimal interactive experience.
- Enhanced query performance – Provides better query performance on data in Hadoop and NoSQL systems via numerous query planning improvements, such as partition pruning and metadata caching. Delivers up to 10-60X performance gains in query planning compared to previous releases of Drill.
- Better memory management – Delivers greater stability and scale, enabling customers to run both larger and more numerous SQL workloads on a MapR cluster.
- Improved integration with visualization tools like Tableau – Offers metadata query performance improvements and introduces client impersonation for end-to-end security from the visualization tool to data in Hadoop. Version 1.6 also provides enhanced SQL window functions.
Drill supports a variety of use cases. For example, media companies can instantly query and analyze incoming content delivery network (CDN) files without requiring data transformations, allowing them to analyze several terabytes of CDN logs and reduce customer attrition. High-tech chip manufacturers can develop offerings that allow them to better analyze dropped calls and provide that information to their handheld device partners, thereby improving quality of service. Communications providers can instantly query and analyze logs from cell towers, enabling mobile operators to proactively monitor and improve subscriber experience.
“Operational analytics on document databases such as MapR Database is a rapidly growing use case,” said Neeraja Rentachintala, senior director, Product Management, MapR Technologies. “For the first time, there is a stack that allows BI developers and business analysts to store and query data in native formats without cumbersome ETL or transformation, providing end-to-end flexibility and scale.”
*Hadoop Weekly #162, March 20, 2016
About MapR Technologies
MapR Technologies is a visionary Silicon Valley software company and creator of the next-generation data platform for AI and analytics, with the scale and reliability required by enterprise-grade, mission-critical deployments. The MapR Data Platform delivers the power of dataware to accelerate data-driven innovation. Forward-leaning companies such as Cisco, Philips, and Société Générale are able to create new data-driven solutions to outperform the competition. Learn more: mapr.com.
MapR is a registered trademark of MapR Technologies, Inc. in the United States and other countries. Other names and brands may be the property of others.