Drill (MEP 6.2.0) Release Notes

This section provides reference information for Drill, including new features, improvements, resolved issues, known issues, and limitations.

These release notes contain MapR-specific information and are not necessarily cumulative. For information about how to use the release notes, see Ecosystem Component Release Notes.

The following release notes apply to the version of the Drill component:

Release Date: May 2019
MapR Version Interoperability: See Component Versions for Released MEPs.
Package Names: Navigate to https://package.mapr.com/releases/MEP/ and select your MEP and OS to view the list of package names, for example:
  • mapr-drill-
  • mapr-drill-internal-
  • mapr-drill-yarn-

New in this Release

Drill includes new features and improvements in the following areas:
Configuration Options
  • A new Drill configuration option, store.hive.maprdb_json.read_timestamp_with_timezone_offset, enables Drill to read timestamp values with a timezone offset when using the Hive plugin with the Drill native MapR-DB JSON reader enabled. This option is disabled by default. (MD-5272)
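
As a minimal sketch of how this option might be turned on, the Python snippet below builds the corresponding ALTER SESSION (or ALTER SYSTEM) statement and wraps it in the JSON body that Drill's REST query endpoint (POST /query.json) accepts. The helper names build_set_option_sql and build_rest_payload are hypothetical and exist only for this illustration; the snippet does not contact a Drillbit, and it assumes the option is a boolean because the note above describes it as disabled by default.

```python
import json

# Option name taken from the release note above (MD-5272).
OPTION = "store.hive.maprdb_json.read_timestamp_with_timezone_offset"


def build_set_option_sql(option: str, enabled: bool, scope: str = "SESSION") -> str:
    """Return an ALTER SESSION/SYSTEM statement that sets a boolean Drill option.

    Hypothetical helper for illustration; it only formats a string.
    """
    if scope not in ("SESSION", "SYSTEM"):
        raise ValueError("scope must be SESSION or SYSTEM")
    value = "true" if enabled else "false"
    # Backticks quote the dotted option name as a single identifier.
    return f"ALTER {scope} SET `{option}` = {value}"


def build_rest_payload(sql: str) -> str:
    """Wrap a statement in the JSON body used by Drill's REST query endpoint."""
    return json.dumps({"queryType": "SQL", "query": sql})


if __name__ == "__main__":
    sql = build_set_option_sql(OPTION, enabled=True)
    print(sql)
    print(build_rest_payload(sql))
```

The generated statement can also be pasted into SQLLine or the Web UI query page as-is. A session-scoped setting affects only the connection that issues it, so a client that opens a new session for each request would likely need the SYSTEM scope instead.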
Web UI
Several Web UI improvements, including:
SQLLine (Drill shell)
  • Upgrade to Calcite 1.18.0. (MD-5050)

For a list of additional features and improvements, see the Apache Drill 1.16 release notes.

Resolved Issues

Drill includes the following resolved issues and improvements:
MapR Tracking Number Resolved Issue
MD-5673 DRILL-7150: Drill timestamp timezone conversion uses current daylight savings time instead of the one active during timestamp date
MD-5647 DRILL-7118: Filter not getting pushed down on MapR-DB tables.
MD-5638 DRILL-7130: IllegalStateException: Read batch count [0] should be greater than zero
MD-5630 DRILL-7113: Drill on MapRDB can not understand null value
MD-5624 DRILL-7125: REFRESH TABLE METADATA fails after upgrade from Drill 1.13.0 to Drill 1.15.0
MD-5623 Unable to connect to Drill 1.15 through ZK
MD-5609 DRILL-7079: Drill can't query views from the S3 storage when plain authentication is enabled
MD-5606 DRILL-7100: parquet RecordBatchSizerManager : IllegalArgumentException: the requested size must be non-negative
MD-5561 DRILL-7060: Query on audit logs fails by DATA_READ ERROR Error Parsing JSON - Unrecognized character escape 'S' (code 83)
MD-5552 DRILL-7119: Modify selectivity calculations to use histograms
MD-5550 DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option
MD-5523 Physical plan generation failure after upgrade from 1.10 to 1.14
MD-5490 DRILL-7117: Support creation of column Histograms for numeric data types
MD-5428 Include links to pre and post procedures in Drill upgrade documentation
MD-5379 DRILL-7018: Drill Query (when store.parquet.reader.int96_as_timestamp=true) on Parquet File fails with Error: SYSTEM ERROR: IndexOutOfBoundsException: readerIndex: 0, writerIndex: 372 (expected: 0 <= readerIndex <= writerIndex <= capacity(256))
MD-5374 DRILL-6971: Display query state in query result page of Web UI
MD-5369 DRILL-7115: Improve Hive schema show tables performance
MD-5368 DRILL-4858: repeated_count on JSON array of objects (maps) implementation is missing in Drill 1.14
MD-5363 DRILL-4858: Missing function implementation: [repeated_count(LIST-REPEATED)]
MD-5356 DRILL-4858: Implement - Missing function implementation: [repeated_count(MAP-REPEATED)].
MD-5348 DRILL-6997: TPCDS queries 56, 60, 83 are slower with plan change
MD-5330 DRILL-6967: TIMESTAMPDIFF returns incorrect value for SQL_TSI_QUARTER
MD-5319 DRILL-6997: TPCDS query 95 slower with plan change
MD-5278 DRILL-6931: Drill "SHOW FILES" command duplicates empty S3 folders as subfolders
MD-5277 DRILL-6928: exec.query.return_result_set_for_ddl does not affect Web UI query results
MD-5272 DRILL-6969: Drill on maprdb native reader reads a wrong timezone comparing to hive
MD-5253 to_timestamp function is losing precision for milliseconds
MD-5251 DRILL-6894: CTAS and CTTAS are not working on S3 storage when cache is disabled
MD-5236 DRILL-7023: Tableau query fails with IndexOutOfBoundsException after upgrade from Drill 1.13.0 to Drill 1.14.0
MD-5226 DRILL-6918: Querying empty topics fails with "NumberFormatException"
MD-5198 DRILL-6880: TPCDS query 35 slower due to nulls
MD-5179 DRILL-6874: CTAS from json to parquet is not working on S3 storage
MD-5095 DRILL-7051: Update Drill's Jetty Server to 9.3
MD-4863 Simba JDBC driver does not return some values
MD-4862 Simba JDBC driver returns incorrect time value
MD-4826 The COALESCE function returns results when the columns referenced in the function do not exist in the files being queried. You do not have to CAST the columns to a specific data type for the COALESCE function to return results.
MD-4617 "direct.used" metrics (jvm_direct_current) doesn't catch the direct memory usage.
MD-4362 Query on data containing reserved word 'date' as column name fails to generate non-covering index plan
MD-3723 Querying HBase row_key column with non-existing column returns different results in different Drill versions
MD-1585 Need More Accurate Filter Estimation Before Running a Query
MD-1008 DRILL-7038: Performance - Queries on partitioned columns currently scan the entire datasets
MD-880 HashJoin's not fully parallelized in query plan
MD-680 DRILL-7069: Planning time unaccounted for query with longer planning time

Known Issues

Drill has the following known issues:
MapR Tracking Number Known Issue
MD-5792 TPCH query 5 runs 10-20% slower at sf100/sf1000, possibly due to hash join ordering
MD-5786 TPCDS query 98 is 2x slower with Statistics enabled due to hash join order for sf100 and sf1000
MD-5782 Need better error message when analyze command fails due to schema change
MD-5770 TPCH query 9 runs 18% slower at sf100/sf1000, possibly due to hash join
MD-5758 TPCDS query 78 runs 30x slower with Statistics enabled at sf100
MD-5755 DirectScan lists all partitions in explain plan, even for full table scan
MD-5744 DRILL-7216: Auto limit is applied in the Drill Web UI even when the limit check box is unchecked
MD-5740 REFRESH TABLE METADATA does not count null values for decimal, varchar, and interval data types.
MD-5694 The first query to use a new metadata cache file may take a while to run because the first query triggers a refresh of the metadata cache file.
MD-5684 Drill timeout when querying a large number of files
MD-5676 Drill parquet file may not have statistics for decimal and varchar data types.
MD-5608 Running analyze command on a view fails correctly but the error is confusing
MD-5528 Computing statistics on nonexistent columns fails with an exception
MD-5388 Running the analyze command on duplicate column names results in IndexOutOfBoundsException
MD-5371 Error message is not clear when the analyze command is run on a table with complex types
MD-5342 DRILL-6839: regarding aggs in cross join queries