Hive 2.3.3-1904 (MEP 6.2.0, MEP 6.1.1, and MEP 6.0.2) Release Notes

This section provides reference information, including new features, patches, known issues, and limitations for Hive 2.3.3-1904.

These release notes contain only MapR-specific information and are not necessarily cumulative in nature. For information about how to use the release notes, see Ecosystem Component Release Notes.

Hive Version 2.3.3
Release Date April 2019
MapR Version Interoperability See Hive and HCatalog Support Matrix, Ecosystem Support Matrix, and MEP Components and OS Support.
Source on GitHub https://github.com/mapr/hive
GitHub Release Tag 2.3.3-mapr-1904

Maven Artifacts See Maven Artifacts for MapR.
Package Names Navigate to https://package.mapr.com/releases/MEP/, and select your MEP and OS to view the list of package names.
ODBC/JDBC Drivers
Hive 2.3.3 works with the following MapR Hive drivers:

For additional driver information, see Connecting to HiveServer2.

Feature Support

  • MEP 6.1.0 supports Hive-2.3.3 on Tez-0.9.

    For more information, see Tez 0.9.1-1904 (MEP 6.2.0, MEP 6.1.1, and MEP 6.0.2) Release Notes.

  • MEP 6.1.0 does not support Hive on Spark, so you cannot use Spark as an execution engine for Hive.

    However, you can run Hive and Spark on the same cluster. You can also use Spark SQL and Drill to query Hive tables.

  • MEP 6.1.0 does not support HDFS encryption in Hive tables.
  • MEP 6.1.0 does not support HBase with Hive-2.3.3. HBase has been unsupported starting from the 6.0.0 release.
  • MEP 6.0.0 does not support LLAP with Hive-2.3.3, because Apache Slider is not a MapR ecosystem component.
  • Hive 2.1 and later require you to run the schematool command as an initialization step.

New Features

  • MapR Database JSON projection pushdown.
  • The metrics report file /tmp/hive_report.json is split into two files: /tmp/hiveserver2_report.json for HiveServer2 and /tmp/hivemetastore_report.json for Hive Metastore.

Changes in Security with Default Configuration

  • Added the following properties to the hive-site.xml configuration by default on a secured cluster:
    Table 1. Properties added by default to hive-site.xml
    Property Value
    hive.server2.metrics.file.location /tmp/hiveserver2_report.json
    hive.metastore.metrics.file.location /tmp/hivemetastore_report.json
  • Removed the following property from the hive-site.xml configuration by default on a secured cluster:
    Table 2. Properties removed by default from hive-site.xml
    Property Value
    hive.service.metrics.file.location /tmp/hive_report.json
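
For illustration, the defaults in Table 1 correspond to hive-site.xml entries along the following lines. This is a sketch that assumes the standard Hadoop configuration/property layout; only the property names and values come from the tables above.

```xml
<configuration>
  <!-- Metrics report written by HiveServer2 -->
  <property>
    <name>hive.server2.metrics.file.location</name>
    <value>/tmp/hiveserver2_report.json</value>
  </property>
  <!-- Metrics report written by Hive Metastore -->
  <property>
    <name>hive.metastore.metrics.file.location</name>
    <value>/tmp/hivemetastore_report.json</value>
  </property>
</configuration>
```

A hive-site.xml that still sets hive.service.metrics.file.location is carrying the entry listed in Table 2, which is no longer part of the default secured configuration.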

API Changes

The following classes have moved from hive-maprdb-json-handler-2.3.3-mapr-XXXX.jar to hive-exec-2.3.3-mapr-XXXX.jar:

  • org.apache.hadoop.hive.maprdb.json.shims.DocumentWritable
  • org.apache.hadoop.hive.maprdb.json.shims.MapRDBJsonSplit
  • org.apache.hadoop.hive.maprdb.json.shims.MapRDBProxy
  • org.apache.hadoop.hive.maprdb.json.shims.RecordReaderWrapper
  • org.apache.hadoop.hive.maprdb.json.shims.RecordWriterWrapper
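
For projects that compile against these classes, the practical effect is that the build needs the hive-exec jar on the classpath. The fully qualified class names listed above stay the same. A hedged Maven sketch follows. It assumes the MapR build keeps the standard Apache Hive coordinates (org.apache.hive:hive-exec) and that the artifact is resolved from a MapR Maven repository. The XXXX suffix is kept exactly as it appears in the jar names above.

```xml
<!-- Assumption: standard Apache Hive coordinates; verify against Maven Artifacts for MapR. -->
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <!-- XXXX is the elided build suffix from the jar name; replace it with the suffix for your MEP. -->
  <version>2.3.3-mapr-XXXX</version>
</dependency>
```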

Known Issues

  • HIVE-19502: You cannot insert values into a table stored by JdbcStorageHandler.
  • HIVE-19286: A NullPointerException (NPE) occurs in the MERGE operator in MR mode.
  • Bug 32349: Simple fetch from MapR Database JSON tables does not work. Workaround: Set hive.fetch.task.conversion=none in the hive-site.xml file or in the Hive CLI.
  • The hive.fetch.task.conversion property controls which SELECT queries are converted to a single FETCH task to minimize latency. To qualify, a query must read from a single source with no subqueries, and must not contain aggregations or DISTINCT (which incur a reduce sink, RS), lateral views, or joins. Supported values:
    • none: Disables fetch task conversion.
    • minimal: SELECT *, filters on partition columns, and LIMIT only.
    • more: SELECT, filters, and LIMIT only (supports TABLESAMPLE and virtual columns).
  • The Hive vectorized execution feature has many bugs in Hive 2.x. Turn this feature off at the system level and enable it only for specific queries that are known to work with it. Evaluate the benefit of this feature against the potential stability issues on a case-by-case basis.
  • Spark does not support SSL encryption for the metastore when hive.metastore.use.SSL is set to true.
    • SPARK-533: Spark Thriftserver fails with LDAP+KERBEROS+SSL configuration.
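
The Bug 32349 workaround and the vectorization recommendation above can both be expressed as hive-site.xml entries. The fragment below is a sketch under stated assumptions. The hive.fetch.task.conversion property and its value come from the workaround above. The hive.vectorized.execution.enabled property is the Apache Hive setting that toggles vectorized execution; it is not named in these notes, so confirm it against the configuration reference for your Hive version.

```xml
<configuration>
  <!-- Workaround for Bug 32349: disable fetch task conversion -->
  <property>
    <name>hive.fetch.task.conversion</name>
    <value>none</value>
  </property>
  <!-- Turn vectorized execution off at the system level (property name assumed from Apache Hive) -->
  <property>
    <name>hive.vectorized.execution.enabled</name>
    <value>false</value>
  </property>
</configuration>
```

Queries that are known to work with vectorization can then opt back in for a single session by setting the same property to true in the Hive CLI, which leaves the system-wide default untouched.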

Patches

This release includes the following patches on the base Apache release. For complete details, refer to the commit log for this project in GitHub.

Commit Date (YYYY-MM-DD) Comment
4d38efc, 787b53a 2019-05-17 MAPR-HIVE-499 : Most information is lost when hive log4j2 routing appender rotated logs
4a9bbca 2019-04-16 MAPR-HIVE-507 : hive.metastore.use.SSL should be set to false by default
e379881 2019-03-27 MAPR-HIVE-462: INSERT OVERWRITE LOCAL DIRECTORY fails with permission error in hive-2.3
715c0e7 2019-04-15 MAPR-HIVE-500 : CLONE - Hive timers metrics availability is not constant
e12fa76 2019-04-12 MAPR-HIVE-503 : Rename webhcat.pid to hive-mapr-webhcat.pid
68d2f5f 2019-04-08 MAPR-HIVE-476 : There is no encryption between client(HS2) and HMS server while working through maprsasl security
0f3fbbf 2019-03-29 MAPR-HIVE-485 : Disable vectorized execution in Hive by default
c6e95ae 2019-03-23 MAPR-HIVE-482 : HS2 takes time to start because of the 'get_all_databases'
c8a4882 2019-03-19 MAPR-HIVE-471 : Distribute Notice.txt across components starting with MEP 6.2
a062e71 2019-03-14 MAPR-HIVE-474 : Webhcat SSL doesn't have a valid keystore
b3e7287 2019-03-13 MAPR-HIVE-475 : Change tez version in Hive-2.3 to 0.9.1
f0c1053 2019-03-12 MAPR-HIVE-457 : Hive MR job fails with NullPointerException if we execute cleardanglingscratchdir
24c17f8 2019-03-06 MAPR-HIVE-465 : Investigate error logs in Hive Metastore after implementing Thrift v0.12.0
e58ca04 2019-03-06 MAPR-HIVE-464 : Backup hive-env.sh during backup files process
b0d4b45 2019-03-04 MAPR-HIVE-432 : CLONE - CVE-2018-1320 vulnerability in Apache Thrift
a19f0f9 2019-02-19 MAPR-HIVE-393 : Implement id / key projection pushdown for Hive - MapR-DB JSON Integration
250c986 2019-02-20 MAPR-HIVE-434 : 'Australia/Sydney' timezone conversion issues
cd2c5d5 2019-02-14 MAPR-HIVE-448 : Backup Hive configuration files only if they were changed
0449328 2019-02-14 MAPR-HIVE-449 : absence of webhcat pid file under $MAPR_PID_DIR
dfea394 2019-01-31 MAPR-HIVE-431 : Remove "hive-log4j.properties" file from $HIVE_CONF directory
0e5d4c5 2019-01-29 MAPR-HIVE-442 : SC2145: Argument mixes string and array. Use * or separate argument
4e63a15 2019-02-07 MAPR-HIVE-307 : Add -SNAPSHOT to Hive version

This release also includes the following backported issues. For complete details, refer to the commit log for this project in GitHub.

Commit Date (YYYY-MM-DD) Comment
636a671 2019-04-18 HIVE-19018: beeline -e now requires semicolon even when used with query from command line
4cdb1ad 2019-03-26 HIVE-16958: Setting hive.merge.sparkfiles=true will return an error when generating parquet databases
12b1b39 2019-03-26 HIVE-20126: OrcInputFormat does not pass conf to orc reader options
d66b657 2019-03-26 HIVE-20091: Tez: Add security credentials for FileSinkOperator output