Impala Limitations

What Impala Does Not Provide

Impala does not replace Hive or other frameworks built on MapReduce for long-running batch-oriented queries.

Impala is not suitable as a query layer for operational (OLTP) applications: it does not support updates or deletes, and it is not optimized for point lookups.

Known Limitations in Impala

Impala has the following known limitations:

  • Issuing the DROP TABLE statement on an Impala table that maps to an HBase or MapR-DB table does not remove the underlying table in HBase or MapR-DB. However, if Hive is integrated with HBase or MapR-DB, the Hive DROP TABLE statement removes the table.
  • The LOAD DATA statement does not work when the source directory and destination table are in different encryption zones.
  • The Impala configuration option --disk_spill_encryption, which protects sensitive data from being observed or tampered with while it is temporarily stored on disk, is not supported.
  • Redaction of sensitive data from Impala log files is not supported.
  • You cannot use the lineage information feature to track who has accessed data through Impala SQL statements.

Impala UDFs (user-defined functions) have the following known limitations:

  • Impala does not work with UDFs that accept or return composite types, nested types, or types not available in Impala tables.
  • UDFs must produce the same output each time the same argument value is passed.
  • UDFs cannot spawn other threads or processes.
  • In Impala versions earlier than 2.5.0, UDFs become undefined when you restart the catalog service, and you must reload them after the restart.
  • You currently cannot include user-defined table functions in Impala queries.
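
As a rough illustration of the determinism and no-threads rules above, the following C++ sketch shows the general shape of a function that satisfies them. It is not taken from the Impala UDF SDK: the IntVal struct is a simplified stand-in for a nullable integer value, and the function name AddOffset is hypothetical.

```cpp
#include <cstdint>

// Simplified stand-in for a nullable integer value. The real Impala UDF SDK
// defines its own value types, which are not reproduced here.
struct IntVal {
    bool is_null;
    std::int32_t val;

    // Convenience constructor for a NULL value.
    static IntVal null() { return IntVal{true, 0}; }
};

// Hypothetical UDF-style function. The result depends only on the arguments:
// it reads no clock, random source, or mutable global state, and it starts
// no threads or child processes.
IntVal AddOffset(const IntVal& value, const IntVal& offset) {
    if (value.is_null || offset.is_null) {
        return IntVal::null();  // a NULL argument yields a NULL result
    }
    return IntVal{false, value.val + offset.val};
}
```

Because the function keeps no state between calls, calling it twice with the same arguments returns the same value, which is the behavior the list above requires of a UDF.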