Working with Impala

After you start Impala, use impala-shell or a JDBC or ODBC client to query data. You can query data stored in files, as well as data stored in MapR Database tables. Impala depends on the Hive metastore to track table metadata. The MapR file system tracks the metadata of other files. Impala can create, load, and query tables in the Text and Parquet file formats. To query data in the SequenceFile, RCFile, or Avro file formats, use Hive to load the data. Impala supports the Snappy, GZIP, Deflate, and BZIP2 compression codecs.
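
As a minimal sketch of running a query through impala-shell from a script, the following Python helper builds the command line. The table name web_logs and the default address localhost:21000 are placeholders assumed for illustration; they do not come from this page. The helper only assembles the arguments and does not contact a cluster.

```python
import shlex


def impala_shell_command(query, impalad="localhost:21000"):
    """Build an argv list that runs one query through impala-shell.

    -i names the impalad to connect to and -q passes the query text.
    The default address is a placeholder; substitute your own node.
    """
    return ["impala-shell", "-i", impalad, "-q", query]


# web_logs is a hypothetical table name used only for this example.
cmd = impala_shell_command("SELECT COUNT(*) FROM web_logs")

# Print a copy-pasteable, shell-quoted form of the command.
print(shlex.join(cmd))
```

To execute the command rather than print it, the list can be passed to subprocess.run on a node where impala-shell is installed.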

Impala File Formats

The following table summarizes the file formats that Impala supports:

| File Type | Format | Compression Codecs | Can Impala Create? | Can Impala INSERT? |
|---|---|---|---|---|
| Text | Unstructured | Snappy, GZIP, BZIP2 | Yes. For CREATE TABLE with no STORED AS clause, the default file format is uncompressed text with values separated by ASCII 0x01 characters, typically represented as Ctrl-A. | Yes. CREATE TABLE, INSERT, and query. |
| SequenceFile | Structured | Snappy, GZIP, Deflate, BZIP2 | Yes | No. Query only. Load data using Hive. |
| RCFile | Structured | Snappy, GZIP, Deflate, BZIP2 | Yes | No. Query only. Load data using Hive. |
| Parquet | Structured | Snappy (default), GZIP | Yes | Yes. CREATE TABLE, INSERT, and query. |
| Avro | Structured | Snappy, GZIP, Deflate, BZIP2 | No. Create using Hive. | No. Query only. Load data using Hive. |
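
The create and insert rules in the table can be sketched as a small Python lookup that decides which engine should run each statement. This is an illustration only: the function name plan, the table and column names, and the generated statements are assumptions, and the final INVALIDATE METADATA step reflects the general Impala requirement to reload metadata after a table is changed through Hive rather than anything stated in the table.

```python
# Capability matrix transcribed from the table:
# STORED AS keyword -> (Impala can create, Impala can INSERT)
CAPABILITIES = {
    "TEXTFILE": (True, True),
    "SEQUENCEFILE": (True, False),
    "RCFILE": (True, False),
    "PARQUET": (True, True),
    "AVRO": (False, False),
}


def plan(table, columns, fmt, source):
    """Return (engine, statement) steps to create and load a table.

    Hypothetical helper: it only builds strings and runs nothing.
    """
    fmt = fmt.upper()
    can_create, can_insert = CAPABILITIES[fmt]
    steps = [
        ("impala" if can_create else "hive",
         f"CREATE TABLE {table} ({columns}) STORED AS {fmt}"),
        ("impala" if can_insert else "hive",
         f"INSERT INTO TABLE {table} SELECT * FROM {source}"),
    ]
    # If Hive ran any step, Impala must reload the table metadata
    # before it can see the change.
    if any(engine == "hive" for engine, _ in steps):
        steps.append(("impala", f"INVALIDATE METADATA {table}"))
    return steps


# logs_seq and logs_text are hypothetical table names.
for engine, statement in plan("logs_seq", "id INT, msg STRING",
                              "sequencefile", "logs_text"):
    print(f"{engine}: {statement}")
```

For a Parquet table the plan stays entirely in Impala, while for Avro both the CREATE TABLE and the load are routed to Hive.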