Set Up Compression with HBase

Using compression with HBase reduces the number of bytes transmitted over the network and stored on disk. These benefits often outweigh the performance cost of compressing the data on every write and decompressing it on every read.

GZip Compression

GZip compression is included with most Linux distributions and works natively with HBase. To use GZip compression, specify it in the per-column-family COMPRESSION attribute when creating a table in the HBase shell. For example:
 create 'mytable', {NAME=>'colfam', COMPRESSION=>'GZ'}
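
Compression can also be changed on a table that already exists. The commands below are a minimal sketch, assuming a table named 'mytable' with a column family named 'colfam' is already present; some older HBase releases may require the table to be disabled before it is altered. Existing store files are only rewritten with the new codec when they are compacted, so a major compaction is requested afterward.
 # Switch the column family to GZip (table and family names are assumed)
 alter 'mytable', {NAME=>'colfam', COMPRESSION=>'GZ'}
 # Ask HBase to rewrite existing store files using the new setting
 major_compact 'mytable'
 # Show the table schema, including the COMPRESSION attribute of each family
 describe 'mytable'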

LZO Compression

Lempel-Ziv-Oberhumer (LZO) is a lossless data compression algorithm, included in most Linux distributions, that favors decompression speed over compression ratio.
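
As an illustration, LZO is selected with the same per-column-family attribute used for other codecs. This sketch assumes the LZO native libraries are installed and visible to HBase on every node; the table name 'mytable_lzo' and family name 'colfam' are hypothetical.
 # Create a table whose column family is compressed with LZO
 create 'mytable_lzo', {NAME=>'colfam', COMPRESSION=>'LZO'}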

Snappy Compression

The Snappy compression algorithm is optimized for speed rather than compression ratio. Snappy compression is included in the core MapR installation, and no additional configuration is required.
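
For example, a table that uses Snappy might be created as follows. This is a sketch only; the table name 'mytable_snappy' and family name 'colfam' are hypothetical.
 # Create a table whose column family is compressed with Snappy
 create 'mytable_snappy', {NAME=>'colfam', COMPRESSION=>'SNAPPY'}
 # Confirm the setting by inspecting the table schema
 describe 'mytable_snappy'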