To use network bandwidth more efficiently, use compression over the wire. If you use application-level compression, turn off MapR-FS compression and reduce the chunk size to 128 MB and io.sort.mb to 190 MB.
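As a rough illustration of the io.sort.mb part of this advice, the fragment below assumes the value is set cluster-wide in mapred-site.xml; it can also be set per job. The chunk size and the MapR-FS compression setting are attributes of directories rather than job properties, so they are not shown here.

```xml
<!-- Sketch only: assumes the setting lives in mapred-site.xml. -->
<configuration>
  <property>
    <!-- Size, in MB, of the in-memory buffer used to sort map output. -->
    <name>io.sort.mb</name>
    <value>190</value>
  </property>
</configuration>
```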
Disk reads can be a significant load, because a MapReduce job performs many more reads than writes. To improve disk I/O, use MapR-FS compression on input and output volumes as well as on the volumes used for intermediate files. Use Hadoop sequence files for input and output: they avoid the overhead of converting to and from Java types, and they also enable compression.
To turn off MapR-FS compression for map outputs, set
To turn on LZO or any other compression, set
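A hedged sketch of what the two cases above might look like in mapred-site.xml follows. The property names are assumptions: mapreduce.maprfs.use.compression is taken to be the MapR switch for compressing map outputs in MapR-FS, mapred.compress.map.output and mapred.map.output.compression.codec are the Hadoop 1.x properties for application-level compression of map outputs, and the codec class assumes the separate hadoop-lzo library is installed on every node. Check the names against your release before relying on them.

```xml
<!-- Sketch only: property names and codec class are assumptions. -->
<configuration>
  <property>
    <!-- Assumed MapR property: stop MapR-FS from compressing map outputs. -->
    <name>mapreduce.maprfs.use.compression</name>
    <value>false</value>
  </property>
  <property>
    <!-- Hadoop 1.x property: compress map outputs at the application level. -->
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <property>
    <!-- Codec for map outputs; this class comes from the hadoop-lzo library. -->
    <name>mapred.map.output.compression.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
</configuration>
```

Only the first property would be needed to turn off MapR-FS compression for map outputs; the other two apply when an application-level codec such as LZO takes its place.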
For more details on selecting a compression algorithm, see Compression.