MapReduce is one type of application that can run on the Hadoop 2.x framework. MapReduce configuration options are stored in the /opt/mapr/hadoop/hadoop-2.x.x/etc/hadoop/mapred-site.xml file and are editable by the root user. This file contains configuration information that overrides the default values for MapReduce parameters. Overrides of the default values for core configuration properties are stored in the MapR Parameters file.

To override a default value for a property, specify the new value within the <configuration> tags, using the following format:

 <property>
   <name> </name>
   <value> </value>
   <description> </description>
 </property>
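
As an illustration of this format, the fragment below sketches a single override. It assumes the standard Hadoop 2.x property mapreduce.map.memory.mb and uses the value 1024 purely as an example; the description text is optional.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Illustrative override only: the property and value are example choices. -->
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1024</value>
    <description>Larger resource limit for maps.</description>
  </property>
</configuration>
```

Each additional override would be another property element inside the same configuration element.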

Configurations for MapReduce Applications

Parameter Value Description
mapreduce.framework.name yarn Execution framework set to Hadoop YARN.
mapreduce.map.memory.mb 1024 Larger resource limit for maps.
mapreduce.map.java.opts -Xmx1024M Larger heap size for child JVMs of maps.
mapreduce.reduce.memory.mb 3072 Larger resource limit for reduces.
mapreduce.reduce.java.opts -Xmx2560M Larger heap size for child JVMs of reduces.
mapreduce.task.io.sort.mb 512 Higher memory limit while sorting data for efficiency.
mapreduce.task.io.sort.factor 100 More streams merged at once while sorting files.
mapreduce.reduce.shuffle.parallelcopies 50 Higher number of parallel copies run by reduces to fetch outputs from a very large number of maps.
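
The settings above could be written in mapred-site.xml roughly as follows. This is a sketch rather than a recommended configuration: the property names are the standard Hadoop 2.x names for these settings, and the values are simply the ones listed in the table, which may not suit every cluster.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Run MapReduce jobs on YARN. -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

  <!-- Container size and JVM heap for map tasks; the heap fits inside the container. -->
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1024M</value>
  </property>

  <!-- Container size and JVM heap for reduce tasks. -->
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>3072</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx2560M</value>
  </property>

  <!-- Sort buffer size (MB) and number of streams merged at once. -->
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>512</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>100</value>
  </property>

  <!-- Parallel fetches each reduce uses to copy map outputs. -->
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>50</value>
  </property>
</configuration>
```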

Configurations for MapReduce JobHistory Server

Parameter Value Description
mapreduce.jobhistory.address MapReduce JobHistory Server host:port Default port is 10020.
mapreduce.jobhistory.webapp.address MapReduce JobHistory Server Web UI host:port Default port is 19888.
mapreduce.jobhistory.intermediate-done-dir /mr-history/tmp Directory where history files are written by MapReduce applications.
mapreduce.jobhistory.done-dir /mr-history/done Directory where history files are managed by the MapReduce JobHistory Server.
mapreduce.jobhistory.webapp.https.address Secure MapReduce JobHistory Server Web UI host:port (HTTPS) Default port is 19890.

Sample Hadoop 2.x mapred-site.xml File

The following mapred-site.xml file defines values for two job history parameters.
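
A minimal sketch of such a file is shown here. The host name historyserver.example.com is a hypothetical placeholder, not a value from this documentation; the ports are the defaults listed in the JobHistory Server table.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- historyserver.example.com is a hypothetical host; substitute the real JobHistory Server host. -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>historyserver.example.com:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>historyserver.example.com:19888</value>
  </property>
</configuration>
```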