Running MapReduce Jobs and YARN Applications

If you upgraded from 4.x or later, the cluster is ready to run MapReduce jobs and YARN applications. If you upgraded from 3.x, you need to prepare the cluster to run MapReduce v1 jobs and YARN applications.

Running MapReduce v1 Jobs

The MapR cluster includes the Hadoop 2.x architecture and starts up with MapReduce v2 (YARN) as the default operating mode. Before you run MapReduce v1 jobs on the cluster, you may need to recompile them because of API changes, and you may want to change the default MapReduce operating mode. MapReduce v1 jobs will not run unless you either change the default MapReduce mode, as described in this procedure, or submit them with the appropriate command. For more information about the MapReduce mode and how it affects job submission, see Managing the MapReduce Mode.

Running YARN Applications

YARN services are required to run YARN applications, such as MapReduce v2 and other applications that can run on YARN. To run YARN applications on the cluster, perform the YARN-related steps in this procedure.

  1. Determine whether you need to recompile MapReduce v1 jobs: compare the classes and methods that changed with the classes and methods used in your application. If you find a match, recompile your application.
    When an application has been compiled against MapReduce v1 or MapReduce v2 (YARN), it can be run in either mode.
  2. Prepare the cluster for the appropriate task.
    • Run only MapReduce v1 jobs: Change the default MapReduce mode to classic.

      maprcli cluster mapreduce set -mode classic

    • Run YARN applications: Add and configure YARN roles, such as ResourceManager, NodeManager, and HistoryServer, on cluster nodes.
      If you want YARN to be the default MapReduce mode, check the default mode currently in effect and, if it is not YARN, change it to YARN.

      maprcli cluster mapreduce get
      maprcli cluster mapreduce set -mode yarn
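
The check-then-set sequence in step 2 could be wrapped in a small shell function, sketched below. The function name ensure_mapreduce_mode is hypothetical and not part of maprcli. The sketch also assumes that the output of maprcli cluster mapreduce get contains the mode name (yarn or classic) as a separate word. The exact output layout is not described here, so verify it on your cluster before relying on the match.

```shell
#!/bin/sh
# Sketch: change the default MapReduce mode only if it differs from the
# desired one. ensure_mapreduce_mode is a hypothetical helper, not a
# maprcli feature.
#
# Assumption: the text printed by `maprcli cluster mapreduce get` contains
# the current mode name ("yarn" or "classic") as a whole word.

ensure_mapreduce_mode() {
    desired="$1"

    # Accept only the two modes named in this procedure.
    case "$desired" in
        yarn|classic) ;;
        *)
            echo "usage: ensure_mapreduce_mode yarn|classic" >&2
            return 2
            ;;
    esac

    # Query the default mode currently in effect; stop if the query fails.
    current="$(maprcli cluster mapreduce get)" || return 1

    # Case-insensitive whole-word match on the desired mode name.
    if printf '%s\n' "$current" | grep -qiw "$desired"; then
        echo "Default MapReduce mode is already $desired; nothing to do."
    else
        maprcli cluster mapreduce set -mode "$desired"
    fi
}
```

After sourcing the file on a node where maprcli is available, calling ensure_mapreduce_mode yarn makes YARN the default, and ensure_mapreduce_mode classic prepares the cluster to run only MapReduce v1 jobs. Because the function queries the mode first, it can be run repeatedly without issuing a redundant set command.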