MapR 5.0 Documentation : Label-based Scheduling for YARN Applications

Label-based scheduling provides job placement control on a multi-tenant hadoop cluster. Using label-based scheduling, an administrator can control exactly which nodes are chosen to run jobs submitted by different users and groups. This is useful for data locality and multi-tenancy use cases.

To use label-based scheduling, an administrator assigns node labels in a text file, then composes queue labels or job labels based on the node labels. When you run jobs, you can place them on specified nodes on a per-job basis (using a job label) or at the queue level (using a queue label).

This feature is used in conjunction with schedulers, such as the Fair Scheduler or the Capacity Scheduler.

Label-based scheduling is explained in the following sections:


Sample Cluster Configuration

To illustrate the concept of label-based scheduling, consider a cluster with two racks: RackA and RackB. The nodes in RackA are dedicated to the Production group, and the nodes in RackB are dedicated to the Development group. In addition, some nodes are configured with a fast CPU, one node is configured with high memory, and one node is configured with both a fast CPU and high memory. The following diagram illustrates the cluster configuration:

Creating a Node Labels File

The node labels file is a text file stored in MapR-FS that maps labels to nodes. A node label applies a name to a cluster node, to identify it for the purpose of specifying where to run MapReduce jobs (MRv1) or MapReduce YARN applications (MRv2).

Syntax for Node Labels File

Each line in the node labels file consists of an identifier (a regular expression that identifies one or more nodes), whitespace to separate the identifier from the labels, then one or more labels (separated by commas, whitespace, or both) to apply to the specified nodes. If a label contains two or more words (such as "High Memory"), enclose the name in single or double quotation marks so the whitespace will not be interpreted as a delimiter between two labels.

<identifier> <label1>[,<label2>,...,<labeln>]

Labels must not start with a digit.

The identifier specifies nodes by matching the node names or IP addresses in one of two ways:

The identifier must match the fully qualified domain name (FQDN). To determine the FQDN, run the command hostname --fqdn.

Sample Node Labels File

The following node labels file is written for the sample cluster configuration and uses glob identifiers. Note that the FQDNs include the realm company.com.

perfnode20*.company.com RackA, Production
perfnode200.company.com Fast
perfnode201.company.com 'High Memory'
perfnode202.company.com Fast, 'High Memory'
perfnode204.company.com Fast

devnode10*.company.com RackB, Development
devnode100.company.com Fast

Using Node Labels to Schedule YARN Applications

To set up node labels for the purpose of scheduling YARN applications (including MapReduce applications) on a specific node or group of nodes:

  1. Create a text file and specify the labels you want to use for the nodes in your cluster. In this example, the file is named node.labels.

  2. Copy the file to a location on MapR-FS where it will not be modified or deleted, such as /var/mapr.

    hadoop fs -put ~/node.labels /var/mapr
  3. Edit yarn-site.xml on all ResourceManager nodes and set the node.labels.file parameter and the optional node.labels.monitor.interval parameter as shown:

    <property>
       <name>node.labels.file</name>
       <value>/var/mapr/node.labels</value>
       <description>The path to the node labels file.</description>
    </property>
    
    <property>
       <name>node.labels.monitor.interval</name>
       <value>120000</value>
       <description>Interval for checking the labels file for updates (default is 120000 ms)</description>
    </property>
  4. Restart ResourceManager to set up the labels from the node labels file for the first time. For subsequent changes to take effect, issue either of the following commands to manually tell the ResourceManager to reload the node labels file:

    1. For any YARN applications, including MapReduce jobs, enter yarn rmadmin -refreshLabels

    2. For MapReduce jobs, enter mapred job -refreshLabels

  5. Verify that labels are implemented correctly by running either of the following commands:

    yarn rmadmin -showLabels
    mapred job -showLabels 

The following flowchart summarizes these steps. In addition, the flowchart introduces the concept of queue labels for the Fair Scheduler and the Capacity Scheduler.

 

Creating Queue Labels

Queue labels are optional with label-based scheduling. You can use queue labels to determine which nodes an application or job can run on (subject to the queue label policy). A queue label is created from node labels as explained below.

Queue Label Expressions

You can create queue labels from node labels by using a label expression. A label expression for a queue (such as RackA && RackB) determines on which nodes an application submitted to that queue may run (subject to the queue label policy).

A queue label expression is a logical combination of node labels that can include these operators:

  • && (AND)

  • || (OR)

  • ! (NOT)

Defining Queue Labels for Fair Scheduler

By default, all users share a single queue, named default

To customize the Fair Scheduler, create an allocation file that lists existing queues and their respective weights and capacities, as explained in Hadoop 2.x Fair Scheduler. In the allocation file, add the following property within the queue section:

<label>labelname</label>

For example:

<queue name="Customer Data Analysis">
   <weight>2.0</weight>
   <label>Fast</label>
</queue>

For a hierarchical queue, note that labels and label policies can be defined on any level of the queue. If a child queue does not have its own labels or label policies, the closest level’s labels and label policies are used.

Defining Queue Labels for Capacity Scheduler

The Capacity Scheduler has a pre-defined queue called root. All queues in the system are children of the root queue. Further queues can be set up by configuring yarn.scheduler.capacity.root.queues with a list of comma-separated child queues.

If a parent queue has child queues, this is how labels and label policies are applied to the child queues:

  • If the child queues have their own labels or label policies, they are used.

  • If the child queues do not have their own labels or label policies, the parent queue’s labels and label policies apply.

To use label-based scheduling with the Capacity Scheduler, add the following property to the capacity-scheduler.xml file:

yarn.scheduler.capacity.root.<queue-name>.label 

For example, for a queue named alpha, the label could be defined like this:

<property>
   <name>yarn.scheduler.capacity.root.alpha.label</name> 
   <value>Fast||Development</value> 
</property>

When you make changes to queue labels or queue policies, remember to refresh them by running the following command:

yarn rmadmin -refreshQueues

For more information, see Hadoop 2.x Capacity Scheduler.

Defining a Queue Label Policy

A queue label policy determines whether an application label or a queue label prevails when there is a conflict between the two. For YARN applications, you can set the following queue label policies:

  • PREFER_QUEUE — always use label set on queue

  • PREFER_APP — always use label set on application

  • AND (default) — application label AND queue label

  • OR — application label OR queue label

See the following sections for directions on setting the queue label policy for Fair Scheduler and Capacity Scheduler.

Setting Queue Label Policies for Fair Scheduler

To set a queue label policy for a Fair Scheduler queue, specify the label policy in the corresponding queue section of the allocation file, as shown here:

 

<queue name="CustomerDataAnalysis">
      <weight>2.0</weight>
      <labelPolicy>OR</labelPolicy>
      <label>Fast</label>
</queue>

Setting Queue Label Policies for Capacity Scheduler

To set a queue label policy for a capacity queue, add the following property to the capacity-scheduler.xml file:

yarn.scheduler.capacity.root.<queue-name>.label-policy

For example:

<property>
  <name>yarn.scheduler.capacity.root.alpha.label-policy</name>
  <value>PREFER_APP</value>
</property>

 

Examples of Queue Policy Behavior

The following examples show the job placement policy behavior in various scenarios, based on the sample node labels file.

Application Label

Queue Label

Queue Policy

Outcome

Fast

High Memory

PREFER_APP

The job runs on nodes labeled Fast (hostnames match perfnode200, perfnode202, perfnode204, or devnode100)

Fast

High Memory

PREFER_QUEUE

The job runs on nodes labeled High Memory (hostnames match perfnode201 or perfnode202)

Fast

High Memory

AND

The job runs on nodes only if they are labeled both Fast and High Memory (hostname matches perfnode202)

Fast

High Memory

OR

The job runs on nodes if they are labeled either Fast or High Memory (hostnames match perfnode200, perfnode201, perfnode202, perfnode204, or devnode100)

Creating Job Labels

To place an individual job on a particular node, use a job label. You can apply job labels in three ways:

  • Use set() from the Hadoop configuration API in your Java application. For example:

    conf.set("mapreduce.job.label","Production");
  • At the command line, pass the label in -Dmapreduce.job.label when you run the job with the hadoop jar command. For example:

    hadoop jar /opt/mapr/hadoop/hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1-mapr-4.0.1-20140804.191359-4.jar teragen 
    -Dmapreduce.job.label=Production 100000000 /teragen
  • Set the mapreduce.job.label parameter in mapred-site.xml. For example, to configure an application label expression for a MapReduce job that can run on any Development node or Fast node (including a Fast node from the Production rack), set the mapreduce.job.label parameter as shown:

    <property>
       <name>mapreduce.job.label</name>
       <value>Development || Fast</value>
       <description>Label expression for MapReduce job</description>
    </property>

    If an application is submitted with a label that does not correspond to any nodes, the application will run as though no label had been specified. Consult the ResourceManager log for more information (look for ‘invalid label’ error messages).

Attachments: