Placing Jobs

The placement of jobs on nodes or groups of nodes is controlled by labels applied to the queues to which jobs are submitted, or to the jobs themselves. A queue or job label is an expression that uses the logical operators OR, AND, and NOT to select nodes described in the node labels file:

  • A queue label specifies the node or nodes that will run all jobs submitted to a queue.
  • A job label specifies the node or nodes that will run a particular job.

Job and queue labels refer to nodes by the labels defined in the node labels file, combined with the operators || (OR) and && (AND).

Examples:

  • "Production Machines" || good — selects nodes that are in either the "Production Machines" group or the good group.
  • 'Development Machines' && good — selects nodes only if they are in both the 'Development Machines' group and the good group.
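The two examples above behave like set operations on groups of nodes: || takes the union and && takes the intersection. The following is a minimal sketch of that semantics only, not the scheduler's actual implementation. The host names (node1 to node4), the NODE_LABELS mapping, and the nodes_with helper are hypothetical and chosen purely for illustration.

```python
# Hypothetical node-to-label mapping; host names are illustrative only
# and do not come from any real node labels file.
NODE_LABELS = {
    "node1": {"Production Machines", "good"},
    "node2": {"Production Machines"},
    "node3": {"Development Machines"},
    "node4": {"good"},
}


def nodes_with(label):
    """Return the set of nodes that carry the given label."""
    return {node for node, labels in NODE_LABELS.items() if label in labels}


# "Production Machines" || good: union of the two groups
either = nodes_with("Production Machines") | nodes_with("good")

# 'Development Machines' && good: intersection of the two groups
both = nodes_with("Development Machines") & nodes_with("good")

print(sorted(either))
print(sorted(both))
```

With this hypothetical mapping, the OR expression selects every node in either group, while the AND expression selects nothing because no node carries both labels.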

If a job is submitted with a label that does not match any nodes, the job remains in the PREP state until nodes that meet the criteria exist (or until the job is killed). For example, in the node labels file above, no nodes have been assigned to both the 'Development Machines' group and the good group, so a job submitted with the label 'Development Machines' && good cannot execute until at least one node belongs to both groups. If the node labels file is edited so that the two groups have nodes in common, the job executes as soon as the JobTracker becomes aware of the change: either after the mapreduce.jobtracker.node.labels.monitor.interval elapses or when you execute the hadoop job -refreshlabels command.
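The waiting behavior can be pictured as re-evaluating the label against the current node labels each time they are refreshed. The sketch below is an illustration under assumptions, not the JobTracker's logic: the node names, the node_labels dictionary, and the eligible_nodes helper are hypothetical, and the label is reduced to a plain set of required group names.

```python
def eligible_nodes(node_labels, required):
    """Return the nodes that carry every label in `required` (the && case)."""
    return {node for node, labels in node_labels.items() if required <= labels}


# Hypothetical contents of a node labels file: no node is in both groups.
node_labels = {
    "node3": {"Development Machines"},
    "node4": {"good"},
}
required = {"Development Machines", "good"}

# No node matches, so a job with this label would stay in the PREP state.
before = eligible_nodes(node_labels, required)

# Simulate editing the node labels file so node3 also joins the good group,
# followed by a refresh of the labels.
node_labels["node3"].add("good")
after = eligible_nodes(node_labels, required)

print(sorted(before))
print(sorted(after))
```

Once the edited labels are picked up, the set of matching nodes is no longer empty, which corresponds to the point at which the waiting job can execute.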