The FairScheduler is a pluggable scheduler for Hadoop that allows YARN applications to share resources in a large cluster fairly.
What is Fair Scheduling?
Fair scheduling is a method of assigning resources to applications such that all applications get, on average, an equal share of resources over time. Hadoop 2.x is capable of scheduling multiple resource types.
By default, the Fair Scheduler bases scheduling fairness decisions only on memory. It can be configured to schedule resources based on memory and CPU. When only one application is running, that application uses the entire cluster. When other applications are submitted, resources that free up are assigned to the new applications, so that each application eventually gets approximately the same amount of resources. Unlike the default Hadoop scheduler, which forms a queue of applications, this lets short applications finish in reasonable time while not starving long-lived applications. It is also a reasonable way to share a cluster between a number of users. Finally, fair sharing also uses priorities applied as weights to determine the fraction of total resources that each application should get.
The scheduler organizes applications further into queues, and shares resources fairly between these queues. By default, all users share a single queue, named default. If an application specifically lists a queue in a container resource request, the request is submitted to that queue. You can also assign queues based on the user name included with the request through configuration. Within each queue, a scheduling policy is used to share resources between the running applications. The default is memory-based fair sharing, but FIFO and multi-resource with Dominant Resource Fairness can also be configured.
Queues can be arranged in a hierarchy to divide resources, and they can be configured with weights to share the cluster in specific proportions. The Fair Scheduler uses a concept called a queue path to configure a hierarchy of queues. The queue path is the full path of the queue's hierarchy, starting at root. The following example has three top-level queues a, b, and c, and some sub-queues under a and b:
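A hierarchy matching this description could be expressed in the allocation file as in the following sketch (the sub-queue names a1, a2, and b1 are illustrative placeholders):

```xml
<?xml version="1.0"?>
<allocations>
  <!-- Top-level queues are implicitly children of root -->
  <queue name="a">
    <queue name="a1"/>
    <queue name="a2"/>
  </queue>
  <queue name="b">
    <queue name="b1"/>
  </queue>
  <queue name="c"/>
</allocations>
```

With this layout, the resulting queue paths are root.a, root.a.a1, root.a.a2, root.b, root.b.b1, and root.c.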
In addition to providing fair sharing, the Fair Scheduler allows assigning guaranteed minimum shares to queues, which is useful for ensuring that certain users, groups or production applications always get sufficient resources. When a queue contains apps, it gets at least its minimum share, but when the queue does not need its full guaranteed share, the excess is split between other running apps. This lets the scheduler guarantee capacity for queues while utilizing resources efficiently when these queues do not contain applications.
Configuring the Fair Scheduler
The Fair Scheduler lets all applications run by default, but you can also limit the number of running applications per user and per queue through the configuration file. This can be useful when a user must submit hundreds of applications at once, or in general to improve performance if running too many applications at once would cause too much intermediate data to be created or too much context-switching. Limiting the applications does not cause any subsequently submitted applications to fail; it only causes them to wait in the scheduler's queue until earlier applications finish.
To customize the Fair Scheduler, set configuration properties in yarn-site.xml and update the allocation file to list existing queues and their respective weights and capacities. The allocation file is automatically created during MapR installation. The default location for this file is:
The file is reloaded every 10 seconds to refresh the scheduler with any modified settings that are specified in the file.
Specifying Fair Scheduler Configuration Properties in yarn-site.xml

The yarn-site.xml file contains parameters that determine scheduler-wide options. These properties include:
| Property | Default | Description |
|---|---|---|
| yarn.scheduler.fair.allocation.file | | Specifies the path to the allocation file. If a relative path is given, the file is searched for on the classpath. |
| yarn.scheduler.fair.user-as-default-queue | true | Indicates whether to use the username associated with the application as the default queue name, if a queue name is not specified. Note: If a queue placement policy is given in the allocations file, this property is ignored. |
| yarn.scheduler.fair.preemption | false | Indicates whether to use preemption. |
| yarn.scheduler.fair.sizebasedweight | false | Indicates whether to assign shares to individual applications based on their size, rather than providing an equal share to all applications regardless of size. When set to true, applications are weighted by ln(1 + <application's total requested memory>) / ln(2). |
| yarn.scheduler.fair.assignmultiple | false | Indicates whether to allow multiple container assignments in one heartbeat. |
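As a sketch, enabling preemption and disabling user-as-default-queue could look like this in yarn-site.xml (the scheduler class setting assumes the Fair Scheduler is not already the configured scheduler; the chosen values are illustrative):

```xml
<configuration>
  <!-- Select the Fair Scheduler as the ResourceManager scheduler -->
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <!-- Allow the scheduler to preempt containers from over-share queues -->
  <property>
    <name>yarn.scheduler.fair.preemption</name>
    <value>true</value>
  </property>
  <!-- Do not route applications to per-user queues by default -->
  <property>
    <name>yarn.scheduler.fair.user-as-default-queue</name>
    <value>false</value>
  </property>
</configuration>
```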
Assigning Weights in an Allocation File
An allocation file is an XML manifest that describes queues and their properties, as well as certain policy defaults. The format contains these types of elements:
- Queue elements
- User elements
- A userMaxAppsDefault element
- A queueMaxAppsDefault element
- A fairSharePreemptionTimeout element
- A defaultQueueSchedulingPolicy element
- A queuePlacementPolicy element
Queue elements can contain the following properties:
- minResources: Minimum resources the queue is entitled to, in the form "X mb, Y vcores". If a queue's minimum share is not satisfied, it is offered available resources before any other queue under the same parent. Under the single-resource fairness policy, a queue is considered unsatisfied if its memory usage is below its minimum memory share. Under dominant resource fairness, a queue is considered unsatisfied if its usage of its dominant resource, with respect to the cluster capacity, is below its minimum share for that resource. If multiple queues are unsatisfied in this situation, resources go to the queue with the smallest ratio between relevant resource usage and minimum. Note that a queue that is below its minimum may not immediately get up to its minimum when it submits an application, because jobs that are already running may be using those resources.
- maxResources: Maximum resources the queue is allowed, in the form "X mb, Y vcores".
- maxRunningApps: The limit for the number of applications that can run at once for the queue and any of its child queues. The queueMaxAppsDefault value is used for any queue that does not set a value for this property.
- weight: Applies a weight to share the cluster non-proportionally with other queues. The default is 1. A queue with weight 2 should receive approximately twice as many resources as a queue with the default weight.
- schedulingPolicy: Sets the scheduling policy of the queue. Allowed values are fifo, fair, and drf. The default is fair. If set to fifo, applications with earlier submit times are given preference for containers, although applications submitted later may still run concurrently if there is unused capacity in the cluster.
- aclSubmitApps: A list of users and/or groups that can submit applications to the queue. Refer to the ACLs section below for more information on the format of this list and how queue ACLs work.
- aclAdministerApps: A list of users and/or groups that can administer the queue. Currently, the only administrative action is killing an application. Refer to the ACLs section below for more information on the format of this list and how queue ACLs work.
- minSharePreemptionTimeout: The number of seconds the queue is under its minimum share before it tries to preempt containers to take resources from other queues.
User elements represent settings that govern the behavior of individual users. They can contain a single property: maxRunningApps, which limits the number of running applications for a particular user.

The userMaxAppsDefault element sets the default running application limit for any users whose limit is not otherwise specified.
The queueMaxAppsDefault element sets the default running application limit for any queue for which maxRunningApps is not set. If you set queueMaxAppsDefault and do not set a value for maxRunningApps for the root queue, the value of queueMaxAppsDefault sets the application limit for all queues under the root queue hierarchy.
The fairSharePreemptionTimeout element sets the number of seconds a queue is under its fair share before it tries to preempt containers to take resources from other queues.
The defaultQueueSchedulingPolicy element sets the default scheduling policy for queues. It is overridden by the schedulingPolicy element in each queue where that element is specified. The default is fair.
The queuePlacementPolicy element contains a list of rule elements that tell the scheduler how to place incoming applications into queues. Rules are applied in the order that they are listed. Rules can take arguments. All rules accept the create argument, which indicates whether the rule can create a new queue; it defaults to true. If create is set to false and the rule would place the application in a queue that is not configured in the allocation file, the scheduler continues on to the next rule. The last rule must be one that can never issue a continue. Valid rules are:

- specified: The application is placed into the queue it requested. If the application did not request a queue (that is, it specified default), continue to the next rule.
- user: The application is placed into a queue with the name of the user who submitted it.
- primaryGroup: The application is placed into a queue with the name of the primary group of the user who submitted it.
- secondaryGroupExistingQueue: The application is placed into a queue with a name that matches a secondary group of the user who submitted it. The first secondary group that matches a configured queue is selected.
- default: The application is placed into the queue named default.
- reject: The application is rejected.
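As an illustrative sketch, a placement policy combining several of these rules might look like this inside the allocation file:

```xml
<queuePlacementPolicy>
  <!-- Use the queue the application asked for, but do not create new queues -->
  <rule name="specified" create="false"/>
  <!-- Otherwise, try a per-user queue, creating it if necessary -->
  <rule name="user" create="true"/>
  <!-- Fall back to the default queue; this rule can never issue a continue -->
  <rule name="default"/>
</queuePlacementPolicy>
```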
Example Allocation File
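A minimal allocation file combining the elements described above might look like the following sketch (the queue name, user name, and resource values are illustrative placeholders):

```xml
<?xml version="1.0"?>
<allocations>
  <queue name="sample_queue">
    <minResources>10000 mb, 1 vcores</minResources>
    <maxResources>90000 mb, 10 vcores</maxResources>
    <maxRunningApps>50</maxRunningApps>
    <weight>2.0</weight>
    <schedulingPolicy>fair</schedulingPolicy>
  </queue>
  <user name="sample_user">
    <maxRunningApps>30</maxRunningApps>
  </user>
  <userMaxAppsDefault>5</userMaxAppsDefault>
  <queueMaxAppsDefault>40</queueMaxAppsDefault>
  <fairSharePreemptionTimeout>30</fairSharePreemptionTimeout>
  <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
</allocations>
```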
For backward compatibility with the original FairScheduler, queue elements can instead be named pool elements.
Queue Access Control Lists
Queue Access Control Lists (ACLs) allow administrators to control who may take actions on particular queues. They are configured with the aclSubmitApps and aclAdministerApps properties, which can be set per queue. Currently, the only supported administrative action is killing an application. Anyone who has permission to administer a queue may also submit applications to it. These properties take values in the format user1,user2 group1,group2 or, to list only groups, a leading space followed by group1,group2. An action on a queue is permitted if its user or group is in the ACL of that queue or in the ACL of any of that queue's ancestors. So if queue2 is inside queue1, user1 is in queue1's ACL, and user2 is in queue2's ACL, then both users may submit to queue2.
The root queue's ACLs are "*" by default which, because ACLs are passed down, means that everyone may submit to and kill applications from every queue. To restrict access, change the root queue's ACLs to something other than "*".
By default, the yarn.admin.acl property in yarn-site.xml is also set to "*", which means any user can be the administrator. If queue ACLs are enabled, you also need to set the yarn.admin.acl property to the correct admin user for the YARN cluster. For example:
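A sketch of such a setting in yarn-site.xml, assuming the cluster administrator account is named mapr (an illustrative placeholder):

```xml
<!-- Restrict YARN administration to the mapr user -->
<property>
  <name>yarn.admin.acl</name>
  <value>mapr</value>
</property>
```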
If you do not set this property correctly, users will be able to kill YARN jobs even when they do not have access to the queues for those jobs.