As of MapR 4.0.2, you can use zero configuration failover. With zero configuration failover, the ResourceManager role is installed on two or more nodes but the ResourceManager process only runs on one node in the cluster.
If the node running the ResourceManager process fails and the Warden on that node is unable to restart it, the Warden on each node and ZooKeeper work together to start a ResourceManager process on another node in the cluster. ResourceManager clients connect to ZooKeeper to determine which ResourceManager node is active. Therefore, when failover occurs, ResourceManager clients are not affected, because they automatically connect to the active ResourceManager. For more information, see Warden and Failover.
When you run the maprcli service list command, the state of the active ResourceManager process displays as 2 (running), but the other ResourceManagers display as 5 (standby).
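The state codes above lend themselves to a quick health check. The following is a minimal Python sketch, assuming the service list has already been parsed into a list of dictionaries; the `hostname` and `state` keys, the function names, and the sample host names are hypothetical and do not reproduce the literal maprcli output format.

```python
# State codes quoted in the text: 2 = running (active), 5 = standby.
RUNNING = 2
STANDBY = 5


def classify_resourcemanagers(records):
    """Split ResourceManager records into (active, standby) host lists.

    `records` is a hypothetical, already-parsed list of dicts with
    "hostname" and "state" keys; it is not the literal maprcli output.
    """
    active = [r["hostname"] for r in records if r["state"] == RUNNING]
    standby = [r["hostname"] for r in records if r["state"] == STANDBY]
    return active, standby


def failover_is_healthy(records):
    """True when exactly one ResourceManager is active and at least one is on standby."""
    active, standby = classify_resourcemanagers(records)
    return len(active) == 1 and len(standby) >= 1


if __name__ == "__main__":
    # Illustrative data only; host names are made up.
    sample = [
        {"hostname": "node-a", "state": 2},
        {"hostname": "node-b", "state": 5},
        {"hostname": "node-c", "state": 5},
    ]
    print(classify_resourcemanagers(sample))
    print(failover_is_healthy(sample))
```

A check like this only reflects the state codes described here; any other state values are simply ignored by the sketch.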
Zero Configuration Failover Administration
This section contains the following topics:
Enabling Zero Configuration Failover for the ResourceManager
To enable zero configuration failover, do not specify the -RM parameter when you run configure.sh on each node in the cluster. However, for failover to occur, at least two nodes in the cluster must have the ResourceManager role.
For example, if the cluster includes multiple nodes with the ResourceManager role, you can run the following configure.sh command on each cluster node and no further configuration is required:
configure.sh automatically populates yarn-site.xml with the following configuration:
For more information about the ResourceManager properties in yarn-site.xml, see ResourceManager Configuration Properties.
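The two conditions for zero configuration failover (no -RM parameter, and at least two nodes with the ResourceManager role) can be expressed as a small preflight check. This is a sketch under stated assumptions: the `node_roles` mapping, the `"resourcemanager"` role label, the host names, and the function name are hypothetical, and the argument list shown is only an illustration of a configure.sh invocation, not the command from this guide.

```python
def zero_config_problems(node_roles, configure_args):
    """Return a list of reasons zero configuration failover would not work.

    node_roles: hypothetical mapping of hostname -> set of installed roles.
    configure_args: argument list intended for configure.sh on each node.
    """
    problems = []

    # Zero configuration failover requires that -RM is NOT passed.
    if "-RM" in configure_args:
        problems.append("-RM is specified; omit it for zero configuration failover")

    # Failover can only occur if two or more nodes carry the role.
    rm_nodes = sorted(
        host for host, roles in node_roles.items() if "resourcemanager" in roles
    )
    if len(rm_nodes) < 2:
        problems.append(
            f"only {len(rm_nodes)} node(s) have the ResourceManager role; "
            "at least 2 are needed"
        )
    return problems


if __name__ == "__main__":
    # Illustrative cluster layout and arguments; all values are made up.
    roles = {
        "node-a": {"resourcemanager", "nodemanager"},
        "node-b": {"resourcemanager"},
        "node-c": {"nodemanager"},
    }
    args = ["-C", "node-a:7222", "-Z", "node-a:5181", "-N", "my.cluster.com"]
    print(zero_config_problems(roles, args))
```

An empty list means both conditions from this section are met; it does not validate anything else about the cluster.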
Updating ResourceManager Ports
To simplify the failover configuration in yarn-site.xml, Warden maintains the list of ResourceManager ports in the warden.resourcemanager.conf file. For a list of the default port numbers, see Ports Used by MapR. If you want to change the default ResourceManager ports, edit both the warden.resourcemanager.conf file and the yarn-site.xml file on each ResourceManager node.
To update the port numbers, edit the values in warden.resourcemanager.conf and add the corresponding properties to yarn-site.xml:
Open the warden.resourcemanager.conf (
Edit the port numbers, which are listed using the following format:
service.extinfo.<port>=<port number>

| Port Name | Property Name in warden.resourcemanager.conf |
| --- | --- |
| ResourceManager Scheduler RPC (for ApplicationMasters) | service.extinfo.SCHEDULER_PORT |
| ResourceManager Resource Tracker RPC (for NodeManagers) | service.extinfo.RESOURCETRACKER_PORT |
| ResourceManager Client RPC | service.port |
| ResourceManager Admin RPC | service.extinfo.ADMIN_PORT |
| ResourceManager Web UI (HTTP) | service.extinfo.WEBAPP_PORT |
| ResourceManager Web UI (HTTPS) | service.extinfo.WEBAPP_HTTPS_PORT |

Open the yarn-site.xml file (
For each port that you edited, add the associated property to the yarn-site.xml file:

| Port Name | Property Name in yarn-site.xml |
| --- | --- |
| ResourceManager Scheduler RPC (for ApplicationMasters) | yarn.resourcemanager.scheduler.address |
| ResourceManager Resource Tracker RPC (for NodeManagers) | yarn.resourcemanager.resource-tracker.address |
| ResourceManager Client RPC | yarn.resourcemanager.address |
| ResourceManager Admin RPC | yarn.resourcemanager.admin.address |
| ResourceManager Web UI (HTTP) | yarn.resourcemanager.webapp.address |
| ResourceManager Web UI (HTTPS) | yarn.resourcemanager.webapp.https.address |

For example, to update the port number for the ADMIN_PORT to 9000 on each node, enter the following in the yarn-site.xml file on each node:
Restart the Warden and the ResourceManager service.
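The two-file update in the steps above can be sketched in Python. The property mapping is taken from the two tables in this section; everything else is an assumption made for illustration: the function names are hypothetical, the sample file content and its port values are made up, and the `${yarn.resourcemanager.hostname}:<port>` value form follows the stock Hadoop convention for address properties rather than anything stated in this guide.

```python
# Mapping of warden.resourcemanager.conf keys to yarn-site.xml properties,
# taken from the two tables in this section.
WARDEN_TO_YARN = {
    "service.extinfo.SCHEDULER_PORT": "yarn.resourcemanager.scheduler.address",
    "service.extinfo.RESOURCETRACKER_PORT": "yarn.resourcemanager.resource-tracker.address",
    "service.port": "yarn.resourcemanager.address",
    "service.extinfo.ADMIN_PORT": "yarn.resourcemanager.admin.address",
    "service.extinfo.WEBAPP_PORT": "yarn.resourcemanager.webapp.address",
    "service.extinfo.WEBAPP_HTTPS_PORT": "yarn.resourcemanager.webapp.https.address",
}


def set_warden_port(conf_text, key, port):
    """Return conf_text with the line for `key` rewritten as key=port.

    Raises KeyError if the key is not present in the text.
    """
    out, found = [], False
    for line in conf_text.splitlines():
        name, sep, _ = line.partition("=")
        if sep and name.strip() == key:
            out.append(f"{key}={port}")
            found = True
        else:
            out.append(line)
    if not found:
        raise KeyError(key)
    return "\n".join(out) + "\n"


def yarn_property_xml(key, port, host="${yarn.resourcemanager.hostname}"):
    """Render the yarn-site.xml property that matches a warden port key.

    The host:port value form is an assumption based on stock Hadoop defaults.
    """
    name = WARDEN_TO_YARN[key]
    return (
        "<property>\n"
        f"  <name>{name}</name>\n"
        f"  <value>{host}:{port}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    # Illustrative file content only; not the shipped defaults.
    conf = "service.port=8032\nservice.extinfo.ADMIN_PORT=8033\n"
    print(set_warden_port(conf, "service.extinfo.ADMIN_PORT", 9000))
    print(yarn_property_xml("service.extinfo.ADMIN_PORT", 9000))
```

Keeping the mapping in one dictionary makes it harder to change a port in one file and forget the matching property in the other. The sketch only produces text; writing the files on each ResourceManager node and restarting the services remain manual steps.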
Switching from Zero Configuration to Manual or Automatic Failover
You can change your ResourceManager failover implementation from zero configuration to manual or automatic failover by reconfiguring all cluster and client nodes.
For more information, see Configuring Manual Failover for the Resource Manager or Configuring Automatic Failover for the Resource Manager.