MapR 5.0 Documentation : Zero Configuration Failover for the ResourceManager

As of MapR 4.0.2, you can use zero configuration failover. With zero configuration failover, the ResourceManager role is installed on two or more nodes but the ResourceManager process only runs on one node in the cluster.

If the node running the ResourceManager process fails and the Warden on that node is unable to restart it, the Warden on each node and ZooKeeper work together to start a ResourceManager process on the cluster. ResourceManager clients connect to ZooKeeper to determine which ResourceManager node is active. Therefore, when failover occurs, ResourceManager clients are not affected, because they automatically connect to the active ResourceManager. For more information, see Warden and Failover.
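
The failover flow described above can be pictured with a small simulation. This is a minimal sketch under stated assumptions, not MapR's implementation: the `ActiveRegistry` class is a hypothetical in-memory stand-in for ZooKeeper, the node names are invented, and choosing the first healthy node in list order is a simplification of whatever selection the Wardens actually perform.

```python
class ActiveRegistry:
    """Hypothetical in-memory stand-in for the record of the active node."""

    def __init__(self):
        self.active = None


def fail_over(registry, rm_nodes, healthy):
    """Keep the active node if it is healthy; otherwise promote another RM node.

    rm_nodes: nodes that have the ResourceManager role, in a fixed order.
    healthy:  set of nodes currently able to run the process.
    """
    if registry.active in healthy:
        return registry.active
    for node in rm_nodes:
        if node in healthy:
            registry.active = node  # assumption: first healthy node wins
            return node
    registry.active = None
    raise RuntimeError("no healthy node with the ResourceManager role")


def resolve_active(registry):
    """Client side: look up the active node instead of using a fixed address."""
    if registry.active is None:
        raise RuntimeError("no active ResourceManager registered")
    return registry.active


# Hypothetical two-node example.
registry = ActiveRegistry()
rm_nodes = ["node1", "node2"]
fail_over(registry, rm_nodes, healthy={"node1", "node2"})
before = resolve_active(registry)
fail_over(registry, rm_nodes, healthy={"node2"})  # node1 has failed
after = resolve_active(registry)
print(before, after)
```

Because clients always resolve the active node through the registry, the switch from one node to the other needs no client-side configuration change, which is the point of the zero configuration approach.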

When you run the maprcli service list command, the state of the active ResourceManager process displays as 2 (running), and the state of each of the other ResourceManagers displays as 5 (standby).
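
As an illustration of how these state codes might be interpreted in a script, the following Python sketch assumes the service list has already been parsed into (node, state) pairs. The `find_active` helper and the sample node names are hypothetical; only the codes 2 and 5 come from the text above.

```python
# State codes taken from the text above; any other code is ignored here.
RUNNING = 2
STANDBY = 5


def find_active(services):
    """Return (active_node, standby_nodes) from an iterable of (node, state)."""
    services = list(services)
    active = [node for node, state in services if state == RUNNING]
    standby = [node for node, state in services if state == STANDBY]
    if len(active) != 1:
        raise ValueError(
            f"expected exactly one running ResourceManager, found {len(active)}"
        )
    return active[0], standby


# Hypothetical sample data, not real maprcli output.
sample = [("centos21", 5), ("centos22", 2), ("centos23", 5)]
active, standby = find_active(sample)
print("active:", active)
print("standby:", standby)
```

A check like this can confirm that exactly one ResourceManager is running and that at least one standby is available to take over.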

Zero Configuration Failover Administration

This section contains the following topics:

Enabling Zero Configuration Failover for the ResourceManager

To enable zero configuration failover, do not specify the -RM parameter when you run the configuration command on each node in the cluster. However, for failover to occur, at least two nodes in the cluster must have the ResourceManager role.

For example, if the cluster includes multiple nodes with the ResourceManager role, you can run the following command on each cluster node and no further configuration is required:

/opt/mapr/server/ -N mycluster -C centos21 -Z centos21 -HS centos22 -F /tmp/disks.txt -disk-opts F

The command automatically populates yarn-site.xml with the following configuration:

<!-- Resource Manager MapR HA Configs -->
	<description>MapR Zookeeper based RM Reconnect Enabled. If this is true, set the failover proxy to be the class MapRZKBasedRMFailoverProxyProvider</description>
	<description>Zookeeper based reconnect proxy provider. Should be set if and only if mapr-ha-enabled property is true.</description>
	<description>RM Recovery Enabled</description>

For more information about the ResourceManager properties in yarn-site.xml, see ResourceManager Configuration Properties.

Updating ResourceManager Ports

To simplify the failover configuration in yarn-site.xml, Warden maintains the list of ResourceManager ports in the warden.resourcemanager.conf file. For a list of the default port numbers, see Ports Used by MapR. If you want to change the default ResourceManager ports, edit both the warden.resourcemanager.conf file and the yarn-site.xml file on each ResourceManager node.

If each node requires different ResourceManager ports, you must maintain a separate yarn-site.xml for each node. Therefore, to use Central Configuration, you must create a customized configuration file for each ResourceManager node in the cluster.

To update the port numbers, edit the values in warden.resourcemanager.conf and add the corresponding properties to yarn-site.xml.

  1. Open the warden.resourcemanager.conf (/opt/mapr/conf/conf.d/warden.resourcemanager.conf). 

  2. Edit the port numbers, which are listed using the following format: service.extinfo.<port>= <port number> 

    Port Name                                                  Property Name in warden.resourcemanager.conf
    ResourceManager Scheduler RPC (for ApplicationMasters)     service.extinfo.SCHEDULER_PORT
    ResourceManager Resource Tracker RPC (for NodeManagers)    service.extinfo.RESOURCETRACKER_PORT
    ResourceManager Client RPC                                 service.port
    ResourceManager Admin RPC                                  service.extinfo.ADMIN_PORT
    ResourceManager Web UI (HTTP)                              service.extinfo.WEBAPP_PORT
    ResourceManager Web UI (HTTPS)                             service.extinfo.WEBAPP_HTTPS_PORT

  3. Open the yarn-site.xml file (/opt/mapr/hadoop/hadoop-2.x.x/etc/hadoop/yarn-site.xml).

  4. For each port that you edited, add the associated property to the yarn-site.xml file:

    Port Name                                                  Property Name in yarn-site.xml
    ResourceManager Scheduler RPC (for ApplicationMasters)     yarn.resourcemanager.scheduler.address
    ResourceManager Resource Tracker RPC (for NodeManagers)
    ResourceManager Client RPC                                 yarn.resourcemanager.address
    ResourceManager Admin RPC                                  yarn.resourcemanager.admin.address
    ResourceManager Web UI (HTTP)                              yarn.resourcemanager.webapp.address
    ResourceManager Web UI (HTTPS)                             yarn.resourcemanager.webapp.https.address

    For example, to change the ADMIN_PORT to 9000 on each node, add the yarn.resourcemanager.admin.address property, with port 9000 in its value, to the yarn-site.xml file on each node.

  5. Restart the Warden and the ResourceManager service.
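
The relationship between the two files in the steps above can be sketched in Python. This is an illustrative sketch, not a MapR tool: the `parse_ports` and `yarn_properties` helpers, the sample file excerpt, and the host name `rm-host.example.com` are hypothetical, and the `host:port` value format is an assumption based on how YARN address properties are normally written. The Resource Tracker entry is left out of the mapping because its yarn-site.xml property name is not listed in the table above.

```python
import xml.etree.ElementTree as ET

# Mapping taken from the two tables above.
WARDEN_TO_YARN = {
    "service.extinfo.SCHEDULER_PORT": "yarn.resourcemanager.scheduler.address",
    "service.port": "yarn.resourcemanager.address",
    "service.extinfo.ADMIN_PORT": "yarn.resourcemanager.admin.address",
    "service.extinfo.WEBAPP_PORT": "yarn.resourcemanager.webapp.address",
    "service.extinfo.WEBAPP_HTTPS_PORT": "yarn.resourcemanager.webapp.https.address",
}


def parse_ports(conf_text):
    """Extract the mapped port settings from key=value lines."""
    ports = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in WARDEN_TO_YARN:
            ports[key] = int(value)
    return ports


def yarn_properties(ports, host):
    """Build one <property> element per port (host:port format is assumed)."""
    elements = []
    for key, port in ports.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = WARDEN_TO_YARN[key]
        ET.SubElement(prop, "value").text = f"{host}:{port}"
        elements.append(prop)
    return elements


# Hypothetical excerpt of warden.resourcemanager.conf with ADMIN_PORT set to 9000.
sample_conf = "service.extinfo.ADMIN_PORT=9000\n"
for prop in yarn_properties(parse_ports(sample_conf), "rm-host.example.com"):
    print(ET.tostring(prop, encoding="unicode"))
```

Generating the yarn-site.xml entries from the Warden file in this way keeps the two files consistent, which matters because both must carry the same port on every ResourceManager node.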

Switching from Zero Configuration to Manual or Automatic Failover

You can change your ResourceManager failover implementation from zero configuration to manual or automatic failover by reconfiguring all of the cluster and client nodes.

For more information, see Configuring Manual Failover for the Resource Manager or Configuring Automatic Failover for the Resource Manager.