When properly licensed and configured for high availability (HA), a MapR cluster provides automatic failover for continuity throughout the stack. Configuring a cluster for HA involves running redundant instances of specific services and configuring NFS to fail over. In HA clusters, it is advisable to have 3 nodes run CLDB and 5 run ZooKeeper. In addition, 3 Hadoop JobTrackers and/or 3 HBase Masters are appropriate, depending on the purpose of the cluster. Any node or nodes in the cluster can run the MapR WebServer; in HA clusters, run more than one instance of the WebServer behind a load balancer to provide failover. NFS can be configured for HA using virtual IP addresses (VIPs). For more information, see Setting Up VIPs for NFS.
The following are the minimum numbers of each service required for HA:
- CLDB - 2 instances
- ZooKeeper - 3 instances (to maintain a quorum in case one instance fails)
- HBase Master - 2 instances
- JobTracker - 2 instances
- NFS - 2 instances
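The ZooKeeper figure follows from majority-quorum arithmetic: an ensemble remains available only while more than half of its instances are running. The short Python sketch below illustrates that arithmetic; it is not part of MapR, and the function names are hypothetical.

```python
def quorum_size(instances: int) -> int:
    """Smallest strict majority of an ensemble with the given number of instances."""
    return instances // 2 + 1


def tolerated_failures(instances: int) -> int:
    """How many instances can fail while a majority is still running."""
    return instances - quorum_size(instances)


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} instances: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Two instances tolerate no failures, three tolerate one, and five tolerate two. An even-sized ensemble tolerates no more failures than the next smaller odd size, which is why odd counts are the usual choice.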
You should run redundant instances of important services on separate racks whenever possible, to provide failover if a rack goes down. For example, the top server in each of three racks might be a CLDB node, the next might run ZooKeeper and other control services, and the remainder of the servers might be data processing nodes. If necessary, use a worksheet to plan the services to run on each node in each rack.
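A planning worksheet can also be kept as data and checked automatically. The following is a minimal sketch, assuming hypothetical node names, rack names, and service labels (they are not MapR package or role names); the minimum counts are taken from the list above. It reports any service that falls below its HA minimum or whose instances all sit on a single rack.

```python
from collections import defaultdict

# Minimum instance counts for HA, copied from the list above.
# The service labels are illustrative, not MapR identifiers.
HA_MINIMUMS = {
    "cldb": 2,
    "zookeeper": 3,
    "hbase-master": 2,
    "jobtracker": 2,
    "nfs": 2,
}


def check_plan(plan, required=HA_MINIMUMS):
    """Return a list of problems found in a node-to-services plan.

    plan maps a node name to a (rack, set_of_services) pair.
    """
    counts = defaultdict(int)   # service -> number of planned instances
    racks = defaultdict(set)    # service -> racks hosting an instance
    for rack, services in plan.values():
        for service in services:
            counts[service] += 1
            racks[service].add(rack)

    problems = []
    for service, minimum in required.items():
        if counts[service] < minimum:
            problems.append(
                f"{service}: {counts[service]} planned, need at least {minimum}")
        elif len(racks[service]) < 2:
            problems.append(f"{service}: all instances are on one rack")
    return problems


# Hypothetical three-rack layout: CLDB on the top server of each rack,
# ZooKeeper and other control services on the next server.
plan = {
    "r1-n1": ("rack1", {"cldb"}),
    "r1-n2": ("rack1", {"zookeeper", "jobtracker", "nfs"}),
    "r2-n1": ("rack2", {"cldb"}),
    "r2-n2": ("rack2", {"zookeeper", "jobtracker", "nfs", "hbase-master"}),
    "r3-n1": ("rack3", {"cldb"}),
    "r3-n2": ("rack3", {"zookeeper", "hbase-master"}),
}

if __name__ == "__main__":
    for problem in check_plan(plan) or ["plan meets the HA minimums"]:
        print(problem)
```

If the cluster runs only one of JobTracker or HBase Master, pass a reduced `required` dictionary so that the unused service is not flagged.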
- If you are installing a large cluster (100 nodes or more), CLDB nodes should not run any other service and should not contain any cluster data (see Isolating CLDB Nodes).