The disk space balancer and the replication role balancer redistribute data in the MapR storage layer to ensure maximum performance and efficient use of space:
- The disk space balancer works to ensure that the percentage of space utilized on all storage pools in the cluster is similar, so that no nodes are overloaded.
- The replication role balancer changes the replication roles of cluster containers so that the replication process uses network bandwidth evenly.
To view balancer configuration values:
- Pipe the output of the maprcli config load command through a filter such as grep.
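As a minimal sketch of this step, the function below filters the configuration output down to lines that mention the balancers. The function name show_balancer_config is hypothetical, and the sketch assumes that the balancer parameter names contain the word "balancer" and that the -json output option is available on your installation.

```shell
# Print only the configuration lines that mention "balancer".
# Assumption: balancer-related parameter names contain that word.
show_balancer_config() {
  maprcli config load -json | grep -i balancer
}
```

Run show_balancer_config on a node where maprcli is installed and compare the result with the full maprcli config load output to confirm nothing relevant was filtered out.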
To set balancer configuration values:
- Use the maprcli config save command to set the appropriate values.
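A hedged sketch of setting a single value follows. The helper name set_cluster_config is hypothetical; it builds the JSON-style key/value pair that the -values option of maprcli config save expects. The parameter name to pass is not given here, so take it from the maprcli config load output rather than from this example.

```shell
# Save one configuration value.
# $1: parameter name as reported by "maprcli config load"
# $2: new value
set_cluster_config() {
  key=$1
  value=$2
  maprcli config save -values "{\"$key\":\"$value\"}"
}
```

For example, set_cluster_config "<parameter-name>" 20 would save the value 20 for the named parameter, where <parameter-name> is a placeholder and not a real setting.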
Disk Space Balancer
The disk space balancer is a tool that balances disk space usage on a cluster by moving containers between nodes.
The disk space balancer distributes containers to storage pools in other nodes that have lower utilization than the average for that cluster. The disk space balancer checks every storage pool on a regular basis and moves containers from a storage pool when that pool's utilization meets the following conditions:
- The storage pool is over 70% full.
- The storage pool's utilization exceeds the average utilization on the cluster by a specified threshold:
  - When the average cluster storage utilization is below 80%, the threshold is 10%.
  - When the average cluster storage utilization is between 80% and 90%, the threshold is 3%.
  - When the average cluster storage utilization is between 90% and 94%, the threshold is 2%.
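The conditions above can be sketched as a small shell function. This is an illustration of the rules as stated, not the balancer's actual implementation. The function name pool_is_overutilized is hypothetical, and the sketch makes several assumptions: utilization is given in whole percentages, the threshold is measured in percentage points, a pool qualifies when it exceeds the average by at least the threshold, an average of exactly 80% or 90% falls into the higher band, and no pool qualifies when the average is 94% or above (the text gives no threshold for that range).

```shell
# Return 0 (true) when a storage pool qualifies for container moves.
# $1: storage pool utilization, whole percent
# $2: average cluster utilization, whole percent
pool_is_overutilized() {
  pool=$1
  avg=$2

  # Condition 1: the storage pool must be over 70% full.
  [ "$pool" -gt 70 ] || return 1

  # Condition 2: choose the threshold from the average cluster utilization.
  if [ "$avg" -lt 80 ]; then
    threshold=10
  elif [ "$avg" -lt 90 ]; then
    threshold=3
  elif [ "$avg" -lt 94 ]; then
    threshold=2
  else
    # Assumption: no threshold is documented at 94% or above.
    return 1
  fi

  # The pool must exceed the average by at least the threshold.
  [ $((pool - avg)) -ge "$threshold" ]
}
```

With these assumptions, a pool at 85% on a cluster averaging 70% qualifies (15 points over, threshold 10), while a pool at 75% on the same cluster does not (5 points over).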
The disk space balancer aims to ensure that the percentage of space used on all the disks in the cluster is similar.
You can view disk usage on all nodes in the Disks view by clicking Cluster > Nodes in the Navigation pane and then choosing Disks from the dropdown.
Disk Space Balancer Configuration Parameters
- A threshold for moving containers out of a given storage pool, expressed as a utilization percentage.
- A setting that specifies whether the disk space balancer runs.
- A throttle for the disk space balancer, expressed as a percentage. If it is set to 10, the balancer limits the number of concurrent container moves to 10% of the total nodes in the cluster (minimum 2).
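The throttle arithmetic can be illustrated with a short sketch. The function name max_concurrent_moves is hypothetical, and rounding down to a whole number is an assumption, since the text does not say how fractional results are handled.

```shell
# Compute the cap on concurrent container moves.
# $1: total number of nodes in the cluster
# $2: throttle value, as a percentage of total nodes
max_concurrent_moves() {
  nodes=$1
  percentage=$2
  # Assumption: integer division, so fractions are rounded down.
  moves=$((nodes * percentage / 100))
  # The documented minimum is 2.
  if [ "$moves" -lt 2 ]; then
    moves=2
  fi
  echo "$moves"
}
```

For a 100-node cluster with the throttle set to 10, this gives a cap of 10 concurrent moves. For a 10-node cluster, 10% is a single move, so the minimum of 2 applies.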
Disk Space Balancer Status
Use the maprcli dump balancerinfo command to view detailed information about the storage pools on a cluster. If there are any active container moves at the time the command is run, maprcli dump balancerinfo also returns information about the source and destination storage pools.
For more information about this command, see maprcli dump balancerinfo.
Disk Space Balancer Metrics
The maprcli dump balancermetrics command returns a cumulative count of container moves and the amount of data (in MB) moved between storage pools since the current CLDB became the master CLDB.
For more information about this command, see maprcli dump balancermetrics.
Replication Role Balancer
The replication role balancer is a tool that switches the replication roles of containers to ensure that every node has an equal share of master and replica containers (for name containers) and an equal share of master, intermediate, and tail containers (for data containers).
The replication role balancer changes the replication role of the containers in a cluster so that network bandwidth is spread evenly across all nodes during the replication process. A container's replication role determines how it is replicated to the other nodes in the cluster. For name containers (the volume's first container), replication occurs simultaneously from the master to all replica containers. For data containers, replication proceeds from the master to the intermediate container(s) until it reaches the tail containers. Replication occurs over the network between nodes, often in separate racks.
Replication Role Balancer Configuration Parameters
- A setting that specifies whether the role balancer runs.
- A throttle for the role balancer, expressed as a percentage. If it is set to 10, the balancer limits the number of concurrent role switches to 10% of the total nodes in the cluster (minimum 2).
Replication Role Balancer Status
The maprcli dump rolebalancerinfo command returns information about the number of active replication role switches. During a replication role switch, the replication role balancer selects a master or intermediate data container and switches its replication role to that of a tail data container.
For more information about this command, see maprcli dump rolebalancerinfo.