After you have defined node topology for the nodes in your cluster, you can use volume topology to place volumes on specific racks, nodes, or groups of nodes. This section discusses how to set up both types of topology from the command line or from the MapR Control System (MCS).
Setting Up Node Topology
Your node topology describes the locations of nodes and racks in a cluster. The MapR software uses node topology to determine the location of replicated copies of data. When cluster topology is optimally defined, data is replicated to separate racks, which provides continued data availability in the event of rack or node failure.
Define your cluster's topology by specifying a topology for each node in the cluster. You can use topology to group nodes by rack or switch, depending on how the physical cluster is arranged and how you want MapR to place replicated data.
Topology paths can be as simple or complex as needed to correspond to your cluster layout. In a simple cluster, each topology path might consist of the rack only (for example, /rack-1). In a deployment consisting of multiple large datacenters, each topology path can be much longer (for example, /europe/uk/london/datacenter2/room4/row22/rack5/). MapR uses topology paths to spread out replicated copies of data, placing each copy on a separate path. By setting each path to correspond to a physical rack, you can ensure that replicated data is distributed across racks to improve fault tolerance.
Recommended Node Topology
The node topology described in this section enables you to gracefully migrate data off a node in order to decommission the node for replacement or maintenance while avoiding data under-replication.
- Establish a /data topology path to serve as the default topology path for the volumes in that cluster.
- Establish a /decommissioned topology path that is not assigned to any volumes.
Migrating a Volume off a Node
When you need to migrate a data volume off a particular node, move that node from the /data path to the /decommissioned path. Because no data volumes are assigned to that topology path, standard data replication migrates the data off that node to other nodes that are still in the /data topology path.
You can run the following command to check if a given volume is present on a specified node:
Run this command for each non-local volume in your cluster. Once all the data has migrated off the node, you can decommission the node or place it in maintenance mode.
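The node move described above can be sketched as a small shell helper. This is a minimal sketch, not a tested procedure: the function name decommission_cmd and the server ID are hypothetical, and the function only prints a maprcli node move command (in the same form used in the CLDB-only example in this section) so that you can review it before running it.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the command that moves one node
# into the /decommissioned topology so that replication drains its data.
# The argument is whatever server ID your cluster reports for the node.
decommission_cmd() {
    server_id="$1"
    if [ -z "$server_id" ]; then
        echo "usage: decommission_cmd <server id>" >&2
        return 1
    fi
    printf 'maprcli node move -serverids %s -topology /decommissioned\n' "$server_id"
}

# Example (prints the command for review):
# decommission_cmd <server id>
```

Printing the command instead of executing it keeps the sketch safe to try on a machine without MapR installed.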
If you need to segregate CLDB data, create a /cldb topology node and move the CLDB nodes under /cldb. Point the topology for the CLDB volume (mapr.cldb.internal) to /cldb. See Isolating CLDB Nodes for details.
Setting Node Topology Manually
You can specify a topology path for one or more nodes using the node move command, or in the MapR Control System using the following procedure.
To set node topology using the MapR Control System:
- In the Navigation pane, expand the Cluster group and click the Nodes view.
- Select the checkbox beside each node whose topology you wish to set.
- Click the Change Topology button to display the Change Topology dialog.
- Set the path in the New Path field:
  - To define a new path, type a topology path. Topology paths must begin with a forward slash ('/').
  - To use a path you have already defined, select it from the dropdown.
- Click Move Node to set the new topology.
Setting Node Topology with a Script
For large clusters, you can specify complex topologies in a text file or by using a script. Each line in the text file or script output specifies a single node and the full topology path for that node in the following format:
<ip or hostname> <topology>
The text file or script must be specified and available on the local filesystem on all CLDB nodes:
- To set topology with a text file, set the topology text-file parameter in /opt/mapr/conf/cldb.conf to the text file name.
- To set topology with a script, set the topology script parameter in /opt/mapr/conf/cldb.conf to the script file name.
If you specify a script and a text file, the MapR system uses the topology specified by the script.
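As an illustration of the <ip or hostname> <topology> line format, the sketch below generates one mapping line per host. It assumes a hypothetical hostname convention (node-r<rack>-<n>, for example node-r07-03) and a hypothetical /data/rack<rack> layout; neither comes from MapR, so adapt both to your cluster. It shows only the output format, not how the CLDB invokes a topology script.

```shell
#!/bin/sh
# Hypothetical sketch: emit "<hostname> <topology>" lines, one per host.
# Assumes hostnames look like node-r<rack>-<n>; hosts that do not match
# are placed under /data/unknown so they are easy to spot.
topology_line() {
    host="$1"
    case "$host" in
        node-r*-*)
            rack="${host#node-r}"   # strip the "node-r" prefix, e.g. "07-03"
            rack="${rack%%-*}"      # keep the text before the first "-", e.g. "07"
            printf '%s /data/rack%s\n' "$host" "$rack"
            ;;
        *)
            printf '%s /data/unknown\n' "$host"
            ;;
    esac
}

# Print a mapping line for every hostname given as an argument.
topology_lines() {
    for h in "$@"; do
        topology_line "$h"
    done
}

# Example: topology_lines node-r01-01 node-r02-05 > topology.txt
```

Redirecting the output to a file yields a text file in the format described above; the file name topology.txt is only a placeholder.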
Setting Up Volume Topology
MapR supports data placement control, in which you can place a volume on specific racks, nodes, or groups of nodes by setting its topology to an existing node topology. You can set volume topology using the MapR Control System or with the volume move command.
To set volume topology using the MapR Control System:
- In the Navigation pane, expand the MapR Data Platform group and click the Volumes view.
- Display the Volume Properties dialog by clicking the volume name or by selecting the checkbox beside the volume name, then clicking the Properties button.
- Click Move Volume to display the Move Volume dialog.
- Select a topology path that corresponds to the rack or nodes where you would like the volume to reside.
- Click Move Volume to return to the Volume Properties dialog.
- Click Modify Volume to save changes to the volume.
Setting Default Volume Topology
By default, new volumes are created with a topology of /data. To change the default topology, use the config save command to change the cldb.default.volume.topology configuration parameter. Example:
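A minimal sketch of such a command for /data/rack02, assuming that maprcli config save accepts a JSON object through its -values argument. The helper name default_topology_cmd is hypothetical, and it prints the command for review rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the config save command that would set the
# default volume topology. Assumes `maprcli config save` takes a JSON
# object through -values; verify against your MapR version before running.
default_topology_cmd() {
    topology="$1"
    case "$topology" in
        /*) ;;  # topology paths must begin with a forward slash
        *)  echo "topology must begin with '/'" >&2; return 1 ;;
    esac
    printf "maprcli config save -values '{\"cldb.default.volume.topology\":\"%s\"}'\n" "$topology"
}

# Example: default_topology_cmd /data/rack02
```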
After running the above command, new volumes have the volume topology /data/rack02 by default, which can be useful for restricting new volume data to a subset of the cluster.
Example: Setting Up CLDB-Only Nodes
In a large cluster (100 nodes or more), create CLDB-only nodes to ensure high performance. This configuration also provides additional control over the placement of the CLDB data, for load balancing, fault tolerance, or high availability (HA). Setting up CLDB-only nodes involves restricting the CLDB volume to its own topology and making sure all other volumes are on a separate topology. Because both the CLDB-only path and the non-CLDB path are children of the root topology path, new non-CLDB volumes are not guaranteed to stay off the CLDB-only nodes. To avoid this problem, set a default volume topology. See Setting Default Volume Topology.
To set up a CLDB-only node:
- Set up the node as usual:
- Install the following packages on the node.
To set up a volume topology that restricts the CLDB volume to specific nodes:
- Move all CLDB nodes to a CLDB-only topology (e.g., /cldbonly) using the MapR Control System or the following command:
maprcli node move -serverids <CLDB nodes> -topology /cldbonly
- Restrict the CLDB volume to the CLDB-only topology. Use the MapR Control System or the following command:
maprcli volume move -name mapr.cldb.internal -topology /cldbonly
- If the CLDB volume is present on nodes not in /cldbonly, increase the replication factor of mapr.cldb.internal to create enough copies in /cldbonly using the MapR Control System or the following command:
maprcli volume modify -name mapr.cldb.internal -replication <replication factor>
- Once the volume has sufficient copies, remove the extra replicas by reducing the replication factor to the desired value using the MapR Control System or the command used in the previous step.
To move all other volumes to a topology separate from the CLDB-only nodes:
- Move all non-CLDB nodes to a non-CLDB topology (e.g., /defaultRack) using the MapR Control System or the following command:
maprcli node move -serverids <all non-CLDB nodes> -topology /defaultRack
- Restrict all existing volumes to the topology /defaultRack using the MapR Control System or the following command:
maprcli volume move -name <volume> -topology /defaultRack
All volumes except mapr.cluster.root are re-replicated to the changed topology automatically.
To prevent subsequently created volumes from encroaching on the CLDB-only nodes, set a default topology that excludes the CLDB-only topology.
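The volume-by-volume move in this example can be scripted. The sketch below is a dry run under stated assumptions: it reads volume names from standard input (one per line, however you obtain them), skips mapr.cldb.internal, and prints the maprcli volume move commands instead of executing them. The function name move_cmds is hypothetical.

```shell
#!/bin/sh
# Hypothetical dry-run helper: read volume names on stdin and print a
# `maprcli volume move` command for each one, leaving the CLDB volume
# (mapr.cldb.internal) on its own topology.
move_cmds() {
    target="${1:-/defaultRack}"
    while IFS= read -r volume; do
        [ -z "$volume" ] && continue                      # ignore blank lines
        [ "$volume" = "mapr.cldb.internal" ] && continue  # keep CLDB volume in place
        printf 'maprcli volume move -name %s -topology %s\n' "$volume" "$target"
    done
}

# Example: printf 'users\nlogs\n' | move_cmds /defaultRack
```

Review the printed commands before running them, since moving a volume triggers re-replication to the new topology.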