The MapR Control System (MCS) is a graphical, programmatic control panel for cluster administration that provides complete cluster monitoring functionality and most of the functionality of the command line. This is Part 2 of a three-part series of MCS tutorials. It covers setting up volumes, snapshots, and mirrors using the MCS.
Use the tutorials to perform the following operations in the MCS:
Part 1 (Click here)
Part 2 (This tutorial)
Part 3 (Click here)
A volume is a logical unit that you create to organize data into groups, so that you can manage the data and apply policy all at once instead of file by file. The volume structure defines how data is distributed across the nodes in your cluster.
You can create volumes for each user, department, or project. Volumes can enforce disk usage limits, set replication levels, establish ownership and accountability, and measure the cost generated by different projects or departments.
Configure volumes as soon as you can after getting your cluster up and running. Putting all your data in the cluster without organizing it into volumes can lead to headaches later. It is important to create many volumes for data storage and to choose them strategically for management. From the MCS, you can easily create and name volumes and designate their mount paths.
Volumes empower the following data management features that MapR provides:
A MapR cluster comes with certain system volumes out of the box. The following diagram shows the system volumes (blue) along with recommended volumes that you should add to your new cluster.
The root volume (mapr.cluster.root, mounted at /) contains the mount points for the other volumes. MapR provides a volume for HBase (if installed) and a /var/mapr volume containing information about cluster configuration. There is also a local volume for each node, limited by its topology to reside only on its own node.
As shown in the example above, you should add a hierarchy of volumes for users, projects and departments, to enable you to manage data for these different entities separately.
Create a Volume
Click New Volume.
Specify volume settings: for example, a volume named johnsmith with a mount path of /users/jsmith. You can also set volume topology here (the default is /data, to use all racks), and choose whether to create a normal read/write volume or a mirror volume.
For more information on what these settings mean, see Managing Data with Volumes.
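The same volume can also be described from the command line. The following is a minimal sketch, not part of the MCS workflow above: it only builds a `maprcli volume create` command and prints it, without running it. It assumes the `-name`, `-path`, and `-topology` options of `maprcli volume create`; the helper name `volume_create_command` is hypothetical, and you should check the options against the documentation for your MapR release.

```python
import shlex


def volume_create_command(name, mount_path, topology="/data"):
    """Build (but do not run) a maprcli command that creates a read/write volume.

    Assumption: `maprcli volume create` accepts -name, -path and -topology.
    """
    return [
        "maprcli", "volume", "create",
        "-name", name,            # volume name, e.g. johnsmith
        "-path", mount_path,      # mount path, e.g. /users/jsmith
        "-topology", topology,    # /data places the volume on all racks
    ]


# Example values taken from the tutorial text.
cmd = volume_create_command("johnsmith", "/users/jsmith")
print(" ".join(shlex.quote(part) for part in cmd))
```

Returning the command as a list keeps each argument separate, so it could later be passed to `subprocess.run` without shell quoting problems.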
A snapshot is a read-only image of a volume at a specific point in time. Snapshots are useful any time you need to roll back to a known good data set at a specific point in time. You can create a snapshot manually or automate the process with a schedule. If you want to automate the snapshot with a schedule, configure schedule details first.
Create a snapshot manually:
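Outside the MCS, a manual snapshot could be sketched with the command line as well. The example below is a hedged illustration: it assumes the `-volume` and `-snapshotname` options of `maprcli volume snapshot create`, and the timestamped naming scheme and the helper names are hypothetical choices made for this sketch. It builds the command without executing it.

```python
from datetime import datetime


def snapshot_name(volume, when):
    """Hypothetical naming scheme: <volume>.<date>.<time> of the snapshot."""
    return f"{volume}.{when:%Y-%m-%d.%H-%M-%S}"


def snapshot_create_command(volume, when):
    """Build (but do not run) a maprcli command that snapshots one volume.

    Assumption: `maprcli volume snapshot create` accepts -volume and -snapshotname.
    """
    return [
        "maprcli", "volume", "snapshot", "create",
        "-volume", volume,
        "-snapshotname", snapshot_name(volume, when),
    ]


print(" ".join(snapshot_create_command("johnsmith", datetime(2024, 1, 1, 2, 0, 0))))
```

Putting the creation time in the snapshot name makes it easy to see which known good data set a snapshot represents.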
Create a snapshot schedule:
Define one or more schedule rules in the Schedule Rules section:
a. From the first dropdown menu, select a frequency (Once, Yearly, Monthly, etc.)
b. From the next dropdown menu, select a time point within the specified frequency. For example: if you selected Monthly in the first dropdown menu, select the day of the month in the second dropdown menu.
c. Continue with each dropdown menu, proceeding to the right, to specify the time at which the scheduled action is to occur.
Use the Retain For field to specify how long the data is to be preserved. For example: if the schedule is attached to a volume for creating snapshots, the Retain For field specifies how far after creation the snapshot expiration date is set.
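The effect of the Retain For field can be illustrated with simple date arithmetic: the expiration date is the creation time plus the retention period. This is only a sketch of the idea, not MapR's implementation; the function names and the seven-day retention value are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def snapshot_expiration(created_at, retain_for):
    """Expiration date = creation time + retention period."""
    return created_at + retain_for


def is_expired(created_at, retain_for, now):
    """True once the current time has reached the expiration date."""
    return now >= snapshot_expiration(created_at, retain_for)


created = datetime(2024, 1, 1, 2, 0)      # snapshot taken by the schedule
retain = timedelta(days=7)                # hypothetical Retain For value

print(snapshot_expiration(created, retain))                 # 2024-01-08 02:00:00
print(is_expired(created, retain, datetime(2024, 1, 5)))    # False
print(is_expired(created, retain, datetime(2024, 1, 9)))    # True
```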
Schedule a snapshot:
A mirror volume is a read-only physical copy of a source volume. You can use mirror volumes in the same cluster (local mirroring) to provide local load balancing. Local mirror volumes can serve read requests for the most frequently accessed data in the cluster. You can also mirror volumes on a separate cluster (remote mirroring) for backup and disaster readiness purposes.
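The difference between local and remote mirroring comes down to which cluster holds the source volume. The sketch below builds (without running) a `maprcli volume create` command for a mirror volume. It assumes the `-source <volume>@<cluster>` and `-type mirror` options; some older releases may expect a numeric type value instead, so treat the flag values as assumptions to verify. The cluster names and the helper name are hypothetical.

```python
def mirror_create_command(mirror_name, source_volume, source_cluster, mount_path=None):
    """Build (but do not run) a maprcli command that creates a mirror volume.

    Assumptions: -source takes <volume>@<cluster> and -type accepts "mirror".
    """
    cmd = [
        "maprcli", "volume", "create",
        "-name", mirror_name,
        "-source", f"{source_volume}@{source_cluster}",
        "-type", "mirror",
    ]
    if mount_path is not None:
        cmd += ["-path", mount_path]   # a mirror is mounted read-only
    return cmd


# Local mirror: the source is in the same (hypothetical) cluster.
print(" ".join(mirror_create_command(
    "johnsmith-mirror", "johnsmith", "my.cluster.com", "/mirrors/jsmith")))

# Remote mirror: run on the backup cluster, pointing at the source cluster.
print(" ".join(mirror_create_command(
    "johnsmith-backup", "johnsmith", "my.cluster.com")))
```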
Create a local mirror volume:
Create a remote mirror volume: