Creating a Volume

You can create a new (Standard or Mirror) volume using MCS and/or the CLI.

Creating a Volume Using the MapR Control System

To create a new (Standard or Mirror) volume using MCS:

  1. Go to Data > Volumes and click Create Volume.
    The Create New Volume page displays.
  2. Select the Volume Type under SETTINGS in the Properties tab. Choose:
    • Standard Volume to create a read-write volume.
    • Mirror Volume to create a volume that is a read-only copy of a source volume.
      Tip: See also: Mirror Types.
  3. Specify the following required settings in the Properties tab:
    Volume Name Enter a name for the volume. A volume name can include alphanumeric characters and the following characters: . (dot), _ (underscore), and - (dash).
    Note:
    • The volume name should not begin with mapr. because mapr. is used for system volumes. If you use mapr. at the start of the volume name, the volume may not display in the default view of the list of volumes in MCS; you must select the Include System Volumes checkbox in the Volumes pane to view volumes with names beginning with mapr.
    • For tiering-enabled volumes, volume name should not exceed ninety-eight characters.
    Accountable Entity Select the type, user or group, of accountable entity and enter the name of the entity in the associated text field.
    Steps 4 to 8 are optional and allow you to define optional volume properties (for auditing, replication, and data tiering), volume access, and volume quota. If these are not defined, default values, where available, are used. You can skip to:
    • (Optional) Step 9 to associate a snapshot schedule and/or an offload schedule with the volume.
    • Step 10 to create the volume with basic settings.
    Volume Name Enter a name for the volume. A volume name can include alphanumeric characters and the following characters: . (dot), _ (underscore), and - (dash).
    Note: Do not begin the volume name with mapr. as mapr. is used for system volumes. If you use mapr. at the start of the volume name, the volume may not display in the default view of the list of volumes in MCS; you must select the Include System Volumes checkbox in the Volumes pane to view volumes with names beginning with mapr..
    Source Volume Cluster Enter the name of the cluster on which the source volume exists. Mirroring only works between two secure clusters or between two non-secure clusters. Mirroring does not work when one cluster is secure and the other is non-secure.
    Note: When setting up mirror volumes for mirroring between clusters, for the mirroring operation to run successfully, servers in one cluster cannot use the same IP addresses as servers in the other cluster. For example, if node A in cluster A has a private IP address of 10.10.20.29, no server in cluster B can have the same private IP address. Also, all the servers in destination cluster must be able to reach all the servers in the source cluster and vice versa. For example, suppose 10.10.20.29 is the only IP address used by node A in cluster A; then all servers in cluster B should be able to reach the IP address 10.10.20.29.
    Source Volume Name Enter the name of the source volume, from which the mirror volume pulls data, (after selecting the source volume cluster). If the source volume is on:
    • Same cluster, you will create a local mirror volume, which is useful for load balancing or for providing a read-only copy of a data set. See Local Mirroring for more information.
    • Another cluster, you will create a remote mirror volume, which is useful for offsite backup, for data transfer to remote facilities, and for load and latency balancing. See Creating Remote Mirrors for more information.
    Note: If you plan to enable tiering for the mirror volume, ensure that the selected source volume is also tiering-enabled. You cannot create tiering-enabled mirror volumes to mirror data in standard volumes that are not tiering-enabled.
    For information on setting up mirror cascades, see Mirror Cascades.
    Accountable Entity Select the type, user or group, of accountable entity and enter the name of the entity in the associated text field.

    Steps 4 to 8 are optional and allow you to define optional volume properties (for auditing and replication), volume access, and volume quota. If these are not defined, default values, where available, are used. You can skip to:

    • (Optional) Step 9 to associate a mirroring schedule, a snapshot schedule, and/or an offload schedule with the mirror volume.

      The mirroring schedule specifies how often to run the mirroring operation, which copies data (that has changed at the file-block level since the last data transfer) directly from the source, even if the files in the source volume are being written to or deleted.

      The snapshot schedule specifies how often to take a snapshot of the mirror volume for the purpose of preserving the state of the mirror before a subsequent mirroring operation.

      The offload schedule specified how often to offload data in the volume to the configured tier based on the criteria in the storage policy.

    • Step 10 to create the mirror volume with basic settings.
  4. (Optional) Specify the following general and auditing settings under Properties:
    Mount Specifies whether to automatically mount (Yes) or not mount (No) the volume after creation. By default, volumes are mounted immediately after creation. If this is set to Yes, you must also specify the mount path.
    Mount Path The path to mount the volume. This is required if the value for Mount is Yes.
    Note: The path must be relative to / and cannot be in the form of a global namespace path (for example, /mapr/<cluster-name>/).
    Data Tier Specifies whether (Yes) or not (No) to enable data tiering for volume.
    Collect Metrics Specifies whether (Yes) or not (No) to enable metrics collection for this volume. For more information, see Collecting Volume Metrics and Enabling Volume Metric Collection.
    Volume Access Specifies whether volume is read-only or read/write volume. Data on a read-only volume can only be read and data can both be read from and written to a read/write volume. By default, a standard volume is created with read/write access. A mirror volume can only be a read-only volume.
    Enable Auditing Select one of the following:
    • Disable auditing
    • Enable auditing for volume so that auditing can selectively be enabled for directories, files, tables, and streams in the volume
    • Enable auditing of operations on all files, tables, and streams in the volume whether or not auditing is enabled for files, tables, and streams in the volume.
    Coalesce Interval The interval of time (in minutes) to use when logging multiple READ, WRITE, or GETATTR operations on one file from one client IP address, if auditing is enabled. The default value is 60 minutes.
    Data on Wire Encryption Specifies whether (Yes) or not (No) to encrypt data in the volume during transmission. By default, the value is Yes for all new volumes in secure cluster. This is not supported on unsecure clusters.
    Data at Rest Encryption Specifies whether (Yes) or not (No) to encrypt data-at-rest. This is not supported on unsecure clusters.
  5. (Optional) Specify the following under Properties for volume (hot) data replication.
    Topology The rack path to the volume. The default topology is /data.
    Optimize Replication for The basis for the replication factor:
    • High throughput, or cascading replication, where volumes are replicated sequentially on intermediate and tail containers.
    • Low latency, or star replication, where volumes are replicated on multiple containers in parallel.
    The default value is high throughput. See Selecting a Replication Type for High Availability.
    Replication The minimum (Minimum Replication) and desired (Target Replication) number of copies for the volume data. The default minimum is 2 and the default target is 3.
    Name Container Replication The (Minimum Replication) and desired (Target Replication) number of copies for the name container associated with the volume. The default minimum is 2 and the default target is 3.
    Guarantee Min Replication Specifies whether (Yes) or not (No) to enforce minimum number of copies. If this is enabled (Yes), writes succeed only when the minimum number of copies exist. If this is enabled (Yes) and minimum number of copies are not available, the client is asked to retry.

    For more information, see Understanding Replication.

  6. (Optional) If data tier is enabled (Yes), select one of the following, and set values for associated properties.
    For offloading data to an erasure coded volume, specify values for the following properties. If values are not specified, default values are applied.
    Topology The topology of the erasure coded volume from the dropdown list.
    Storage Policy The rule for offloading data in this volume. You can click: If you do not select a storage policy, the default policy named default.ectier.rule, which is all files (p), is associated with the volume.
    Erasure Coding The erasure coding scheme, which is the number of data chunks and number of parity chunks. You can select one of the following from the Scheme dropdown menu:
    • 3,2 — for 3 data chunks and 2 parity chunks. You must have 5 or more nodes on the cluster for this option. If selected, this has 60% storage overhead and can tolerate failure of up to 2 nodes.
    • 4,2 — for 4 data chunks and 2 parity chunks. You must have 6 or more nodes on the cluster for this option. If selected, this has 50% storage overhead and can tolerate failure of up to 2 nodes.
    • 5,2 — for 5 data chunks and 2 parity chunks. You must have 7 or more nodes on the cluster for this option. If selected, this has 40% storage overhead and can tolerate failure of up to 2 nodes.
    • 6,3 — for 6 data chunks and 3 parity chunks. You must have 9 or more nodes on the cluster for this option. If selected, this has 50% storage overhead and can tolerate failure of up to 3 nodes.
    Note: Although you can create a volume even if the required number of nodes are not present, offload operation will fail if the required number of nodes are not present.
    See Selecting an Erasure Coding Scheme for more information.
    For offloading data to a low cost storage alternative on the cloud, specify values for the following properties.
    Remote Target The location to offload data to. You can click:
    Storage Policy The rule for offloading data in this volume. You can click:
    Retention Duration after Recall The number of days to retain data recalled from the tier to the MapR cluster. Once the number of days is reached, recalled data on the MapR cluster is purged (if there are no changes) or offloaded (if there are changes).
    Tier Encryption Specifies whether (Yes) or not (No) to enable encryption of data on the tier. This cannot be modified once it is set. The default value is No (disabled).
  7. (Optional) Specify the following volume (administrative and user) access control settings in the Authorization tab:
    ADMINISTRATIVE CONTROLS The users and/or groups that have one or more of the following permissions:
    Dump & Backup
    Transport large amount of data or copies of the volume on physical media to a remote cluster using backup files.
    Restore & Mirror
    Restore a volume from a dump file and create mirror volumes, which is a read-only copy of the source volume.
    Edit
    Edit volume properties, create and delete snapshots.
    Delete
    Delete the volume.
    Admin
    View and edit access control settings (but cannot perform volume operations).
    Full Control
    Perform all volume-related administrative operations except changing access control settings.
    To define administrative access control settings for another user or group, click Add Another button.
    Note: To perform this action from the command line, refer to acl set.
    By default, the root user and the volume creator have full control permissions on the volume.
    Root Directory Permission
    Set read, write, and/or execute permissions on the root directory for users, groups and others.
    USER ACCESS CONTROLS SThe users, groups, and/or roles that have and/or do not have access to read and/or write to the volume.
    To grant or block access to users, groups, and/or roles, from the:
    • Basic settings, select the type — public, (OR) user, group, or role — from the drop-down menu, specify the name of the user, group, or role, and select one or more checkbox to grant permissions.
      Tip: Click to create a copy of the associated access control setting. Click to remove the associated access control expression.
      To add access control expressions for another user, group, or role, click Add Another and repeat this step.
    • Advanced settings, or specify public (p) or user (u), group (g), and/or role (r) who have and/or do not have the type of access using the following boolean expressions and subexpressions:
      • ! — Negation operator.
      • & — AND operation.
      • | — OR operation.
      Use (), parentheses, for subexpressions.
      Note: You cannot specify user, group, or role individually if access is granted to all users (public).

      Alternatively, click associated with the type of access to use the Access Control Expression window to define access for public or users, group, and/or role. See Defining ACEs for more information.

    Note: If you switch from Basic to Advanced, the basic settings, if any, will be carried over to the advanced settings. If you switch from Advanced to Basic, all the settings will be lost because the subexpressions and AND (&) and negation (!) operations that are supported by advanced settings are not supported in the basic settings.
    By default, access is granted to all users (public). If access is granted to all users (public), access cannot also be granted individually to users, groups, and/or roles.
  8. Define the amount of disk space (optional) in megabytes (MB), gigabytes (GB), or terabytes (TB) to allocate to the volume in the Usage tab:
    By default, there is no limit on the disk space.
    Warning: If the quota set for the mirror volume is less than the quota set for its source volume, the CLDB raises an alarm. The mirroring operation will not fail, but you must decide whether to add space and increase the mirror volume quota, or remove unwanted space from the source volume and decrease its quota.
    Advisory Quota The limit, which when reached, an alarm is raised.
    Hard Quota The limit, which when reached, no further writes are allowed.
    Note: When you set a hard quota for a tiering-enabled volume, the quota is the total space allocated for the volume irrespective of the location (cluster or tier) where the volume data is stored. For example, if you allocate 1GB of hard quota for a tiering-enabled volume, writes will fail after you write 1GB of data whether or not the volume data is local (on the cluster) or offloaded (to the tier).
    Note: The value for advisory quota cannot be greater than the value for hard quota.
    The graph shows the currently used (across volumes), reserve limit (which is 90% of the total disk space on the cluster), and total amount of disk space at the cluster level.
  9. (Optional) Select the schedule(s) to associate with the volume in the Schedule tab.
    You can click Browse to select existing or click Create to create new schedule for the following:
    Snapshot Schedule The snapshot schedule specifies when snapshots should be automatically created.
    Note: If this is not defined, snapshots will not be created for this volume.
    Each snapshot created by a schedule has an expiration date that determines how long the snapshot is retained.
    Tier Offload The tier offload schedule specifies the frequency for automatically offloading data in the volume to the configured tier. This is only available for tiering enabled volumes.

    You can click Browse to select existing or click Create to create new schedule for the following:

    Snapshot Schedule The snapshot schedule specifies how often to take a snapshot of the mirror volume for the purpose of preserving the state of the mirror before a subsequent mirror operation. This way, if corrupt data is copied from the source volume's snapshot into the mirror volume, the mirror contents can be rolled back to the snapshot. Each snapshot created by a schedule has an expiration date that determines how long the snapshot is retained.
    Mirror Schedule The mirror schedule specifies how frequently the mirror volume must be synchronize with the source volume. In case of a disaster (or any type of data loss on a read-write source volume), the data can be recovered from the mirror volume, but any data written to the source volume since the last successful mirror operation will not be on the mirror volume. Therefore, you should set the mirror schedule such that it meets your RPO (Recovery Point Objective).
    Tip: For best performance, select a mirroring schedule according to the anticipated rate of data changes and the available bandwidth for mirroring. See also: Guidelines for Setting Mirror Schedules.
    Tier Offload The tier offload schedule specifies the frequency for automatically offloading data in the volume to the configured tier. This is only available for tiering enabled volumes.
    When selecting a schedule, you can choose your custom schedule, if it exists, or one of the (following) defaults:
    • Critical data
    • Important data
    • Normal data
    • Automatic Tiering Scheduler (available only for tiering-enabled volumes)
  10. Click Create Volume to create the volume.

Creating a Volume Using the CLI or the REST API

The basic command to create a volume is:

maprcli volume create -name <name>

You can create a standard read-write volume, a (local or remote) mirror volume, and/or a tiering enabled standard read-write and/or a (local or remote) mirror volume.

The basic command to create a standard volume is:

maprcli volume create -name <volume name> -path <volume path> -type rw

When creating and mounting a volume, the location of the mount point can be specified using the path parameter. The volume that is last in the path parameter is referred to as the parent volume. (The parent volume is the volume on which the volume link is created.)

For complete reference information, see volume create.

  1. Log in to the node on the cluster where you want to create the mirror.
  2. Use the volume create command to create the mirror volume.

    Specify the source volume name, provide a name for the mirror volume, and specify the type as mirror. Optionally, associate a schedule with the volume. For example:

    maprcli volume create -name volume-a -source volume-a -type mirror -schedule 2
    For complete reference information, see volume create.
  1. Log in to the node on the cluster where you wish to create the mirror.
  2. Use the volume create command to create the mirror volume.

    Specify the source volume and cluster in the format <volume>@<cluster>, provide a name for the mirror volume, and specify the type as mirror. Optionally, associate a schedule with the volume. For example:

    maprcli volume create -name volume-a -source volume-a@cluster-1 -type mirror -schedule 2

    When creating and mounting a volume, the location of the mount point can be specified using the path parameter. The volume that is last in the path parameter is referred to as the parent volume. (The parent volume is the volume on which the volume link is created.)

    For complete reference information, see volume create.
  3. Ensure that each cluster can resolve the nodes in the other cluster.
    See:
The basic command to create a tiering-enabled volume is the following:
$maprcli volume create -name <volName> -path <volmountpath> -tieringenable true 

For the complete list of supported parameters, see volume create.