Setting Up Disks for MapR

This section describes how to set up disks during the normal installation process. For other uses of the disksetup command, see the disksetup command page.

MapR formats and uses disks for the Lockless Storage Services layer (MapR Filesystem) and records these disks in the disktab file. In a production environment, or when testing performance, configure MapR to use physical hard drives and partitions. In some cases, you must reinstall the operating system on a node to make the physical hard drives available for direct use by MapR; reinstalling also gives you an unrestricted opportunity to configure the drives. If the installation procedure assigns hard drives to the Linux Logical Volume Manager (LVM) by default, explicitly remove the drives you plan to use with MapR from the LVM configuration. A common layout lets LVM manage one physical drive containing the operating system partition(s) and leaves the remaining drives unmanaged by LVM for use with MapR.
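As a rough aid for the LVM check described above, the following Python sketch reports which candidate drives still appear as LVM physical volumes. It is an illustration only, not part of MapR: the function name lvm_managed is hypothetical, and it assumes you have captured the output of the LVM command "pvs --noheadings -o pv_name" as text. The partition-name matching (a numeric suffix, optionally preceded by "p") is also an assumption about typical Linux device naming.

```python
import re


def lvm_managed(candidates, pvs_output):
    """Return the candidate devices that LVM still manages.

    candidates: device paths you intend to give to MapR, e.g. "/dev/sdb".
    pvs_output: text assumed to come from `pvs --noheadings -o pv_name`,
                one physical volume path per line.
    """
    pv_names = [line.strip() for line in pvs_output.splitlines() if line.strip()]
    managed = []
    for dev in candidates:
        # Match the whole device or one of its partitions
        # (e.g. /dev/sdb, /dev/sdb1, /dev/nvme0n1p2).
        pattern = re.compile(re.escape(dev) + r"(p?\d+)?$")
        if any(pattern.match(pv) for pv in pv_names):
            managed.append(dev)
    return managed


if __name__ == "__main__":
    sample = "  /dev/sda2\n  /dev/sdc\n"
    print(lvm_managed(["/dev/sdb", "/dev/sdc", "/dev/sdd"], sample))
```

Any device this sketch reports should be removed from the LVM configuration before it is handed to MapR.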
Note: It is not necessary to set up RAID (Redundant Array of Independent Disks) on disks used by MapR Filesystem. MapR uses the disksetup script to set up storage pools. In most cases, you should let MapR calculate storage pools using the default stripe width of two or three disks. If you anticipate a high volume of random-access I/O, you can use the -W option with disksetup to specify larger storage pools of up to 8 disks each.
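To make the effect of the stripe width concrete, the Python sketch below groups a list of disks into pools of a given width and assembles a disksetup command line. It is a planning illustration under stated assumptions: the helper names are hypothetical, the simple in-order grouping may not match how disksetup actually assigns disks to storage pools, and the script path /opt/mapr/server/disksetup and the -F disk-list option are assumed from typical MapR installations rather than taken from this section.

```python
def plan_storage_pools(disks, stripe_width):
    """Group disks, in order, into pools of at most stripe_width disks.

    Assumption: in-order grouping; the real disksetup script may
    group disks differently.
    """
    if stripe_width < 1 or stripe_width > 8:
        # The upper bound of 8 comes from the note above; the lower
        # bound of 1 is an assumption.
        raise ValueError("stripe width must be between 1 and 8")
    return [disks[i:i + stripe_width] for i in range(0, len(disks), stripe_width)]


def disksetup_args(disk_list_file, stripe_width=None):
    """Build an argument list for disksetup (path and -F are assumed)."""
    args = ["/opt/mapr/server/disksetup"]
    if stripe_width is not None:
        args += ["-W", str(stripe_width)]
    args += ["-F", disk_list_file]
    return args


if __name__ == "__main__":
    disks = ["/dev/sd" + letter for letter in "bcdefghijk"]  # 10 example disks
    for pool in plan_storage_pools(disks, 4):
        print(pool)
    print(" ".join(disksetup_args("/tmp/disks.txt", stripe_width=4)))
```

With ten disks and a width of 4, this grouping yields two pools of four disks and one pool of two, which shows how a wider stripe produces fewer, larger pools.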
The following procedures are intended for use on physical clusters or Amazon EC2 instances. On EC2 instances, EBS volumes can be used as MapR storage, although performance will be slow.
Note: If you are using MapR on Amazon EMR, you do not have to use this procedure; the disks are set up for you automatically.