To make it possible for users to provision clusters, you need to set up the following objects:
- Images of the MapR Distribution
- A “flavor” optimized for MapR, which defines the resource allocation for an instance
- Node group templates that describe attributes of a set of cluster nodes, including which services a node will run
- Cluster templates that describe attributes of clusters that can be deployed, with reference to existing node group templates
The following setup is typically done via the OpenStack Dashboard, which should already be running in your OpenStack environment.
After starting the Sahara service, you will be able to access new Sahara functionality via the dashboard. If you prefer, you can use CLI commands instead of the UI for certain steps, as documented here:
Log in to the Dashboard by going to http://<ip_address>, where <ip_address> is the IP address of your OpenStack node.
Step 1: Adding MapR Images to OpenStack
Sahara uses the Glance image service to maintain a list of all of the software images that are available to the current user. (Images are stored in the Swift component of OpenStack.)
- On the Dashboard, select Project > Compute > Images.
- Click the Create Image button in the top-right corner of the screen.
- Fill out the Create an Image screen:
Type a name for the image.
Under Image Source, select Image File.
Under Image File, choose one of the prebuilt images.
Under Format, select QCOW2 – QEMU Emulator.
Enter the Architecture (optional; for example: x86_64).
Enter the Minimum Disk and Minimum RAM size (optional).
- Select Public and Protected.
Using the CLI to Create Images
Alternatively, you can use Sahara CLI commands to create images. For example:
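As a rough sketch only (not the original example), the dashboard fields above map onto options of the `image create` command in the unified `openstack` client. This assumes that client is installed in your environment, which this guide does not state; the image name and file path are hypothetical. The Python helper below only assembles and prints the command line so that you can review it before running it.

```python
import shlex


def image_create_command(name, image_file, architecture=None,
                         min_disk_gb=None, min_ram_mb=None):
    """Build an `openstack image create` command line for a QCOW2 image.

    Mirrors the dashboard form: QCOW2 format, Public and Protected selected,
    with architecture and minimum disk/RAM left optional.
    """
    argv = [
        "openstack", "image", "create",
        "--disk-format", "qcow2",
        "--container-format", "bare",
        "--file", image_file,
        "--public",
        "--protected",
    ]
    if architecture:
        argv += ["--property", f"architecture={architecture}"]
    if min_disk_gb is not None:
        argv += ["--min-disk", str(min_disk_gb)]
    if min_ram_mb is not None:
        argv += ["--min-ram", str(min_ram_mb)]
    argv.append(name)  # the image name is the final positional argument
    return argv


# Hypothetical image name and path, for illustration only.
cmd = image_create_command("mapr-ubuntu", "/tmp/mapr-ubuntu.qcow2",
                           architecture="x86_64")
print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without OpenStack credentials.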
MapR distribution images for Ubuntu and CentOS can be found at the following locations:
Checking Security Group Rules
If necessary, increase the security group rules limit to 30 or more. (The MapR plugin requires more rules than the default of 20 that may be set in your environment.)
On the dashboard, go to Admin > Defaults > Update Defaults > Security Group Rules.
You need administrator privileges to change Defaults.
Step 2: Creating a Flavor
Create a flavor that is optimized for MapR deployments. Use the following recommended settings, as shown in the example:
VCPUs: 2 or more
RAM (MB): 8192 or more
Root Disk (GB): 15 or more
Ephemeral Disk (GB): 10 or more
Swap Disk (MB): 8192 or more
In order to create or modify flavors, you will need administrator privileges.
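If you manage flavors from scripts, the recommended minimums above can be checked before a flavor is created. The following is a minimal sketch; the dictionary keys are names chosen for this example and are not OpenStack field names.

```python
# Recommended minimums for a MapR flavor, taken from the list above.
# The key names are illustrative only.
MAPR_FLAVOR_MINIMUMS = {
    "vcpus": 2,
    "ram_mb": 8192,
    "root_disk_gb": 15,
    "ephemeral_disk_gb": 10,
    "swap_mb": 8192,
}


def flavor_shortfalls(flavor):
    """Return {field: (actual, minimum)} for every field below the recommendation.

    A field that is missing from `flavor` is treated as 0.
    """
    return {
        field: (flavor.get(field, 0), minimum)
        for field, minimum in MAPR_FLAVOR_MINIMUMS.items()
        if flavor.get(field, 0) < minimum
    }


# A candidate flavor whose root disk is too small.
candidate = {
    "vcpus": 4,
    "ram_mb": 8192,
    "root_disk_gb": 10,
    "ephemeral_disk_gb": 10,
    "swap_mb": 8192,
}
print(flavor_shortfalls(candidate))  # {'root_disk_gb': (10, 15)}
```

An empty result means the candidate meets every recommended minimum.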
Step 3: Registering Images for the MapR Plugin
Sahara users who want to provision clusters have to specify additional properties for images that were previously added to the Glance database, including specific “plugin tags” for the images. These tags associate images with the MapR plugin and a specific version of the MapR Distribution.
Use the Image Registry to register images for use with the MapR plugin.
Go to Project > Data Processing > Image Registry > Register Image.
- Select the image from the Image list.
- Enter cloud-user for CentOS or ubuntu for Ubuntu in the User Name field.
- Select the Plugin and Version tags, then click the Add plugin tags button.
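The same registration data (a user name plus plugin and version tags) can be prepared in a script. The sketch below only builds the two request bodies; the field names `username` and `tags` reflect our understanding of the Sahara image registry REST API and should be checked against the API reference for your release. The tag values shown are placeholders, not confirmed tag names.

```python
import json

# Default login users for the prebuilt images, as listed in the steps above.
DEFAULT_USERNAMES = {"centos": "cloud-user", "ubuntu": "ubuntu"}


def registration_payloads(os_name, plugin_tag, version_tag):
    """Return (register_body, tag_body) for registering one image.

    Assumption: the registry accepts a body with "username" to register
    the image and a body with "tags" to attach the plugin tags.
    """
    username = DEFAULT_USERNAMES[os_name.lower()]
    register_body = {"username": username}
    tag_body = {"tags": [plugin_tag, version_tag]}
    return register_body, tag_body


# "mapr" and "<mapr_version>" are placeholders for the Plugin and Version tags.
register_body, tag_body = registration_payloads("CentOS", "mapr", "<mapr_version>")
print(json.dumps(register_body))
print(json.dumps(tag_body))
```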
Step 4: Creating MapR Node Group Templates
The next step is to create node group templates. These templates describe the type of workload for a node in a cluster, in terms of the MapR and ecosystem processes that will run and the available resources (as defined by the flavor you already created).
Go to Project > Data Processing > Node Group Templates > Create Template.
Select MapR Hadoop Distribution and the appropriate version, then click Create.
On the Create Node Group Template screen, enter a descriptive template name and select the MapR flavor that you defined earlier.
Note: Do not use spaces in the template name.
A simple configuration might provide three templates, for Worker (data) nodes, Master (control) nodes, and edge nodes. A different set of processes will run on each group of nodes.
FileServer is required for all node groups. Make sure you select the same MapR version for every node group template that will be used in combination with other templates. For MRv2, you need at least one HistoryServer node (as well as ResourceManager and NodeManager nodes). See also Example of a Sahara MapR Configuration.
Note: The dashboard does not provide an Edit mode for templates, so be sure to select the process lists carefully before creating them. (You can copy and delete templates but you cannot modify them.)
Also select the appropriate values for Storage Location, Floating IP pool, and Auto Security Group. Carefully select the processes that you want to run on this group of nodes, then click the Create button. For example, a DataNodeGroup template might run only the FileServer and NodeManager processes.
Repeat this process and create templates for other node groups.
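Because templates cannot be edited after they are created, it can help to check a planned set of node groups against the rules above before entering them in the dashboard. The following is a minimal sketch; the dictionary layout, the MasterNodeGroup name, and the "X.Y.Z" version string are made up for this example.

```python
def check_node_groups(templates, mrv2=True):
    """Return a list of problems found in a planned set of node group templates.

    Each template is a dict with "name", "version", and "processes" keys
    (an illustrative layout, not a Sahara format).
    """
    problems = []
    for t in templates:
        if " " in t["name"]:
            problems.append(f"{t['name']!r}: template name contains a space")
        if "FileServer" not in t["processes"]:
            problems.append(f"{t['name']}: FileServer is missing")

    # Templates that are combined in one cluster must share a MapR version.
    versions = {t["version"] for t in templates}
    if len(versions) > 1:
        problems.append(f"mixed MapR versions: {sorted(versions)}")

    # MRv2 needs these services somewhere in the cluster.
    if mrv2:
        all_processes = set().union(*(t["processes"] for t in templates))
        for required in ("ResourceManager", "NodeManager", "HistoryServer"):
            if required not in all_processes:
                problems.append(f"no node group runs {required}")
    return problems


templates = [
    {"name": "DataNodeGroup", "version": "X.Y.Z",
     "processes": ["FileServer", "NodeManager"]},
    {"name": "MasterNodeGroup", "version": "X.Y.Z",
     "processes": ["FileServer", "ResourceManager"]},
]
problems = check_node_groups(templates)
print(problems)  # ['no node group runs HistoryServer']
```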
After selecting any YARN process in the list, you can specify YARN parameters by clicking the tab at the top of the screen. For example, if you select ResourceManager, you can scroll back to the top of the screen and select the YARN Parameters tab.
Enter the properties you want to set for each YARN service. Click Show full configuration to show all available properties.
Step 5: Creating MapR Cluster Templates
The last setup step is to create templates for MapR clusters that users can launch. Define cluster templates by referencing existing node group templates and setting other properties as required.
Go to Project > Data Processing > Cluster Templates > Create Template. Select the MapR plugin name and Hadoop version, then click Create.
On the second screen, enter the name of the template. You can also specify “anti-affinity groups” for processes, which means that these processes may not be launched more than once on a single host.
On the Node Groups tab, select node group templates (click the + sign) and specify the number of nodes per group in the Count column.
On the General Parameters tab, check the Enable MapR-DB option if required. Go to the remaining tabs and select other properties as needed, then click Create.
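The cluster template settings described above (a name, a plugin and version, node groups with counts, and optional anti-affinity processes) can also be written down as data. The sketch below is loosely modeled on the JSON that Sahara accepts for cluster templates, but it is an assumption rather than an exact request body: the real API identifies node group templates by ID, which is omitted here, and all names and the "X.Y.Z" version are hypothetical.

```python
import json


def cluster_template(name, plugin_name, hadoop_version, node_groups,
                     anti_affinity=()):
    """Describe a cluster template as a plain dictionary.

    node_groups is a list of (node_group_template_name, count) pairs,
    matching the Count column on the Node Groups tab.
    """
    for group_name, count in node_groups:
        if count < 1:
            raise ValueError(f"{group_name}: count must be at least 1")
    return {
        "name": name,
        "plugin_name": plugin_name,
        "hadoop_version": hadoop_version,
        # Processes that should not run more than once on a single host.
        "anti_affinity": list(anti_affinity),
        "node_groups": [{"name": g, "count": c} for g, c in node_groups],
    }


template = cluster_template(
    "mapr-small", "mapr", "X.Y.Z",
    [("MasterNodeGroup", 1), ("DataNodeGroup", 3)],
    anti_affinity=["NodeManager"],
)
total_nodes = sum(group["count"] for group in template["node_groups"])
print(json.dumps(template, indent=2))
print("total nodes:", total_nodes)
```

Keeping the node counts next to the group names makes it easy to see how many instances a launch will request before you click Create.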