The MapR quick installer automates cluster deployment.
The nodes in a MapR cluster can be one of the following types:
Control nodes manage the operation of the cluster. Control nodes host the ZooKeeper, CLDB, JobTracker, and Webserver services.
Data nodes store and process data using Hadoop ecosystem tools such as MapReduce, Hive, or MapR Tables.
Dual nodes combine control and data node functionality.
Client nodes provide controlled user access to the cluster.
For more information about node types, see Node Types.
Before You Start
Determine how many control nodes your cluster will have. The MapR installer supports one or three control nodes. Three control nodes are typically sufficient for clusters up to approximately 100 nodes.
Ensure that each node in your cluster has access to the internet. If any node does not have internet access, complete an advanced installation instead.
Determine which nodes in your cluster will perform as data or client nodes. The MapR installer supports an arbitrary number of data or client nodes.
For each node in the cluster, identify which disks you want to allocate to the MapR file system. If the same set of disks and partitions applies for all nodes in the cluster, you can use interactive mode for the installer. To specify a distinct set of disks and partitions for individual cluster nodes, you need to use a configuration file. The installer’s interactive mode and configuration files are discussed in depth later in this document.
For more information and guidelines about the MapR installation process, see About Installation.
Quick Installer Requirements
The quick installer runs on the following operating systems:
Red Hat Enterprise Linux (RHEL) or CentOS version 6.1 and later, with the EPEL repository installed.
Ubuntu Linux version 12.04
The quick installer installs MapR on nodes that meet the following requirements:
Python 2.6 or later must be installed.
The operating system must be one of the following:
CentOS/Red Hat 6.1 or later
SuSE 11 or later
The operating system on each node must meet the quick installer package dependencies.
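The Python requirement above can be checked on a node before you start. The following is a minimal sketch, not part of the MapR tooling: the `python_version_ok` helper is a hypothetical name, and the check assumes the interpreter is on the path as `python`.

```shell
#!/bin/sh
# Return success when a "Python X.Y.Z" version string is 2.6 or later.
python_version_ok() {
    ver=${1#Python }      # strip the leading "Python " label
    major=${ver%%.*}      # text before the first dot
    rest=${ver#*.}        # text after the first dot
    minor=${rest%%.*}     # text before the next dot, if any
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 6 ]; }
}

# Python 2 prints its version on stderr, hence the 2>&1.
if command -v python >/dev/null 2>&1; then
    if python_version_ok "$(python -V 2>&1)"; then
        echo "python: OK"
    else
        echo "python: older than 2.6"
    fi
else
    echo "python: not found"
fi
```

Run the script on each node, or over ssh, before starting the installer.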
Operating System Package Dependencies Ubuntu
Before You Install
You can install the MapR distribution for Hadoop on a set of nodes from any machine that can connect to the nodes. The machine you install from does not need to be one of the cluster nodes. The following steps set up the installing machine:
- Download the mapr-setup file from one of the following URLs:
For an Ubuntu installation, http://package.mapr.com/releases/v3.1.1/ubuntu/
For a Red Hat or CentOS installation, http://package.mapr.com/releases/v3.1.1/redhat/
You can download the mapr-setup file with a utility such as wget.
- Navigate to the directory where you downloaded the mapr-setup file and enable execute permissions with the following command:
$ chmod 755 mapr-setup
- Run mapr-setup from the directory where you downloaded it to unpack the installer files to the /opt/mapr-installer directory. The user running mapr-setup must have write access to the /tmp directories. Alternatively, execute mapr-setup with sudo privileges, as in the following command:
$ sudo ./mapr-setup
You are now ready to install.
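The setup steps above can be collected into one short script. This is a sketch under assumptions: it assumes the mapr-setup file sits directly under the Ubuntu release URL listed above, and the `run` helper and `RUN` variable are hypothetical conveniences that print each command unless `RUN=1` is set.

```shell
#!/bin/sh
# Assumption: the mapr-setup file is served directly under the release URL.
RELEASE_URL="http://package.mapr.com/releases/v3.1.1/ubuntu"
SETUP_URL="$RELEASE_URL/mapr-setup"

# Execute the command when RUN=1; otherwise only print it (dry run).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

run wget "$SETUP_URL" &&        # download the setup file
run chmod 755 mapr-setup &&     # enable execute permissions
run sudo ./mapr-setup           # unpack to /opt/mapr-installer
```

Review the printed commands first, then rerun the script with `RUN=1` to perform the download and unpack. For a Red Hat or CentOS installation, point `RELEASE_URL` at the redhat URL instead.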
Using the MapR Quick Installer
You can use the MapR quick installer in interactive mode from the command line or provide a configuration file. Details about the format and syntax of the configuration file are provided later in this document.
Before you begin installing, verify that all the nodes are configured to have the same login information. If you are using the quick installer in interactive mode, described later in this document, verify that all of the nodes have the same disks for use by the MapR Hadoop Platform.
Installing from the Command Line with Interactive Mode
The default invocation of the MapR quick installer requires the root user or sudo privileges, as in the following example:
# sudo /opt/mapr-installer/bin/install -K -s new
For more information on the syntax and options for the quick installer, see the Quick Installer Options section later in this document.
Interactive Mode Sample Session
The following output reflects a typical interactive-mode session with the MapR quick installer. User input is in bold.
Verifying install pre-requisites
[MapR Installer ASCII-art banner]
An Installer config file is typically used by experienced MapR admins to skip through the interview process.
Do you have a config file (y/n) [n]: n
Enter the hostnames of all the control nodes separated by spaces or commas : control-host-01,control-host-02,control-host-03
Enter the hostnames of all the data nodes separated by spaces or commas :
Set MapR User Name [mapr]:
Set MapR User Password [mapr]:
Is this cluster going to run MapReduce? (y/n) [y]:
Is this cluster going to run Apache HBase? (y/n) [n]:
Is this cluster going to run MapR M7? (y/n) [y]:
Note: MapR Tables require the M7 license level.
Enter the full path of disks for hosts separated by spaces or commas : /dev/sdb
Once you’ve specified the cluster’s configuration information, the MapR quick installer displays the configuration and asks for confirmation:
Current Information (Please verify if correct)
Cluster Name: "my.cluster.com"
MapR User Name: "mapr"
MapR Group Name: "mapr"
MapR User UID: "2000"
MapR User GID: "2000"
MapR User Password (Default: mapr): "****"
WireLevel Security: "n"
MapReduce Services: "y"
MapR M7: "y"
Disks to use: "/dev/sdb"
Client Nodes: ""
Control Nodes: "control-host-01,control-host-02,control-host-03"
Data Nodes: ""
Repository (will download core software from here): "http://package.mapr.com/releases"
Ecosystem Repository (will download packages like Pig, Hive etc from here): "http://package.mapr.com/releases/ecosystem"
MapR Version to Install: "3.1.1"
Java Version to Install: "OpenJDK7"
Allow Control Nodes to function as Data Nodes (Not recommended for large clusters): "n"
Metrics DB Host and Port: ""
Metrics DB User Name: ""
Metrics DB User Password: ""
Metrics DB Schema: ""
(c)ontinue with install, (m)odify options, or save current configuration and (a)bort? (c/m/a) [c]: m
At this point you are ready to continue with installation.
Here is the complete list of configuration properties you can change:
Pick an option to modify
N] Cluster Name: "my.cluster.com"
u] MapR User Name: "mapr"
g] MapR Group Name: "mapr"
U] MapR User UID: "2000"
G] MapR User GID: "2000"
p] MapR User Password: "****"
S] WireLevel Security: "n"
d] Disk Settings: "/dev/sdb"
c] Client Nodes: ""
C] Control Nodes: "control-host-01,control-host-02,control-host-03"
D] Data Nodes: ""
b] Control Nodes to function as Data Nodes: "n"
v] Version: "3.1.1"
L] Local Repository: "False"
mr] MapReduce: "y"
m7] MapR M7: "y"
hb] HBase: "n"
uc] Core Repo URL: "http://package.mapr.com/releases"
ue] Ecosystem Repo URL: "http://package.mapr.com/releases/ecosystem"
dbh] Metrics DB Host and Port: ""
dbu] Metrics DB User: ""
dbp] Metrics DB Password: ""
dbs] Metrics DB Schema: ""
(c)ontinue with install, (m)odify options, or save current configuration and (a)bort? (c/m/a) [c]: c
SUDO Username: root
SSH Username: juser
SSH password: ****
sudo password [defaults to SSH password]: ****
The quick installer first sets up the control nodes in parallel, then sets up data nodes in groups of ten nodes at a time. The MapR quick installer automatically downloads and installs prerequisite packages.
Quick Installer Options
All of the MapR quick installer’s options are optional. If you use any options, follow them with either the new or the add parameter to specify a new installation or an addition to an existing installation.
mapr-install [-h] [-s] [-U SUDO_USER] [-u REMOTE_USER]
[--private-key PRIVATE_KEY_FILE] [-k] [-K]
[--skip-checks] [--quiet] [--cfg CFG_LOCATION]
[--debug] [--password REMOTE_PASS]
-h or --help
Displays help text.
-u or --user <remote user>
Specifies a user name that the MapR quick installer uses to connect to the cluster nodes.
-k or --ask-pass
Requests the remote ssh password interactively.
--password <remote password>
Specifies the remote ssh user’s password. Note: You cannot use this option if you are specifying a private key with the --private-key option.
--private-key <path to private key file>
Specifies a path to a private key file used to authenticate the connection. Note: You cannot use the --password option if you are specifying a private key.
-s or --sudo
Executes operations on the target nodes using sudo. If the user specified with the -u option is not root, you must use this option.
-U or --sudo-user <sudo user>
Specifies the user name of the sudo user. This user name is root on most systems.
-K or --ask-sudo-pass
Requests the sudo password interactively.
--sudo-password <password>
Specifies the sudo user’s password.
--skip-checks
Skips requirements pre-checks.
--quiet
Runs the installer in a non-interactive mode.
--cfg <path to config file location>
Installs with the configuration file at the specified path.
--debug
Runs in debug mode. Debug mode includes more verbose reports on installer activity.
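The rules above on combining options can be illustrated with a small helper that assembles a command line. This is a sketch only: `install_cmd` is a hypothetical function, the user name and key path are placeholders, and the choice to always prompt for the sudo password with -K is an assumption that may not suit nodes with passwordless sudo.

```shell
#!/bin/sh
# Print an installer command line for a remote user, an optional private
# key file, and a mode (new or add).
install_cmd() {
    user=$1; key=$2; mode=$3
    cmd="/opt/mapr-installer/bin/install -u $user"
    if [ -n "$key" ]; then
        cmd="$cmd --private-key $key"   # key authentication: no --password
    else
        cmd="$cmd -k"                   # prompt for the ssh password
    fi
    if [ "$user" != "root" ]; then
        cmd="$cmd -s -U root -K"        # non-root users must use sudo
    fi
    echo "$cmd $mode"                   # options are followed by new or add
}

install_cmd juser "" new
install_cmd juser /home/juser/cluster.pem add
```

The helper only prints the command, so you can inspect it before running it yourself.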
The MapR Quick Installer Configuration File
The example file config.example in the /opt/mapr-installer/bin directory shows the expected format of an installation configuration file.
# Each Node section can specify nodes in the following format
# Node: disk1, disk2, disk3
# Specifying disks is optional, in which case the default disk information
# from the Default section will be picked up
control-01: /dev/disk1, /dev/disk2, /dev/disk3
control-02: /dev/disk3, /dev/disk9
control-03: /dev/sdb, /dev/sdc, /dev/sdd
data-02: /dev/sdb, /dev/sdc, /dev/sdd
data-04: /dev/sdb, /dev/sdd
MapReduce = true
YARN = false
HBase = false
M7 = true
ControlNodesAsDataNodes = true
WirelevelSecurity = false
LocalRepo = false
ClusterName = my.cluster.com
User = mapr
Group = mapr
Password = default_mapr_password
UID = 2000
GID = 2000
Disks = /dev/sdz
Version = 3.1.1
For a new installation, all of the sections must be present in the configuration file, though the [Client_Nodes] sections can be left empty. For additions to an existing installation, the [Client_Nodes] must be present, although they can be left empty. Other sections in the configuration file are silently ignored for additions.
The value of the Disks element of the [Default] section provides a fallback in the case that a node is specified in a previous [Client_Nodes] section without any disk information.
You can omit values for the keys in the [Default] section, but each of the keys must be present.
The Quick Installer Manifest File
The MapR quick installer generates a manifest file named manifest.yml in the /opt/mapr-installer/var directory. The manifest file stores your cluster’s installation state. When you specify the
Because the manifest file is generated on the node from which you installed MapR, you must run the quick installer from the same node if you are performing an addition to an existing installation. Because new installations do not reference a manifest file, you can perform a new installation from any node.
The Quick Installer fails with permissions errors: Many Ubuntu systems disable the root login for security reasons.
Resolution: Start the quick installer with the following options:
# sudo /opt/mapr-installer/bin/install -u <user> -s -U root [--sudo-password <password> | --ask-sudo-pass] new
You must use exactly one of the --sudo-password or --ask-sudo-pass options. The --sudo-password option requires you to type the sudo password in the command line. The --ask-sudo-pass option requests the sudo password interactively.
Client disconnection disrupts my installation process: To prevent issues with client disconnection from affecting the install process, run the MapR quick installer from a screen or tmux session.
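One way to follow this advice is to start a named session before launching the installer. The sketch below only prints the command that starts a session, preferring tmux and falling back to screen. The session name `mapr-install` and the `session_cmd` helper are hypothetical.

```shell
#!/bin/sh
SESSION="mapr-install"   # hypothetical session name

# Print the command that starts a detachable session with whichever
# terminal multiplexer is installed on the installing machine.
session_cmd() {
    if command -v tmux >/dev/null 2>&1; then
        echo "tmux new-session -s $SESSION"
    elif command -v screen >/dev/null 2>&1; then
        echo "screen -S $SESSION"
    else
        echo "none: install tmux or screen first"
    fi
}

session_cmd
```

Run the printed command, then start the quick installer inside the session. If your connection drops, log in again and reattach with `tmux attach -t mapr-install` or `screen -r mapr-install`.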
Using the MapR Quick Installer on a cloud installation: Cloud computing services assign you a private key for use with your cloud computing nodes. Typically, private key files use the .pem extension. To use this private key with the MapR quick installer, verify that the permissions for the file are 0600 (-rw-------). You can use the chmod command to set the permissions, as in the following example:
$ chmod 0600 filename.pem
Once the file has the correct permissions, specify the path to the private key file with the --private-key option.
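The permission fix and the option can be combined in a small helper. This is a sketch: `key_option` is a hypothetical function name, and the key path in the usage comment is a placeholder.

```shell
#!/bin/sh
# Restrict a private key file to owner read/write (0600), then print the
# option to pass to the quick installer.
key_option() {
    key=$1
    [ -f "$key" ] || { echo "no such key file: $key" >&2; return 1; }
    chmod 0600 "$key" || return 1
    echo "--private-key $key"
}

# Usage (placeholder path):
#   key_option "$HOME/filename.pem"
```

The printed option can then be added to the installer command line in place of the --password or --ask-pass options.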
The installer hangs at the ‘Configuring MapR Software’ step: The installer reports its activity with output similar to the following example:
* 16:25:31 Install OpenJDK packages
* 16:27:42 MapR Repository Initialization
* 16:27:42 MapR Repository Initialization for RedHat
* 16:28:27 Install MapR Packages
* 16:29:04 Disable MapR Services until configuration
* 16:29:05 Configure MapR software
One potential cause of this error condition is that the MapR user specified already exists on one of the nodes. In this case, the installer does not overwrite the credentials for that existing user and cannot authenticate to that node.
Resolution: Examine the log files to determine the precise cause of the error.
The apt-get utility fails with a ‘cannot get lock’ error message: The MapR Quick Installer requires root privileges. When root privileges are not available, this error message can result.
Resolution: Check the sudo or sudo-user settings on the cluster nodes, then run the MapR Quick Installer with the -u <user> -s -U root -K new flags, as in the following example:
# sudo /opt/mapr-installer/bin/install -u <user> -s -U root -K new
To complete the post installation process, follow these steps:
Access the MCS by entering the following URL in your browser, substituting the IP address of a control node on your cluster:
Compatible browsers include Chrome, Firefox 3.0 and above, Safari (see Browser Compatibility for more information) and Internet Explorer 10 and above.
- If a message about the security certificate appears, click Proceed anyway.
- Log in with the MapR user name and password that you set during the installation.
- To register and apply a license, click Manage Licenses in the upper right corner, and follow the instructions to add a license via the web.
See Managing Licenses for more information.
- Create separate volumes so you can specify different policies for different subsets of data. See Managing Data with Volumes for more information.
- Set up topology so the cluster is rack-aware for optimum replication. For more information, see Setting Up Topology.