The Quick Installer is deprecated. Use the MapR Installer instead.

MapR's Quick Install method automates the installation process for you. It is designed to get a small-scale cluster up and running quickly, with a minimum of user intervention. When you run the MapR installer, it checks prerequisites for you, asks you questions about the configuration of your cluster, prepares the system, and installs MapR software. In most cases, the Quick Install method is the preferred installation method.

Review the following table to verify that the Quick Install method is right for you:

Quick Install

This method is best suited for:

  • small to medium clusters
  • proof-of-concept testing
  • users who are new to MapR

Expert Installation Mode

You should only consider performing a manual (expert mode) installation if you:

  • have a very large or complex cluster
  • need granular control of which services run on each node
  • plan to write scripts that pass arguments to configure.sh directly
  • need to install from behind a firewall, or from machines that are not connected to the Internet

See Advanced Installation Topics for more information.

While the Quick Installation Guide provides a high-level view of the installation process, this document provides more detail to help you with your installation. Topics include:

Planning

This section explains how to prepare for the Quick Install process. Note that the installer performs a series of checks automatically (see Installation Process). In addition to these checks, make sure you meet the following requirements:

  • Your nodes either have internet access, or have access to a local package repository containing MapR packages.
  • All the nodes in your cluster can communicate with each other over the network. The installer uses port 22 for ssh. In addition, MapR software requires connectivity across other ports between the cluster nodes. For a list of all ports used by MapR, refer to Services and Ports Quick Reference.
  • Each node meets the requirements outlined in Preparing Each Node.
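To confirm network connectivity before launching the installer, a quick bash check like the following can probe the SSH port (and any other required ports) on each node. This is a sketch: the hostnames and port list are placeholders, not values from your cluster.

```shell
#!/usr/bin/env bash
# Probe TCP connectivity from this machine to each cluster node.
# NODES and PORTS are placeholders; substitute your own hosts and ports.

check_port() {
  local host=$1 port=$2
  # bash's /dev/tcp pseudo-device attempts a TCP connection
  if (echo > "/dev/tcp/${host}/${port}") 2>/dev/null; then
    echo "OK   ${host}:${port}"
  else
    echo "FAIL ${host}:${port}"
  fi
}

NODES="control-host-01 data-host-01 data-host-02"
PORTS="22"   # add other ports from the Services and Ports Quick Reference

for h in $NODES; do
  for p in $PORTS; do
    check_port "$h" "$p"
  done
done
```

Any FAIL line points at a firewall or routing problem worth fixing before the installer's own checks run.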

Understanding Node Types

The MapR installer categorizes nodes as control nodes, data nodes, control-as-data nodes (which combine the functions of control and data nodes), or client nodes. Clusters generally consist of one, three, or five control nodes and an arbitrary number of data or client nodes. 

The following table provides the function of each node type with some additional details:

control node
Manages the cluster and has cluster management services installed. To simplify the installation process, all control nodes have the same services installed on them. In Expert Mode, you can configure each node so these management services are split across nodes. See Advanced Installation Topics for more information.

data node
Processes data. Data nodes run YARN applications and MapReduce jobs, and store file and table data. They run the FileServer service along with NodeManager (on YARN nodes), TaskTracker (on MapReduce nodes), and the HBase Client (on MapR-DB and HBase nodes).

control-as-data node
Acts as both a control node and a data node, with both sets of services installed. Control-as-data nodes are appropriate only for small clusters. For a single-node cluster, designate the node as control-as-data so it has both control node and data node services installed.

client node
Provides access to the cluster so you can communicate with it via the command line or the MapR Control System, submit jobs, and retrieve data. A client node can be an edge node of the cluster, your laptop, or any Windows machine. You can install as many client nodes as you want on your cluster. When you specify a client node, you provide the hostname of the initial control node, which establishes communication with the cluster.

You can use the Quick Installer to install the MapR client and the MapR HBase client. The Quick Installer does not install the MapR POSIX client. See MapR POSIX Client for details.

Node Types and Associated Services

The following breakdown shows which services are assigned to each node type. The main services correspond to the core MapR packages, while the additional services are determined by the type of cluster you specify (MapReduce, MapR-DB, HBase, or a combination). See the Installation section of Installing MapR Software under Advanced Installation Topics for more information on these services.

control node
  YARN Main Services: ResourceManager; HistoryServer (on one control node)
  Core MapR Services: CLDB, ZooKeeper, FileServer, NFS, Webserver, Metrics
  Additional MapReduce Services: JobTracker
  Additional MapR-DB Services: HBase
  Additional HBase Services: HBase Master

data node
  YARN Main Services: NodeManager
  Core MapR Services: FileServer
  Additional MapReduce Services: TaskTracker
  Additional MapR-DB Services: HBase Client
  Additional HBase Services: HBase Client, HBase Region Server

control-as-data node
  YARN Main Services: ResourceManager, HistoryServer, NodeManager
  Core MapR Services: CLDB, ZooKeeper, FileServer, NFS, Webserver, Metrics
  Additional MapReduce Services: JobTracker, TaskTracker
  Additional MapR-DB Services: HBase Client
  Additional HBase Services: HBase Client, HBase Master, HBase Region Server

client node
  Main Services: MapR Client
  Additional MapR-DB Services: MapR Client
  Additional HBase Services: HBase Client

Cluster Planning Guidelines

To help you plan your cluster, here are some scenarios that illustrate how to allocate different types of nodes in a cluster. You can adjust these guidelines for your particular situation.

For a 5-node cluster without HA, you can configure one node as a control node (or choose node type control-as-data) and the remaining four nodes as data nodes. To provide high availability (HA) in a 5-node cluster, you need three control nodes, and all the nodes should be able to process data; in that scenario, choose three control-as-data nodes and two data nodes.

Total Nodes    Control    Control-as-Data    Data
in Cluster     Nodes      Nodes              Nodes
5 (non-HA)     1          0                  4
5 (HA)         0          3                  2
20             3          0                  17
20             0          3                  17

For a 20-node cluster, you still only need three control nodes to manage the cluster. If you need all nodes to process data, the control nodes can double as data nodes, which means you can choose either control or control-as-data for the node type. The remaining nodes can be dedicated data nodes, as shown.

Installation Tips

These tips help you successfully complete the installation process. To begin installation, run the install command and select one of these options:

  • new: starts a new installation
  • add: adds nodes to an existing installation
  • remove: uninstalls MapR packages from an existing installation so you can start a new installation

If you have an installation configuration file, you can supply the name of the file on the command line and skip the interview questions. For example:

# bash /opt/mapr-installer/bin/install --cfg config.example new

Installing a New Cluster

When you install nodes on a new cluster, select new to indicate that this cluster uses a new configuration. The installer then asks you if you have a configuration file. If you answer yes, the installer prompts you for the name of the configuration file. If you answer no, the installer proceeds to the next step, which is to enter the hostnames (or IP addresses) of all control nodes, separated by spaces or commas. Next, enter the hostnames (or IP addresses) of all data nodes. Make sure all nodes are up and running (ping <hostname>) and their hostnames are valid. 
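Before typing hostnames into the interview, it can save time to confirm that each one resolves and responds. The following is a sketch of such a check; the host list is a placeholder, and the ping step is optional.

```shell
#!/usr/bin/env bash
# Verify that each planned node hostname resolves before starting the
# installer. The host list below is a placeholder; substitute your nodes.

resolves() {
  getent hosts "$1" > /dev/null
}

for h in control-host-01 data-host-01 data-host-02; do
  if resolves "$h"; then
    # host resolves; optionally confirm it answers ping
    ping -c 1 -W 2 "$h" > /dev/null 2>&1 && echo "up:   $h" || echo "down: $h"
  else
    echo "unresolvable: $h"
  fi
done
```

An "unresolvable" result usually means a missing DNS record or /etc/hosts entry, which will also break the installer.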

During the interview process, you have an opportunity to change the username and the MapR user password for security purposes.

After you answer all the questions, the installer displays a summary and asks if you want to modify the settings. When you are satisfied with the settings, select (c)ontinue to begin the installation process.

If you want to save the configuration and resume the installation later, select (a)bort. The next time you run the installer, it displays the following message:

A cluster configuration file was found, do you wish to use this configuration? Choosing 'n' allows you to start a fresh installation and will overwrite the previous configuration. (y/n) [n]: y

To use the saved configuration file, enter y for yes.

Ensure that all user information matches across all nodes. Each username and password must match on every node, and must have the same UID. Each groupname must match on every node, and must have the same GID.

To install a client node, select c from the modify menu, then enter the client hostname or IP address.

Installation Process

This section explains what happens when you run the MapR installer. When you use the installer to interactively install and configure the nodes on your cluster, the installation script is launched and it performs these tasks for you:

  • Prepares the system :
    • Checks for necessary resources
    • Checks to see if another version of Hadoop is already installed (if so, you must uninstall this version before you run the installer).
    • Installs and configures OS packages
    • Installs Java
  • Installs MapR software
    • Configures the repositories
    • Installs the MapR packages
    • Configures MapR software

The installer displays status messages while it runs. It verifies system prerequisites, checks your system configuration, and then launches the interactive question-and-answer process. When you finish the process (and select continue), the installer displays messages about the tasks it is performing.

Installation Summary

During the installation process, the installer asks questions about your cluster configuration. When you finish answering all the questions, the installer displays a summary that includes the choices you selected as well as some other default settings. Here is a sample summary:

Current Information (Please verify if correct)      
==============================================
        Accessibility settings:
            Cluster Name: "my.cluster.com"
            MapR User Name: "mapr"
            MapR Group Name: "mapr"
            MapR User UID: "2000"
            MapR User GID: "2000"
            MapR User Password (Default: mapr): "****"
        Functional settings:
            WireLevel Security: "n"
            MapReduce Services: "n"
            YARN: "y"
            MapR-DB: "y"
            HBase: "n"
            Disks to use: "/dev/xvdf,/dev/xvdg"
            Client Nodes: ""
            Control Nodes: "control-host-01"
            Data Nodes: "data-host-01,data-host-02"
            Repository (will download core software from here):
            "http://package.mapr.com/releases"
            Ecosystem Repository (will download packages like Pig, Hive etc from here):
            "http://package.mapr.com/releases/ecosystem"
            MapR Version to Install: "4.0.1"
            Java Version to Install: "OpenJDK7"
            Allow Control Nodes to function as Data Nodes (Not recommended for large clusters): "n"
            Local Repository: "n"
        Metrics settings:
            Metrics DB Host and Port: ""
            Metrics DB User Name: ""
            Metrics DB User Password: ""
            Metrics DB Schema: ""

(c)ontinue with install, (m)odify options, or save current configuration and (a)bort? (c/m/a) [c]: 

This summary displays all the settings for the current node. Note that the installer does not ask you for values for every setting. Instead, it assigns default values to some settings, and then it allows you to change any setting.

At this stage, you can continue with the install, modify the settings, or save the current configuration and continue later.

Modifying Settings

You can modify any of the settings in the installation summary. If you enter m to modify settings, the installer displays the following menu:

Pick an option to modify           
========================
N] Cluster Name: "my.cluster.com"
u] MapR User Name: "mapr"
g] MapR Group Name: "mapr"
U] MapR User UID: "2000"
G] MapR User GID: "2000"
p] MapR User Password: "****"
S] WireLevel Security: "n"
d] Disk Settings: "/dev/xvdf,/dev/xvdg"
sw] Disk Stripe Width: ""
F] Force Format Disks: "n"
c] Client Nodes: ""
C] Control Nodes: "control-host-01"
D] Data Nodes: "data-host-01,data-host-02"
b] Control Nodes to function as Data Nodes: "n"
v] Version: "4.0.1"
L] Local Repository: "n"
mr] MapReduce1: "n"
db] MapR-DB: "y"
hb] HBase: "n"
y] YARN: "y"
uc] Core Repo URL: "http://package.mapr.com/releases"
ue] Ecosystem Repo URL: "http://package.mapr.com/releases/ecosystem"
dbh] Metrics DB Host and Port: ""
dbu] Metrics DB User: ""
dbp] Metrics DB Password: ""
dbs] Metrics DB Schema: ""
cont] Continue

The following list describes each setting and how to modify it:

Cluster Name: The installer assigns a default name, my.cluster.com, to your cluster. To assign a different name, enter N followed by the new cluster name. If your environment includes multiple clusters, give each cluster a unique name. The cluster name cannot contain spaces.

MapR User Name: The default MapR user name is mapr. To change it, enter u followed by the new user name. For more information, see Common Users in Advanced Installation Topics.

MapR User Group Name: The default MapR user group name is mapr. To change it, enter g followed by the new group name.

MapR User ID: The default MapR user ID is 2000. To change it, enter U followed by the new user ID.

MapR User Group ID: The default MapR user group ID is 2000 (the same as the MapR user ID). To change it, enter G followed by the new group ID.

MapR User Password: The default MapR user password is mapr. For security, change this password and share it only with users who are authorized to access the cluster. To change it, enter p followed by the new password. The password itself is not displayed; each character is replaced by an asterisk (*).

Security Settings: Basic security (authentication and authorization) is implemented automatically on every MapR cluster. An additional layer of security (data encryption, known as wire-level security) is available but disabled by default. To enable wire-level security, enter S and change the setting to yes.

Disks to Use: You must specify which disks to use for the MapR file system on each node. The installer automatically runs the disksetup script to format these disks. To change the list of disks before you continue with the installation, enter d followed by the full path of each disk, separated by commas, spaces, or a combination of both.

Disk Stripe Width: To configure the number of disks in a storage pool (known as the stripe width), enter sw followed by the number of disks you want in each storage pool. The default stripe width is three.

Force Formatting Disks: Disks that contain previously installed MapR software must be reformatted. Enter F and change the setting to y for yes.

Client Nodes: By default, the quick installer does not install client nodes. To install client nodes in your cluster, select c from the modify menu, then provide the IP address or hostname of each client. The quick installer supports only Linux-based clients running CentOS, RedHat, or Ubuntu, and does not install the MapR POSIX client.

Control Nodes: To assign the role of control node to different hosts, enter C followed by the IP addresses or hostnames of the new control nodes.

Data Nodes: To assign the role of data node to different hosts, enter D followed by the IP addresses or hostnames of the new data nodes.

Control Nodes to Function as Data Nodes: To change the functionality of control nodes so they also function as data nodes, select b from the modify menu, then enter y for yes.

MapR Software Version: The installer installs the latest available version of MapR software. To install a different version, enter v followed by the version number.

MapReduce1 Setting: By default, all nodes on a cluster are configured to run YARN services rather than MapReduce1 (MapReduce for Hadoop 1) services. To run MapReduce1 on your data nodes (instead of YARN, or in addition to YARN), enter mr and change the setting to y.

MapR-DB Setting: The default setting for MapR-DB is yes, which assumes that you have an Enterprise Database Edition license and that you are using MapR-DB tables instead of HBase tables. To change this setting, enter db followed by n.

HBase Setting: When the MapR-DB setting is yes (the default), the HBase setting is automatically set to no. If you are using HBase tables instead of MapR-DB tables, enter hb followed by y.

YARN Setting: By default, all nodes on a cluster are configured to run YARN services rather than MapReduce1 services. To run MapReduce1 on your data nodes instead of YARN, enter y and change the setting to n.

MapR Core Repo URL: By default, the MapR core repository is located at http://package.mapr.com/releases. To get the core repository from another URL, enter uc followed by the new URL.

MapR Ecosystem Repo URL: By default, the MapR ecosystem repository is located at http://package.mapr.com/releases/ecosystem. To get the ecosystem repository from another URL, enter ue followed by the new URL.

MapR Database Schema Information: To specify the MySQL database parameters for the MapR Metrics database, enter one of the following options from the modify menu:

  • dbh: the Metrics DB host and port (for example, data-host-01:3306)
  • dbu: the Metrics DB user name
  • dbp: the Metrics DB password (to authenticate the user)
  • dbs: the Metrics DB schema

See Setting up the MapR Metrics Database for more information.
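When deciding what to enter for the Disks to Use setting, it can help to enumerate block devices that are not already backing a mount. The following is a heuristic sketch only; always verify every device by hand before letting disksetup format it.

```shell
#!/usr/bin/env bash
# Print block devices that are not loop/ram/cdrom devices and do not
# currently back a mount. Heuristic only: verify before formatting.

candidate_disks() {
  # reads device names (one per line) on stdin and prints candidates
  grep -Ev '^(loop|ram|sr)' | while read -r dev; do
    if ! grep -q "^/dev/${dev}[0-9]* " /proc/mounts; then
      echo "/dev/${dev}"
    fi
  done
}

ls /sys/block | candidate_disks
```

Anything this prints is only a candidate: a device can be unmounted yet still hold data you need.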

Successful Installation

A successful installation takes approximately 10-30 minutes, depending on how long it takes to reach a quorum of ZooKeeper services. This section shows the messages that appear when control nodes are installed successfully. Data node installation starts immediately after control nodes are installed.

Now running on Control Nodes: [<ip_address>]
* 20:49:06 Interrogating Node(s), Validating Prerequisites, and
Starting Install
WARNING: MapR recommends at least 10 GB of available space on
the root partition.
The node has only 6 GB, but you may still proceed with the
installation.
WARNING: MapR recommends more than 10% of node's physical memory
available for swap space
* 20:49:08 Installing Extra Package Repositories If Needed
* 20:49:09 Installing Extra Package Repositories If Needed for
CentOS/RedHat
* 20:49:18 Detecting Operating System
* 20:49:19 Installing Prerequisite Packages for CentOS/RedHat
* 20:49:51 Detecting Operating System
* 20:49:53 Configuring Firewall for CentOS/RedHat
* 20:49:55 Creating MapR User
* 20:50:03 Installing and Configuring NTP Service
* 20:50:08 Installing OpenJDK Packages If Needed
* 20:50:12 Detecting Operating System
* 20:50:14 Initializing MapR Repository for CentOS/RedHat
* 20:50:59 Installing MapR Packages
* 20:53:16 Disabling MapR Services Until Configured
* 20:53:21 Configuring MapR Services
* 20:53:32 Configuring Disks for MapR File System
* 20:53:45 Starting MapR Services
* 20:53:55 Finalizing MapR Cluster Configuration
* 20:59:46 Configuring MapR Ecosystem
* 20:59:47 Configuring Hive
* 20:59:48 Configuring Spark
MapR Installation
Successful on Control Nodes. Please login via the web console at
https://10.229.12.109:8443 or manage the cluster using 'maprcli' or 'hadoop'
commands

Once control nodes have installed successfully, the quick installer immediately starts to install data nodes. In the meantime, you can access the cluster through the MapR Control System (MCS) via the URL shown in the message.

For a cluster that is configured with two data nodes, the following message appears and indicates the continuation of the installation process:

Now running on Data Nodes: [<ip_address1>,<ip_address2>]
* 20:59:50 Interrogating Node(s), Validating Prerequisites, and
Starting Install
WARNING: MapR recommends at least 10 GB of available space on
the root partition.
The node has only 6 GB, but you may still proceed with the
installation.
WARNING: MapR recommends more than 10% of node's physical memory
available for swap space
WARNING: MapR recommends at least 10 GB of available space on
the root partition.
The node has only 6 GB, but you may still proceed with the
installation.
WARNING: MapR recommends more than 10% of node's physical memory
available for swap space
* 20:59:52 Installing Extra Package Repositories If Needed
* 20:59:53 Installing Extra Package Repositories If Needed for
CentOS/RedHat
* 21:00:12 Detecting Operating System
* 21:00:13 Installing Prerequisite Packages for CentOS/RedHat
* 21:01:55 Detecting Operating System
* 21:01:57 Configuring Firewall for CentOS/RedHat
* 21:01:59 Creating MapR User
* 21:02:08 Installing and Configuring NTP Service
* 21:02:17 Installing OpenJDK Packages If Needed
* 21:03:04 Detecting Operating System
* 21:03:06 Initializing MapR Repository for CentOS/RedHat
* 21:03:50 Installing MapR Packages
* 21:06:00 Disabling MapR Services Until Configured
* 21:06:04 Configuring MapR Services
* 21:06:15 Configuring Disks for MapR File System
* 21:06:28 Starting MapR Services
* 21:06:42 Finalizing MapR Cluster Configuration
* 21:06:57 Configuring MapR Ecosystem
* 21:06:59 Configuring Hive
* 21:07:00 Configuring Spark

Bringing Up the Cluster

When you finish the installation process, the resulting cluster will have a Community Edition license without NFS. You can see the state of your cluster by logging in to the MapR Control System (MCS).

To get your cluster up and running, follow these steps:

  1. Register the cluster to obtain a full Community Edition license.
  2. Apply the license.
  3. Restart the NFS service.

Registering the Cluster

You can register your cluster through the MapR Control System (MCS). Select Manage Licenses from the navigation pane and follow the instructions.

When the License Management dialog box opens, select Add licenses via Web. The next dialog box provides a link to www.mapr.com, where you can register your cluster.

Applying the License

After you register your cluster, click Apply Licenses in the License Management dialog box. For best results, use an Enterprise Edition license (available as a trial license), which entitles you to run NFS on any node where it is installed. A Community Edition license limits NFS to one node, which means only one control node or one control-as-data node can run it.

Restarting NFS

The last step in bringing up the cluster is to restart NFS. Although the installer loads the NFS service on all control and control-as-data nodes, NFS requires a license in order to run (which you applied in the previous step). You can restart the NFS service from the MCS. See Manage Node Services for more information.

Once NFS is running, the cluster appears at the mount point /mapr in the Linux file system for all control and control-as-data nodes.
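To confirm the mount from a shell, you can scan the output of mount for a /mapr entry. The helper below reads mount-style lines on stdin; the sample line is illustrative, not actual output from your cluster.

```shell
#!/usr/bin/env bash
# Report whether a /mapr mount appears in "mount"-style output.
# Reads the mount lines on stdin so the check is easy to test.

has_mapr_mount() {
  grep -q ' /mapr '
}

# On a cluster node you would pipe the real mount table:
#   mount | has_mapr_mount && echo "cluster mounted at /mapr"
# Illustrative sample line for a node where NFS is running:
echo "localhost:/mapr on /mapr type nfs (rw,nolock)" | has_mapr_mount \
  && echo "cluster mounted at /mapr"
```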
