Launching Clusters and Jobs

You are now ready to launch the cluster you created and use it to run specific jobs.

Launching a Cluster

You can launch a MapR cluster from the Cluster Templates page where you created it, or you can go to Project > Data Processing > Clusters > Launch Cluster.

Enter a cluster name, select a cluster template, base image, and keypair, and then click Create.
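You can also create the cluster from the command line. The following is a minimal sketch, assuming the python-saharaclient CLI (the same sahara command used for data sources later on this page); the file name cluster.json and every value inside it are illustrative, not required names:

# Write the cluster definition (all values are examples; look up real IDs
# with "sahara cluster-template-list" and "sahara image-list")
cat > cluster.json <<'EOF'
{
    "name": "my-mapr-cluster",
    "plugin_name": "mapr",
    "hadoop_version": "5.0.0.mrv2",
    "cluster_template_id": "<cluster-template-id>",
    "default_image_id": "<image-id>",
    "user_keypair_id": "<keypair-name>"
}
EOF

# Create the cluster from the definition file
sahara cluster-create --json cluster.json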


Wait until the cluster status becomes Active.
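The status can also be watched from the command line; a small sketch, assuming the same CLI and an illustrative cluster name:

# List all clusters and their current status
sahara cluster-list

# Show details for a single cluster (the name is an example)
sahara cluster-show --name my-mapr-cluster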

Launching a Job 

To launch a job, you must define: 

  • Data sources: located in Swift storage or MapR-FS
  • Job binary files, such as jar files
  • Job type (MapReduce, Pig, Hive, and so on) and libraries 


From this point, you can use the UI in the same way that you would use it for any Sahara plugin. For more details, see the OpenStack Sahara documentation.

Adding Data Sources 

Go to Project > Data Processing > Data Sources > Create Data Source. Specify a data source name and type: 

  • Swift: specify a URL that points to the object (for example: demo/sahara/<object>). Also enter a username and password. 
  • MapR-FS: MapR-FS data sources must be added manually, using the Sahara CLI. 

For example:

sahara data-source-create --name <data-source-name> --type maprfs --url maprfs:<maprfs-path>
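For instance, registering a MapR-FS input directory might look like this (the data source name and path are only examples):

# The URL follows the maprfs:<maprfs-path> form shown above
sahara data-source-create --name input-data --type maprfs --url maprfs:/user/mapr/input

# Verify that the data source was registered
sahara data-source-list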

Creating Job Binaries 

Go to Project > Data Processing > Job Binaries > Create Job Binary. 

Specify a name for the job binary and select the storage type: 

  • Data Processing internal database

  • Swift 

You can define the job binary by: 

  • Choosing an existing file

  • Uploading a new file

  • Creating a script to be uploaded dynamically 
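These steps can also be scripted. The sketch below again assumes the python-saharaclient CLI; the file name wordcount.jar and the binary name are illustrative:

# Upload a local jar into the Data Processing internal database;
# the command prints an ID for the stored data
sahara job-binary-data-create --file wordcount.jar

# Register a job binary that points at the uploaded data
# (substitute the ID printed by the previous command)
sahara job-binary-create --name wordcount-jar --url internal-db://<job-binary-internal-id>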

Creating a Job 

Go to Project > Data Processing > Jobs > Create Job. Enter a job name and select the job type: Pig, Hive, MapReduce, and so on.

Go to the Libs tab, select the job binary, and click Choose. Click Create to create the job. 
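To automate this step, the job can also be created against the Sahara REST API v1.1. In the sketch below, $SAHARA_URL, $PROJECT_ID, $TOKEN, the job name, and the job binary ID are all placeholders you must supply; note that a MapReduce job carries its jar in libs and leaves mains empty:

# Create a MapReduce job that references an existing job binary
curl -X POST "$SAHARA_URL/v1.1/$PROJECT_ID/jobs" \
  -H "X-Auth-Token: $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "wordcount", "type": "MapReduce", "mains": [], "libs": ["<job-binary-id>"]}'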

Launching a Job on a MapR Cluster 

Go to Project > Data Processing > Jobs. Select the job from the list, then select an action. 

On the Job tab, select the cluster that will run the job. 

On the Configure tab, specify any required parameters and click Launch to start executing the job. 

When the job is complete, its status will change to SUCCEEDED.
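Launching and monitoring can be scripted against the same REST API. The sketch below uses placeholder IDs; for a MapReduce job, required parameters such as the mapper and reducer classes go into job_configs.configs:

# Launch the job on a cluster
curl -X POST "$SAHARA_URL/v1.1/$PROJECT_ID/jobs/<job-id>/execute" \
  -H "X-Auth-Token: $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"cluster_id": "<cluster-id>",
       "input_id": "<input-data-source-id>",
       "output_id": "<output-data-source-id>",
       "job_configs": {"configs": {}, "args": [], "params": {}}}'

# Poll the job execution returned above until info.status is SUCCEEDED
curl "$SAHARA_URL/v1.1/$PROJECT_ID/job-executions/<job-execution-id>" \
  -H "X-Auth-Token: $TOKEN"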