In this tutorial, we'll use Pig to run a MapReduce job that counts the words in the file /in/constitution.txt in the mapr user's directory on the cluster, and store the results in the file
- First, download the file: select Tools > Attachments on this Confluence page (see the top-right corner of the page), then right-click constitution.txt to save it.
- Load the file onto the cluster and place it in the directory
Perform the following steps:
- In the terminal, type the command pig to start the Pig shell.
- At the grunt> prompt, type the following lines (press ENTER after each):
After you type the last line, Pig starts a MapReduce job to count the words in the file.
- When the MapReduce job is complete, type quit to exit the Pig shell, then look at the contents of the directory /myvolume/wordcount to see the results.
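
The lines to type at the grunt> prompt are not reproduced in the steps above. As a rough sketch only, a word-count script in Pig Latin could look like the following. The input path /myvolume/in/constitution.txt is a hypothetical placeholder (substitute the location where you placed the file), the output path is taken from the final step, and the relation names A through D and the field names are arbitrary choices for illustration.

```pig
-- Load each line of the input file as a single chararray field.
-- NOTE: the input path is an assumed placeholder; use your own location.
A = LOAD '/myvolume/in/constitution.txt' USING TextLoader() AS (line:chararray);

-- Split each line into words and flatten to one word per record.
B = FOREACH A GENERATE FLATTEN(TOKENIZE(line)) AS word;

-- Group identical words together.
C = GROUP B BY word;

-- Emit each word with the number of records in its group.
D = FOREACH C GENERATE group AS word, COUNT(B) AS total;

-- STORE is the statement that triggers the MapReduce job.
-- The output path matches the directory named in the last step.
STORE D INTO '/myvolume/wordcount';
```

In a script of this shape, nothing runs until the STORE statement is entered, because Pig builds the execution plan lazily. That matches the behavior described above, where the job starts after the last line. The output location is a directory containing one or more part files rather than a single file, and the job typically fails if that directory already exists.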