Launch Jobs and Advanced Hadoop MapReduce


About this Course

This course teaches how to work with sequence files, the distributed cache, and Apache HBase. It covers implementing programmatic job control in the driver, chaining MapReduce jobs, and using Oozie to manage MapReduce workflows. Finally, students learn how to configure MapReduce streaming parameters and define the programming contract for mappers and reducers. This is the third course in the MapReduce Series from MapR.

What’s Covered in the Course

7: Working with Data
  • Working with Sequence Files
  • Working with the Distributed Cache
  • Working with HBase
Lab Activities
    • Run a MapReduce Program Using HBase as Source
8: Launching Jobs
  • Implement Programmatic Job Control in the Driver
  • Use MapReduce Chaining
  • Use Oozie to Manage MapReduce Workflows
Lab Activities
    • Write a MapReduce Driver to Launch Two Jobs
9: Using Non-Java Programs (Streaming MapReduce)
  • Overview of the MapReduce Streaming Paradigm
  • Configure MapReduce Streaming Parameters
  • Define the Programming Contract for Mappers and Reducers
  • Monitor and Debug MapReduce Streaming Jobs
Lab Activities
    • Implement a MapReduce Streaming Application
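
As a rough illustration of the streaming programming contract named in Lesson 9, here is a minimal word-count sketch. It is not taken from the course materials. It assumes the usual Hadoop Streaming conventions: mappers and reducers exchange text lines, each line is a key and a value separated by a tab, and the reducer receives its input sorted by key. The function names map_lines and reduce_lines are hypothetical.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper side: for each input line, emit one 'word<TAB>1' line per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(sorted_lines):
    """Reducer side: input is sorted by key, one 'key<TAB>value' per line.

    Consecutive lines sharing a key are grouped and their counts summed.
    Assumes every line is well formed (contains a tab).
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


# Local simulation of the shuffle: sort the mapper output before reducing.
mapped = sorted(map_lines(["to be or not", "to be"]))
for out in reduce_lines(mapped):
    print(out)
```

In an actual streaming job, each function would typically live in its own script that reads from standard input and prints to standard output, with the framework performing the sort between the two stages.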