Apache Hadoop Essentials


About this Course

This course introduces you to the basics of Apache Hadoop. It begins with a brief introduction to the Hadoop Distributed File System and MapReduce, then covers several open-source ecosystem tools, such as Apache Spark, Apache Drill, and Apache Flume. Finally, these tools are applied to real-world use cases. The course is ideal for business managers, students, developers, administrators, analysts, or anyone interested in learning the fundamentals of transitioning from traditional data models to big data models.

What’s Covered in the Course

3: Core Elements of Apache Hadoop
  • Local and Distributed File Systems
  • Data Management in the Hadoop File System
  • Review of the MapReduce Algorithm
4: The Apache Hadoop Ecosystem
  • Overview of the Apache Ecosystem
  • Administration: ZooKeeper, YARN
  • Ingestion: Flume, Oozie, Sqoop
  • Processing: Spark, HBase, Pig
  • Analysis: Hive, Drill, Mahout
5: Solving Big Data Problems with Apache Hadoop
  • Data Warehouse Optimization
  • Recommendation Engine
  • Large-Scale Log Analysis