About this Course
DEV 361 is the second course in the Apache Spark series. You will learn to create and modify pair RDDs, perform aggregations, and control the layout of pair RDDs across nodes with data partitioning.
This course also covers Spark SQL and DataFrames, the programming abstraction of Spark SQL. You will learn different ways to load data into DataFrames; operate on them using DataFrame functions, actions, and language-integrated queries; and create and use user-defined functions with DataFrames.
Finally, the course describes the components of the Spark execution model and shows how to use the Spark Web UI to monitor Spark applications. The concepts are taught using scenarios in Scala that also form the basis of the hands-on labs. Lab solutions are provided in Scala and Python.
Prerequisites
- Completion of ESS 100 and ESS 101
- Basic Hadoop knowledge and intermediate Linux knowledge
- Experience using a text editor such as vi
- Terminal program installed; familiarity with command-line options such as
- Knowledge of functional programming with Scala or Python, and experience with SQL
This course is part of the preparation for the MapR Certified Spark Developer (MCSD) certification exam.
Lesson 4 - Work with Pair RDD
- Describe and Create Pair RDD
- Lab 4.1: Load and Explore Data
- Apply Transformations and Actions to Pair RDD
- Lab 4.2: Join Pair RDD
- Control Partitioning Across Nodes
- Lab 4.3: Explore Partitioning
Lesson 5 - Work with Spark DataFrames
- Create Apache Spark DataFrames
- Lab 5.1: Create DataFrames Using Reflection
- Explore Data in DataFrames
- Lab 5.2: Explore Data in DataFrames
- Create User Defined Functions
- Lab 5.3: Create and Use User Defined Functions
- Repartition DataFrames
- Lab 5.4: Build a Standalone Application
Lesson 6 - Monitor a Spark Application
- Describe the Components of the Spark Execution Model
- Use Spark Web UI to Monitor Spark Applications
- Debug and Tune Spark Applications
- Lab 6.1: Use the Spark UI
- Lab 6.2: Find Spark System Properties