Advanced Apache Spark (Spark v2.1)

About this Course

This course teaches you how to build data pipeline applications using Spark Streaming, Spark SQL, GraphFrames, and MLlib. You’ll learn about Spark Streaming architecture, data pipeline use cases, DStreams, and property graph operations.

This is the third and final course in the Apache Spark v2.1 Series.

What’s Covered in the Course

6: Create an Apache Spark Streaming Application
  • Describe Spark Streaming Architecture
  • Create a Spark Structured Streaming Application
  • Apply Operations on Streaming DataFrames
  • Define Windowed Operations
  • Describe How Streaming Applications are Fault Tolerant
Lab Activities
    • Load and Inspect Data Using the Spark Shell
    • Use Spark Streaming with the Spark Shell
    • Build and Run a Streaming Application with SQL
    • Build and Run a Streaming Application with Windows and SQL
7: Use Apache Spark GraphFrames
  • Describe GraphFrames
  • Define Regular, Directed, and Property Graphs
  • Create a Property Graph
  • Perform Operations on Graphs
Lab Activities
    • Analyze Data with GraphFrames
8: Use Apache Spark MLlib
  • Describe Apache Spark MLlib Machine Learning Algorithms
  • Use Collaborative Filtering to Predict User Choice
Lab Activities
    • Load and Inspect Data Using the Spark Shell
    • Use Spark to Make Movie Recommendations
    • Analyze a Simple Flight Example with Decision Trees