Data Analysis with Apache Drill


About this Course

This introductory Apache Drill course covers how to use Drill to explore structured or unstructured, known or unknown data, without writing code. You will explore and run SQL queries on a variety of data formats, including Parquet, JSON, and CSV files. You will also join data from multiple data sources without first transforming it. The course then describes how Drill receives and executes a query. You will learn which services are involved at each step, how Drill optimizes a query for distributed SQL execution, and how to troubleshoot and tune Drill queries.

Duration: 2 days

What's Covered

Welcome to Class
  • Course Introduction
  • Prepare Your Lab Environment
  • Configure WebEx Training Center
Lab Activities
    • Prepare Your Lab Environment
    • Configure the Training Environment
1: Interface with Apache Drill
  • Introduction to Apache Drill
  • Explore Data That Can Be Used with Drill
  • Interface with Apache Drill
Lab Activities
    • License and Verify Your Cluster
    • Interface with Apache Drill
2: SQL Analytics with Apache Drill
  • Query Structured Table Data
  • Query Dynamic and Complex Data
  • Query Data Files
  • Perform Complex Operations
Lab Activities
    • Query Structured Table Data
    • Query Complex Data
    • Query Data Files
    • Combine Multiple Data Types and Sources
3: Explore and Visualize Data with Apache Drill
  • Tables, Views, and Temporary Tables
  • Explore Unknown Data
  • Explore and Visualize Data with BI Tools
Lab Activities
    • Create and Drop Tables, Temporary Tables, and Views
    • Discover Unknown Data
    • Visualize Data with Arcadia Data
4: Advanced Apache Drill Operations
  • Define and Query Data with Secondary Indexes
  • Advanced Query Operations
  • Extend Drill with Custom Functions
Lab Activities
    • Query Secondary Indexes
    • Perform Advanced Queries
5: Monitor and Tune Drill Performance
  • Drill Architecture and Query Execution Process
  • Monitor Drill Activity and Resources
  • Performance Tuning
  • Use Drill with Secured Data
Lab Activities
    • Optimize Queries
    • Query Data on a Secure Cluster
6: Troubleshoot and Debug Queries
  • Examine Drill Error Messages
  • Configure Log File Settings
  • Troubleshoot Apache Drill
Lab Activities
    • Debug a Failed Query