
Data Analysis with Apache Drill


About this Course

This introductory Apache Drill course covers how to use Drill to explore known or unknown data without writing code. You will write SQL queries on a variety of data types, such as Parquet and JSON. The course also describes how Drill receives and executes a query, which services are involved at each step, and how Drill optimizes a query for distributed SQL execution.

Duration: 2 days

What's Covered

1: Introduction to Apache Drill
  • Describe Apache Drill
  • Explore Key Features of Apache Drill
  • Review Data Types and Formats
Lab Activities
    • Explore the Drill SQL Interfaces
    • Perform Drill SQL Queries
2: SQL Queries with Apache Drill
  • Plan Data Analysis Objectives
  • Perform SQL Queries on Structured Data
  • Perform SQL Queries on a Variety of Data Types
  • Combine Data Types in SQL Queries
Lab Activities
    • Describe Schemas
    • Query Marketing Data
    • Query JSON Data
    • Combine Data Types
3: Apache Drill Operations and Functions
  • Create and Drop Tables and Views
  • Use Nested Data and Window Functions
  • Extend Drill with Custom Functions
  • Perform an End-to-End Drill Data Analysis
Lab Activities
    • Create and Drop Tables and Views
    • Perform Nested Data Functions
    • Perform Aggregate Window Functions
    • Perform an End-to-End Data Analysis
4: Apache Drill Architecture
  • Describe Drill Execution Process
  • Sketch Drill Architecture Components
Lab Activities
    • Order Query Process Steps
    • Sketch Drillbit Architecture
5: Query Plans and Optimization
  • Describe a Physical Query Plan
  • Examine a Physical Query Plan
  • Optimize Queries
Lab Activities
    • Examine Physical Query Plans
    • Create a Partitioned Table
6: Apache Drill Performance and Debugging
  • Analyze Drill Error Messages
  • Configure Log File Settings
  • Troubleshoot Apache Drill
Lab Activities
    • Examine Drill Log Files