DA 440 - Query and Store Data with Apache Hive

About this Course

This course begins with a review of SQL-on-Hadoop tools and shows where Apache Hive fits in the Hadoop ecosystem. It then covers how to create, load, query, and manipulate tables in Hive, so that you can query structured data with the Hive Query Language (HiveQL) instead of writing MapReduce code.
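
To make "without writing MapReduce code" concrete, the following is a minimal Python sketch of the map, shuffle, and reduce steps that a simple per-group count would otherwise require, with the equivalent single HiveQL statement shown in a comment. The sample rows, the function names, and the employees table with its dept column are hypothetical and do not come from the course materials.

```python
from collections import defaultdict

# Hypothetical sample rows (name, dept); illustration only.
ROWS = [
    ("alice", "sales"),
    ("bob", "engineering"),
    ("carol", "sales"),
    ("dave", "engineering"),
    ("erin", "support"),
]


def map_phase(rows):
    """Emit one (dept, 1) pair per input row."""
    for _name, dept in rows:
        yield dept, 1


def shuffle_phase(pairs):
    """Group the emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_dept(rows):
    """Run the three phases in order and return a dept -> row count mapping."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


# The same result expressed declaratively in HiveQL, assuming a table named
# employees with a dept column (both hypothetical):
#   SELECT dept, COUNT(*) FROM employees GROUP BY dept;

if __name__ == "__main__":
    print(count_by_dept(ROWS))
```

The sketch runs in a single process and only mirrors the shape of a MapReduce job. With Hive, the query in the comment is all the analyst writes, and Hive plans and runs the distributed work.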

Taken together with DA 450 - Transform Data with Apache Pig, this course shows you how to use Pig and Hive as part of a single data flow in a Hadoop cluster.

What’s Covered

1: Hive in the Hadoop Ecosystem
  Course Lessons:
  • Hive Use Cases
  • Steps in the Data Pipeline
  • Hive in the Hadoop Ecosystem
  • Data Types Used With Hive
  Lab Activities:
  • Connect to the Hive CLI
  • Cast Data

2: Create and Load Data
  Course Lessons:
  • Create Databases and Internal Tables
  • Create External Tables and Partitioned Tables
  • Load Data Into Tables and Databases
  • Alter and Drop Tables
  Lab Activities:
  • Create a Database
  • Create a Simple Table
  • Create Partitioned and External Tables
  • Load Data Into Tables
  • Examine Databases and Tables

3: Query and Manipulate Data
  Course Lessons:
  • Query, Sort, and Filter Data
  • Manipulate Data With User-defined Functions
  • Combine and Store Tables
  Lab Activities:
  • Query Data With SELECT
  • Query Data With UDFs
  • Combine and Store Data

Get Certified

This course is part of the preparation needed for the MapR Certified Data Analyst (MCDA) certification exam.

Prerequisites

  • Completion of the MapR Academy on-demand courses ESS 100 through ESS 102
  • Basic Hadoop knowledge
  • Terminal program installed; familiarity with command-line navigation