Big Data and the Hadoop Ecosystem

Duration: 3 Days

Course Overview

This 3-day hands-on course is suitable for anyone who wants to understand the concepts and technologies involved in exploiting big data with the Hadoop Ecosystem. Attendees will learn how to set up and write applications for Hadoop, Pig, Hive and Impala.

How can I attend my course?

• On-line from your chosen location
• At our dedicated training facility
• On-site at your premises

Course Content

Big Data

• What is big data?
• Technical challenges
• Structured, semi-structured and unstructured data
• Big data storage
• NoSQL

Hadoop

• What is Hadoop?
• The Hadoop Ecosystem
• Hadoop versus relational databases
• Mapping and reducing
• Writing MapReduce scripts
• Combining and partitioning
• Hadoop streaming
• Installing and configuring Hadoop

Pig

• What is Pig?
• Preprocessing data
• Using the Pig shell (Grunt)
• Loading data and schemas
• Generating relations
• Displaying and storing results
• Designing Pig scripts

Hive

• What is Hive?
• Creating the data warehouse
• Mapping structure onto stored data
• Hive Query Language (HiveQL)

Impala

• What is Impala?
• Impala architecture
• Using the Impala shell
• Impala SQL

Case Study

Develop a MapReduce application from scratch using one or more of the tools covered in the course.

You will receive a full set of course notes and all supporting materials for your course, either as a hard copy delivered to your premises or as a download to a device of your choice.