Self-Paced Big Data Hadoop Administrator Training

Overview

Become an expert Hadoop Administrator by getting hands-on with Hadoop clusters, including planning and deployment and monitoring the Hadoop Distributed File System (HDFS). The course also takes a hands-on approach to the Hadoop ecosystem: YARN, MapReduce, HDFS, Cloudera Manager, and Hadoop clusters with Hive, HBase, Pig, Flume, and RDBMS integration using Sqoop.

Become a Hadoop Administrator by mastering Hadoop clusters! Cognixia’s Big Data Hadoop Administrator course is specifically designed to provide hands-on experience installing, configuring, and managing the Apache Hadoop platform.

Curriculum

By the end of the module, the student will understand the basics of big data and will have a foundation in Hadoop daemons and Hadoop architecture.

  • Understanding Big Data Basics
  • Big Data Use Cases
  • Introduction to Hadoop
  • Understanding Hadoop Ecosystem
  • Introduction to HDFS
  • Introduction to Namenode
  • Introduction to Datanode
  • Introduction to Secondary Namenode
  • Introduction to MapReduce
  • Introduction to JobTracker
  • Introduction to TaskTracker
  • Summarizing Hadoop Architecture
  • Roles and Responsibilities of a Hadoop Administrator

By the end of the module, the student will be able to create a multi-node Hadoop cluster. To prepare students for this, the module gives a deep understanding of how Linux works, how to set up virtual machines, and how to set up password-less SSH.

  • Linux internals
    i. Commands that are required
    ii. Linux basics
  • Hadoop Cluster Installation Pre-requisites
    i. Software Downloads
    ii. Preparing your VM
    iii. Enabling VM with VMware
    iv. Understanding mandatory changes in the operating system
  • Installation and Configuration
    i. Understanding Hadoop cluster installation modes
    ii. Understanding Hadoop Version 1 installation and configuration
    iii. Password-less SSH setup
  • Hands-On Practice for creating a Hadoop cluster
    i. Helping individuals practice Hadoop cluster installation

By the end of the module, the student will understand how to plan a production Hadoop cluster, including the hardware and software requirements of a Hadoop cluster, performance tuning after cluster creation, and benchmarking.

By the end of the module, the student will be able to administer a Hadoop cluster. Students will understand how to copy data from one Hadoop cluster to another, how to use different Hadoop schedulers to run jobs, how to back up and recover metadata, data, configurations, and application data, and how to recover cluster data.
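
Copying data between clusters is commonly done with the `hadoop distcp` tool. The following is a minimal sketch, not course material: it only assembles the command line so it can be reviewed before running. The host names, the port 8020, the paths, and the helper name `build_distcp_command` are all hypothetical and would need to match your own clusters.

```python
import shlex


def build_distcp_command(src_namenode, dst_namenode, src_path, dst_path,
                         update=False, port=8020):
    """Assemble an argv list for `hadoop distcp` between two clusters.

    Host names, port, and paths are illustrative assumptions; the
    function only builds the command and does not execute it.
    """
    cmd = ["hadoop", "distcp"]
    if update:
        # -update copies only files that are missing or differ at the target
        cmd.append("-update")
    cmd.append(f"hdfs://{src_namenode}:{port}{src_path}")
    cmd.append(f"hdfs://{dst_namenode}:{port}{dst_path}")
    return cmd


if __name__ == "__main__":
    argv = build_distcp_command("nn1.example.com", "nn2.example.com",
                                "/data/logs", "/backup/logs", update=True)
    # Print a shell-safe version of the command for review
    print(shlex.join(argv))
```

Returning a list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting issues.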

By the end of the module, the student will understand how Hadoop Version 2 and YARN work. The module covers the new features of Hadoop Version 2 and the YARN framework, and how to deploy a Hadoop 2 cluster in pseudo-distributed and multi-node distributed modes.

  • Hadoop 2.0 new features
  • YARN
    i. Understanding Resource Manager
    ii. Understanding Application Master
    iii. Understanding Node Manager
    iv. Understanding Hadoop 2 Job Execution Framework
  • Hadoop 2 Multi-node cluster creation
    i. Pre-requisites of Hadoop Installation
    ii. Software Downloads
    iii. Preparing your VM
    iv. Enabling VM with VMware
    v. Understanding mandatory changes in the operating system
    vi. Installation and Configuration
    vii. Understanding Hadoop version 2 installation and configuration
    viii. Password-less SSH setup

By the end of the module, the student will know how to achieve high availability, how to enable Namenode Federation, and what improvements Hadoop 2 introduces.

  • Practice Hadoop 2 Multi-node Cluster Creation
    i. Helping individuals practice Hadoop 2 cluster installation
    ii. Sample YARN job execution
  • Understanding issues of Hadoop 1
  • Understanding improvements in Hadoop 2
  • Namenode Federation
    i. Enable segregation of HDFS using multiple Namenodes
  • Namenode High Availability
    i. Achieving Namenode High Availability using Quorum Journal Manager
    ii. Achieving Namenode High Availability using Network File System
  • Implementation of Namenode High Availability
    i. Helping individuals achieve Namenode High Availability
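
As a rough illustration of the Quorum Journal Manager topic above, the sketch below generates a few of the `hdfs-site.xml` properties used for Namenode High Availability. It is a partial sketch under stated assumptions: the nameservice name, host names, and helper names are hypothetical, the ports 8020 and 8485 are assumed defaults, and a working setup needs further settings (for example fencing and client failover configuration) that are omitted here.

```python
import xml.etree.ElementTree as ET


def qjm_ha_properties(nameservice, namenodes, journalnodes,
                      rpc_port=8020, journal_port=8485):
    """Build a partial set of HDFS HA properties for a QJM setup.

    namenodes maps a Namenode ID (e.g. "nn1") to its host name;
    journalnodes is a list of JournalNode host names.
    All names and ports are illustrative assumptions.
    """
    props = {
        "dfs.nameservices": nameservice,
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
    }
    for nn_id, host in namenodes.items():
        key = f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"
        props[key] = f"{host}:{rpc_port}"
    # The shared edits directory points at the JournalNode quorum
    quorum = ";".join(f"{host}:{journal_port}" for host in journalnodes)
    props["dfs.namenode.shared.edits.dir"] = f"qjournal://{quorum}/{nameservice}"
    return props


def to_hdfs_site_xml(props):
    """Render the properties in the <configuration> XML layout."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    props = qjm_ha_properties(
        "mycluster",
        {"nn1": "master1.example.com", "nn2": "master2.example.com"},
        ["jn1.example.com", "jn2.example.com", "jn3.example.com"],
    )
    print(to_hdfs_site_xml(props))
```

An odd number of JournalNodes (three here) is the usual choice because edits must be acknowledged by a majority of the quorum.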

By the end of the module, the student will be able to administer the basics of Hadoop ecosystem components such as Hive, HBase, Sqoop, Flume, and Pig.

  • Hadoop Ecosystem Introduction
    i. Understanding the integration of Hadoop ecosystem
  • Touch base with Hive
    i. What is Hive?
    ii. Architecture of Hive
    iii. Understanding Hive metastore concepts
  • HBase
    i. Understanding HBase basics
    ii. Understanding HBase storage model
    iii. Understanding HBase architecture
    iv. Cluster installation and configuration
  • Pig
    i. What is Pig?
    ii. How does Pig integrate with a Hadoop cluster?
    iii. Demo of Pig jobs using MapReduce
  • Sqoop
    i. What is Sqoop?
    ii. How to import and export data between Hadoop and an RDBMS using Sqoop
    iii. Example of Sqoop jobs using MySQL
  • Flume
    i. What is Flume?
    ii. Sample Flume jobs
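
To give a flavour of the Sqoop-with-MySQL topic listed above, the sketch below assembles `sqoop import` and `sqoop export` command lines. It is illustrative only: the host, database, table, user, and HDFS directory names are hypothetical, and the helper functions are not part of Sqoop. The `-P` option asks Sqoop to prompt for the password instead of placing it on the command line.

```python
def sqoop_import_command(host, database, table, username, target_dir,
                         mappers=1):
    """Build an argv list that imports a MySQL table into HDFS.

    All connection details are illustrative assumptions.
    """
    return [
        "sqoop", "import",
        "--connect", f"jdbc:mysql://{host}/{database}",
        "--username", username,
        "-P",  # prompt for the password at run time
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ]


def sqoop_export_command(host, database, table, username, export_dir):
    """Build an argv list that exports an HDFS directory to a MySQL table."""
    return [
        "sqoop", "export",
        "--connect", f"jdbc:mysql://{host}/{database}",
        "--username", username,
        "-P",
        "--table", table,
        "--export-dir", export_dir,
    ]


if __name__ == "__main__":
    print(" ".join(sqoop_import_command(
        "db.example.com", "sales", "orders", "etl_user", "/user/etl/orders")))
    print(" ".join(sqoop_export_command(
        "db.example.com", "sales", "orders_summary", "etl_user",
        "/user/etl/orders_summary")))
```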

By the end of the module, the student will be able to build a multi-node Cloudera cluster using Cloudera Manager, achieve high availability, and add a new node to the cluster using Cloudera Manager.

  • Understanding the internals of Cloudera Manager
    a. Understanding the automation of Hadoop installation using Cloudera Manager
    b. Understanding Cloudera Hadoop Distribution and Cloudera Manager
    c. Understanding the underlying directory structure of Cloudera Hadoop
    d. Cloudera Hadoop Cluster Installation – CDH

Course features

Course Duration

36 hours of live, online, instructor-led training

24x7 Support

Technical & query support round the clock

Lifetime LMS Access

Access all the materials on LMS anytime, anywhere

Price Match Guarantee

Guaranteed best price aligned with the quality of deliverables

FAQs

Yes, the course completion certificate is provided once you successfully complete the training program. You will be evaluated on parameters such as attendance in sessions, an objective examination, and other factors. Based on your overall performance, you will be certified by Cognixia.