Hadoop Administrator

Take your knowledge to the next level with Hadoop Training.

This 24-hour instructor-led training course gives system administrators a comprehensive understanding of all the steps necessary to operate and manage Hadoop clusters. The course covers installing, configuring, load balancing, and tuning your cluster.

 

Upon completion of the course, attendees can pursue Hadoop administrator certification from Cloudera or Hortonworks. Certification is a strong differentiator; it helps establish individuals as leaders in their field and gives customers tangible evidence of skills and expertise.

 

HADOOP DEVELOPMENT

→ Introduction

  • What is Cloud Computing
  • What is Grid Computing
  • What is Virtualization
  • How cloud computing, grid computing, and virtualization relate to one another
  • What is Big Data
  • Introduction to Analytics and the need for big data analytics
  • Hadoop Solutions - Big Picture
  • Hadoop distributions
  • Comparing Hadoop with traditional systems
  • Volunteer Computing
  • Data Retrieval - Random Access vs. Sequential Access
  • NoSQL Databases

→ The Motivation for Hadoop

  • Problems with traditional large-scale systems
  • Requirements for a new approach

→ Hadoop: Basic Concepts

  • What is Hadoop?
  • The Hadoop Distributed File System
  • How MapReduce Works
  • Anatomy of a Hadoop Cluster

→ Hadoop Daemons

  • NameNode
  • DataNode
  • Secondary NameNode
  • JobTracker
  • TaskTracker
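
A quick sanity check for these daemons is to list the Java processes on a node; a minimal sketch, assuming a Hadoop 1.x installation whose daemon scripts are on the PATH:

    # List the JVMs on this node; a healthy single-node cluster shows
    # NameNode, SecondaryNameNode, DataNode, JobTracker and TaskTracker
    jps
    # Start or stop an individual daemon on the local node (Hadoop 1.x scripts)
    hadoop-daemon.sh start datanode
    hadoop-daemon.sh stop tasktracker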

→ HDFS in Detail

  • Blocks and Splits
  • Replication
  • Data high availability
  • Data Integrity
  • Cluster architecture and block placement
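
To make blocks and replication concrete, the commands below copy a file into HDFS, change its replication factor, and list where its blocks landed; a sketch assuming a running Hadoop 1.x cluster (the file and path are examples):

    # Copy a local file into HDFS; it is split into blocks and replicated
    hadoop fs -put access.log /user/training/access.log
    # Change the file's replication factor to 3 and wait until it takes effect
    hadoop fs -setrep -w 3 /user/training/access.log
    # Show the blocks, their replicas, and the DataNodes holding them
    hadoop fsck /user/training/access.log -files -blocks -locations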

→ Programming Practices & Performance Tuning

  • Developing MapReduce Programs in
    • Local Mode
    • Pseudo-distributed Mode
    • Fully distributed mode
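
The same job can be aimed at each of these modes from the command line; a sketch using Hadoop's generic -fs and -jt options, assuming the driver uses ToolRunner and that wordcount.jar and the input paths are only examples:

    # Local mode: local filesystem and local job runner (handy for debugging)
    hadoop jar wordcount.jar WordCount -fs file:/// -jt local input/ output/
    # Pseudo- or fully distributed mode: use the cluster settings from
    # core-site.xml (fs.default.name) and mapred-site.xml (mapred.job.tracker)
    hadoop jar wordcount.jar WordCount /user/training/input /user/training/output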

→ Writing a MapReduce Program

  • Examining a Sample MapReduce Program
  • Basic API Concepts
  • The Driver Code
  • The Mapper
  • The Reducer
  • Hadoop's Streaming API
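
The Streaming API lets any executable act as mapper and reducer, which makes a quick first job easy; a minimal sketch, assuming a Hadoop 1.x layout where the streaming jar sits under $HADOOP_HOME/contrib/streaming (the exact path varies by distribution):

    # Run a streaming job with cat as the mapper and wc as the reducer
    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
        -input  /user/training/input \
        -output /user/training/output-streaming \
        -mapper /bin/cat \
        -reducer /usr/bin/wc
    # Inspect the result
    hadoop fs -cat /user/training/output-streaming/part-00000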

→ Setting Up a Hadoop Cluster

  • Install and configure Apache Hadoop (see the bring-up sketch below)
  • Build a fully distributed Hadoop cluster on a single laptop/desktop
  • Install and configure the Cloudera Hadoop distribution in fully distributed mode
  • Install and configure the Hortonworks Hadoop distribution in fully distributed mode
  • Monitoring the cluster
  • Getting familiar with the Cloudera and Hortonworks management consoles
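
Whichever distribution is used, the first bring-up of a plain Apache Hadoop 1.x cluster follows the same broad steps; a hedged sketch using the stock 1.x start scripts:

    # One-time step: format the NameNode's metadata directory
    hadoop namenode -format
    # Start HDFS (NameNode, DataNodes, SecondaryNameNode), then MapReduce
    start-dfs.sh
    start-mapred.sh
    # Confirm the cluster is up: capacity, live DataNodes, remaining space
    hadoop dfsadmin -report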

→ Hadoop Security

  • Why Hadoop Security Is Important
  • Hadoop's Security System Concepts
  • What Kerberos Is and How it Works
  • Configuring Kerberos Security
  • Integrating a Secure Cluster with Other Systems
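
On a Kerberized cluster every HDFS operation needs a valid ticket; a minimal sketch, assuming core-site.xml already sets hadoop.security.authentication to kerberos and that the principal and keytab path below are only examples:

    # Obtain a Kerberos ticket from a keytab (principal and keytab are examples)
    kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs@EXAMPLE.COM
    # Verify the ticket cache
    klist
    # HDFS commands now authenticate using the ticket
    hadoop fs -ls /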

→ Managing and Scheduling Jobs

  • Managing Running Jobs
  • Hands-On Exercise
  • The FIFO Scheduler
  • The FairScheduler
  • Configuring the FairScheduler
  • Hands-On Exercise
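
Most of the job-management exercises revolve around the hadoop job command; a sketch assuming a Hadoop 1.x JobTracker (the job IDs are made-up examples):

    # List running jobs, then kill or re-prioritise one by its job ID
    hadoop job -list
    hadoop job -kill job_201901011200_0007
    hadoop job -set-priority job_201901011200_0008 HIGH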

→ Cluster Maintenance

  • Checking HDFS Status
  • Hands-On Exercise
  • Copying Data Between Clusters
  • Adding and Removing Cluster Nodes
  • Rebalancing the Cluster
  • Hands-On Exercise
  • NameNode Metadata Backup
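
Most of the maintenance tasks above map onto one-line commands; a sketch assuming Hadoop 1.x, with hostnames and paths that are placeholders:

    # Check HDFS health: missing, corrupt, or under-replicated blocks
    hadoop fsck /
    # Copy data between clusters (source and destination NameNodes are examples)
    hadoop distcp hdfs://nn-old:8020/data hdfs://nn-new:8020/data
    # Rebalance blocks across DataNodes, stopping within 10% of average usage
    hadoop balancer -threshold 10
    # After editing the dfs.hosts / dfs.hosts.exclude files, apply the change
    hadoop dfsadmin -refreshNodes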

→ Cluster Monitoring and Troubleshooting

  • General System Monitoring
  • Managing Hadoop's Log Files
  • Using the NameNode and JobTracker Web UIs
  • Hands-On Exercise
  • Cluster Monitoring with Ganglia
  • Common Troubleshooting Issues
  • Benchmarking Your Cluster
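
Benchmarking is normally done with the test and example jars that ship with Hadoop; a sketch in which the jar file names are approximate, since they differ between versions and distributions:

    # HDFS I/O throughput: write, then read, 10 files of 1000 MB each
    hadoop jar hadoop-test-*.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
    hadoop jar hadoop-test-*.jar TestDFSIO -read  -nrFiles 10 -fileSize 1000
    # End-to-end MapReduce benchmark: generate and sort ten million records
    hadoop jar hadoop-examples-*.jar teragen 10000000 /benchmarks/tera-in
    hadoop jar hadoop-examples-*.jar terasort /benchmarks/tera-in /benchmarks/tera-out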

Hadoop ecosystem components covered as part of the Hadoop Administrator course

→ Ecosystem component: Ganglia

  • Install and configure Ganglia on a cluster
  • Configure and use Ganglia
  • Use Ganglia to generate graphs
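
Hadoop can also push its own metrics into Ganglia; a hedged sketch of the Hadoop 1.x metrics configuration, where the gmond host is a placeholder and GangliaContext31 is the context class for Ganglia 3.1 and later:

    # Append a Ganglia metrics context for the dfs, mapred and jvm groups
    for prefix in dfs mapred jvm; do
      echo "$prefix.class=org.apache.hadoop.metrics.ganglia.GangliaContext31"
      echo "$prefix.period=10"
      echo "$prefix.servers=gmond-host.example.com:8649"
    done >> $HADOOP_HOME/conf/hadoop-metrics.properties
    # Restart the Hadoop daemons so the new metrics context is picked up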

→ Ecosystem component: Nagios

  • Nagios concepts
  • Install and configure Nagios on the cluster
  • Use Nagios for sample alerts and monitoring
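
A simple style of Hadoop alert in Nagios is a process or port check on each daemon; a rough sketch run by hand on the monitored host, assuming the standard plugin directory (its location varies by OS):

    # Critical alert unless exactly one NameNode JVM is running on this host
    /usr/lib64/nagios/plugins/check_procs -c 1:1 -C java -a NameNode
    # Check that the NameNode IPC port is reachable (8020 is an example port)
    /usr/lib64/nagios/plugins/check_tcp -H namenode.example.com -p 8020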

→ Ecosystem component: Hive

  • Hive concepts
  • Install and configure Hive on the cluster
  • Create a database and access it from the Hive console (see the example below)
  • Develop and run sample applications in Java/Python to access Hive
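
A minimal end-to-end Hive exercise from the shell; a sketch in which the database, table, and data file names are invented for the example:

    # Create a database and table, load data, and query it from the Hive CLI
    hive -e "CREATE DATABASE IF NOT EXISTS retail"
    hive -e "CREATE TABLE retail.customers (id INT, name STRING)
             ROW FORMAT DELIMITED FIELDS TERMINATED BY ','"
    hive -e "LOAD DATA LOCAL INPATH 'customers.csv' INTO TABLE retail.customers"
    hive -e "SELECT COUNT(*) FROM retail.customers"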

→ Ecosystem component: Sqoop

  • Install and configure Sqoop on the cluster
  • Import data from Oracle/MySQL into Hive (see the example below)
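
A typical import of a single MySQL table straight into Hive; a sketch in which the connection string, credentials, and table names are placeholders:

    # Import the customers table from MySQL into Hive using 4 map tasks
    sqoop import \
        --connect jdbc:mysql://dbhost.example.com/retail \
        --username retail_user -P \
        --table customers \
        --hive-import --hive-table retail.customers \
        -m 4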

→ Overview of other ecosystem components

  • Oozie, Avro, Thrift, REST, Mahout, Cassandra, YARN, MR2, etc.