Big Data Analytics with Data Science

This course blends theoretical and practical training to enable students to participate in big data and analytics projects. It covers a wide range of topics, from the basics to becoming an independent analyst. We will use R as a statistical tool and Hadoop to handle publicly available big data sets and derive insights, with the dual objective of imparting best practices and equipping the analyst with the necessary toolkit.

 

Who should learn Big Data Analytics?

 

This course is intended for individuals seeking to develop an understanding of Data Science from the perspective of a practicing Data Scientist, including:

 
  • Managers of teams of business intelligence, analytics, and big data professionals.
  • Current business and data analysts looking to add big data analytics to their skills.
  • Data and database professionals looking to exploit their analytic skills in a big data environment.
  • Recent college graduates and graduate students with academic experience in a related discipline looking to move into the world of Data Science and big data.
 

Big Data Analytics with Data Science Course Outline

 

→ 1. Introduction

  • Big Data Overview
  • What is Big Data Analytics?
  • Necessity for Big Data Analytics
  • Role of a Data Analyst
  • What is Data Science?
  • Necessity for Data Science
  • Role of Data Scientist

→ 2. Use Cases

  • Finance
  • Retail
  • Advertising
  • Defense and Intelligence
  • Telecommunications and Utilities
  • Healthcare and Pharmaceuticals

→ 3. Data Analytics Process

  • Preparation
  • PreProcessing
  • Analysis
  • Post Processing

→ 4. Data Preparation

  • Planning
  • Data Collection
  • Data Selection

→ 5. Tools for Data Preparation

  • Introduction to SQL DBs
  • Introduction to NoSQL DBs
  • Key / Value pair
  • MongoDB
  • Cassandra
  • Graph DBs (Neo4j)
  • Hands on Exercise : Using SQL and NoSQL DBs

→ 6. Data Preparation – Import/Export

  • Sqoop
  • Flume
  • Hands on Exercise : Usage of Tools

→ 7. PreProcessing

  • Data Cleaning
  • Data Filtering
  • Data Completion
  • Data Correction
  • Data Standardization
  • Data Transformation

→ 8. Tools for Data PreProcessing

  • Data Preprocessing using Pig
  • Writing Pig Latin scripts and processing data
  • Data Preprocessing using Hive
  • Writing Hive Scripts and processing data
  • Hands on Exercise : Pig and Hive

→ 9. Data Analysis Introduction

  • Recommendation
  • Classification
  • Clustering
  • Mahout

→ 10. Recommendation

  • Introduction to Recommendations
  • Making recommendations, various techniques
  • Hands on Exercise for Recommendations

→ 11. Classification

  • Classification System Overview
  • Classification process
  • Naive Bayes Classifier
  • Decision Trees
  • Examples of Classification

→ 12. Clustering

  • Clustering basics
  • Hierarchical clustering
  • K-Means clustering
  • Running clustering example
  • Exploring distance measures

→ 13. Data Visualization using R

  • Language basics
  • Data Frames
  • Vectorized operations on Data Frames
  • Selection
  • Projection
  • Transformation

→ 14. Hands on R
