MODULE 1: PYTHON PROGRAMMING AND FLASK FRAMEWORK
- Introduction
- What is Python?
- Why does Data Science require Python?
- Installation of Anaconda
- Understanding Jupyter Notebook (IDE)
- Basic commands in Jupyter Notebook
- Understanding Python Syntax and Operators
- Variables, Data Types, and Strings
- Lists, Sets, Tuples and Dictionaries
Control Flow & Conditional Statements
- Conditional Operators, Arithmetic Operators and Logical Operators
- if, elif and else Statements
- While Loops and control flow
- For Loops and nested loops
- pass, break and continue
- Nested Loops and List and Dictionary Comprehensions
- What is a function, and types of functions
- Code optimization and argument functions
- Scope
- Lambda Functions
- Map, Filter and Reduce
- Importing a Module, Using help() and dir(), Aliasing or Renaming
- Some Important Modules in Python: math module, random module, datetime and os module
File Handling
- Create, Read, Write files and Operations in File Handling
- Errors and Exception Handling
Classes and Objects
- Create a class
- Create an object
- The __init__()
- Modifying Objects
- Object Methods
- Self
- Modify the Object Properties
- Delete Object
- Pass Statements
MODULE 2: DATA ANALYSIS IN PYTHON
NumPy - Numerical Python
- Introduction to Array
- Creation and Printing of array
- Basic Operations in NumPy
- Indexing
- Mathematical Functions of NumPy
Pandas
- Series and DataFrames
- Data Importing and Exporting through Excel, CSV Files
- Data Understanding Operations
- Indexing, Slicing and Filtering with Conditional Slicing
- Groupby, Pivot table and Cross Tab
- Concatenating, Merging and Joining
- Descriptive Statistics
- Removing Duplicates
- String Manipulation
- Missing Data Handling
DATA VISUALIZATION
- Data Visualization Using Matplotlib And Seaborn
- Introduction to Matplotlib
- Basic Plotting
- Properties of plotting
- About Subplots
- Line plots
- Pie Chart and Bar Graph
- Histograms
- Box and Violin Plots
- Scatterplot
- Case Study On Exploratory Data Analysis (EDA) & Visualizations
- What is EDA?
- Univariate Analysis
- Bivariate Analysis
- More on Seaborn Based Plotting Including Pair Plots, Catplot, Heat Maps, Count Plots, along with Matplotlib plots
UNSTRUCTURED DATA PROCESSING
- Regular Expressions
- Structured Data and Unstructured Data
- Literals and Meta Characters
- How to use Regular Expressions with Pandas?
- Inbuilt Methods
- Pattern Matching
Project On Web Scraping: Data Mining And Exploratory Data Analysis
- Data Mining (Web Scraping)
- This project covers four main steps of the Data Science Life Cycle:
- Data Collection
- Data Mining
- Data Preprocessing
- Data Visualization
- Examples: Text, CSV, TSV, Excel Files, Matrices, Images
MODULE 3: ADVANCED STATISTICS
Data Types in Statistics
- Statistics in Data science
- What is Statistics?
- How is Statistics used in Data Science?
- Population and Sample
- Parameter and Statistic
- Variable and its types
Data Gathering Techniques
- Data types
- Data Collection Techniques
- Sampling Techniques
- Convenience Sampling, Simple Random Sampling
- Stratified Sampling, Systematic Sampling and Cluster Sampling
Descriptive Statistics
- What is Univariate and Bivariate Analysis?
- Measures of Central Tendencies
- Measures of Dispersion
- Skewness and Kurtosis
- Box Plots and Outlier Detection
- Covariance and Correlation
Probability Distribution
- Probability and Limitations
- Discrete Probability Distributions
- Bernoulli, Binomial Distribution, Poisson Distribution
- Continuous Probability Distributions
- Normal Distribution, Standard Normal Distribution
Inferential Statistics
- Sampling variability and Central Limit Theorem
- Confidence Intervals
- Hypothesis Testing
- Z-test, t-test
- Chi-Square Test
- F-Test and ANOVA
MODULE 4: DATABASE (SQL) + REPORTING TOOL (POWER BI)
SQL for Data Science
- Introduction to Databases
- Basics of SQL
- DML, DDL, DCL and Data Types
- Common SQL commands using SELECT, FROM and WHERE
- Logical Operators in SQL
- Filtering and Sorting
- Advanced filtering using IN, OR and NOT
- Sorting with GROUP BY and ORDER BY
- SQL Joins
- INNER and OUTER joins to combine data from multiple tables
- RIGHT and LEFT joins to combine data from multiple tables
- SQL Aggregations
- Common Aggregations including COUNT, SUM, MIN and MAX
- CASE and DATE functions, as well as working with NULL values
- Subqueries and Temp Tables
- Subqueries to run multiple queries together
- Temp tables to access a table with more than one query
- Window Functions
- ROW_NUMBER(), RANK(), DENSE_RANK(), LAG, LEAD, SUM, COUNT, AVG
Introduction To Power BI
- What is Business Intelligence?
- Power BI Introduction
- Quadrant report
- Comparison with other BI tools
- Power BI Desktop overview
- Power BI workflow
- Addressing installation queries
Data Import And Visualizations
- Data import options in Power BI
- Import from Web (hands on)
- Why Visualization?
- Visualization types
- Categorical data visualization
- Trend data visualization
- Visuals for Filtering
- Slicer details and use
- Formatting visuals
- KPI visuals
- Tables and Matrix
Power Queries
- Power Query Introduction
- Data Transformation - its benefits
- Introducing ribbons
- Queries panel
- M Language briefing
- Power BI Datatypes
- Changing Datatypes of columns
- Filtering
- Inbuilt column Transformations
- Inbuilt row Transformations
- Combine Queries
- Merge Queries
Power Pivot And Introduction To DAX
- Power Pivot
- Intro to Data Modelling
- Relationship and Cardinality
- Relationship view
- Calculated Columns vs Measures
- DAX Introduction and Syntax
Data Analysis Expressions
- DAX recap
- DAX logical functions
- DAX text functions
- DAX math and statistical Functions
- DAX aggregation functions
- DAX filter functions
- DAX time intelligence functions
- Creating a Date Dimension table
- Related aspects with tables
Login, Publish To Web And RLS
- Power BI services
- Dashboard creation
- Web Content, Image, Text Box, Video
- Dashboard formatting
- Sharing your dashboard
- RLS introduction
Miscellaneous Topics
- Visual Interactions
- Drill Through
- Drilldown
- Conditional Formatting
- Creating buttons in Power BI reports
- Creating Python Script Visuals
- This module ends with a project
MODULE 5: MACHINE LEARNING - SUPERVISED LEARNING
Introduction
- What Is Machine Learning?
- Supervised Versus Unsupervised Learning
- Regression Versus Classification Problems
- Assessing Model Accuracy
Introduction And Linear Algebra
- Supervised Versus Unsupervised Learning
- Introduction to Matrices
- Vector spaces, including dimensions, Euclidean spaces, closure properties and axioms
- Eigenvalues and Eigenvectors, including how to find Eigenvalues and the corresponding Eigenvectors
REGRESSION TECHNIQUES
- Linear Regression
- Simple Linear Regression:
- Estimating the Coefficients
- Assessing the Coefficient Estimates
- R Squared and Adjusted R Squared
- MSE and RMSE
- Estimating the Regression Coefficients
- OLS Assumptions
- Multicollinearity
- Feature Selection
- Gradient Descent
Evaluating the Metrics of Regression Techniques
- Homoscedasticity and Heteroscedasticity of error terms
- Residual Analysis
- Q-Q Plot
- Cook's distance and Shapiro-Wilk Test
- Identifying the line of best fit
- Other Considerations in the Regression Model
- Qualitative Predictors
- Interaction Terms
- Non-linear Transformations of the Predictors
Polynomial Regression
- Why Polynomial Regression
- Creating polynomial linear regression
- Evaluating the metrics
Regularization Techniques
- Lasso Regularization
- Ridge Regularization
- ElasticNet Regularization
Case Study on Linear, Multiple Linear and Polynomial Regression using Python
- PROJECT: A use-case project that exercises Data Understanding, EDA, Data Processing and the Regression Techniques above
CLASSIFICATION TECHNIQUES
- Logistic regression
- An Overview of Classification
- Difference Between Regression and classification Models
- Why Not Linear Regression?
- Logistic Regression:
- The Logistic Model
- Estimating the Regression Coefficients and Making Predictions
- Logit and Sigmoid functions
- Setting the threshold and understanding decision boundary
- Logistic Regression for >2 Response Classes
- Evaluation Metrics for Classification Models:
- Confusion Matrix
- Accuracy and Error rate
- TPR and FPR
- Precision and Recall, F1 Score
- AUC-ROC
- Kappa Score
- Principle of Naive Bayes Classifier
- Bayes Theorem
- Terminology in Naive Bayes
- Posterior probability
- Prior probability of class
- Likelihood
- Types of Naive Bayes Classifier
- Multinomial Naive Bayes
- Bernoulli Naive Bayes and Gaussian Naive Bayes
TREE BASED MODELS
- Decision Trees
- Decision Trees (Rule Based Learning)
- Basic Terminology in Decision Tree
- Root Node and Terminal Node
- Regression Trees and Classification Trees
- Trees Versus Linear Models
- Advantages and Disadvantages of Trees
- Gini Index
- Overfitting and Pruning
- Stopping Criteria
- Accuracy Estimation using Decision Trees
Case Study: A Case Study on Decision Tree using Python
- Resampling Methods
- Cross-Validation
- The Validation Set Approach
- Leave-One-Out Cross-Validation
- k-Fold Cross-Validation
- Bias-Variance Trade-Off for k-Fold Cross-Validation
Ensemble Methods in Tree Based Models
- What is Ensemble Learning?
- What are Bootstrap Aggregation Classifiers and how do they work?
- Random Forest
- What is it and how does it work?
- Variable selection using Random Forest
- What is it and how does it work?
- Hyperparameters, Pros and Cons
- Case Study: Ensemble Methods - Random Forest Techniques using Python
DISTANCE BASED MODELS
- K Nearest Neighbors
- K-Nearest Neighbor Algorithm
- Eager Vs Lazy learners
- How does the KNN algorithm work?
- How do you decide the number of neighbors in KNN?
- Curse of Dimensionality
- Pros and Cons of KNN
- How to improve KNN performance
- Case Study: A Case Study on k-NN using Python
- Support Vector Machines
- The Maximal Margin Classifier
- HyperPlane
- Support Vector Classifiers and Support Vector Machines
- Hard and Soft Margin Classification
- Classification with Non-linear Decision Boundaries
- Kernel Trick
- Polynomial and Radial
- Tuning Hyperparameters for SVM
- Gamma, Cost and Epsilon
- SVMs with More than Two Classes
Case Study: A Case Study on SVM using Python
- PROJECT: A use-case project that exercises Data Understanding, EDA, Data Processing and the Classification Techniques above
UN-SUPERVISED LEARNING
- Why Unsupervised Learning
- How it Differs from Supervised Learning
- The Challenges of Unsupervised Learning
Principal Components Analysis
- Introduction to Dimensionality Reduction and its necessity
- What Are Principal Components?
- Demonstration of 2D PCA and 3D PCA
- Eigenvalues, Eigenvectors and Orthogonality
- Transforming Eigenvalues into a new data set
- Proportion of variance explained in PCA
- Case Study: A Case Study on PCA using Python
K-Means Clustering
- Centroids and Medoids
- Deciding optimal value of 'k' using Elbow Method
- Linkage Methods
Hierarchical Clustering
- Divisive and Agglomerative Clustering
- Dendrograms and their interpretation
- Applications of Clustering
- Practical Issues in Clustering
- Case Study: A Case Study on Clustering using Python
Recommendation Systems
- What are recommendation engines?
- How does a recommendation engine work?
- Data collection
- Data storage
- Filtering the data
- Content based filtering
- Collaborative filtering
- Cold start problem
- Matrix factorization
- Building a recommendation engine using matrix factorization
- Case Study
MODULE 6: DEEP LEARNING
Introduction to Neural Networks
- Introduction to Perceptron & History of Neural networks
- Activation functions: a) Sigmoid b) ReLU c) Softmax d) Leaky ReLU e) Tanh
- Gradient Descent
- Learning Rate and tuning
- Optimization functions
- Introduction to TensorFlow
- Introduction to Keras
- Back propagation and chain rule
- Fully connected layer
- Cross entropy
- Weight Initialization
- Regularization
TensorFlow 2.0
- Introducing Google Colab
- TensorFlow basic syntax
- TensorFlow Graphs
- TensorBoard
Artificial Neural Network with TensorFlow
- Neural Network for Regression
- Neural Network for Classification
- Evaluating the ANN
- Improving and tuning the ANN
- Saving and Restoring Graphs
MODULE 7: CNN & COMPUTER VISION
UNIT 1: Working with images & CNN Building Blocks
- Working with Images - Introduction
- Working with Images - Understanding reshaping, image size and pixels; Digitization, Sampling, and Quantization
- Working with images - Filtering
- Hands-on Python Demo: Working with images
- Introduction to Convolutions
- 2D convolutions for Images
- Convolution - Backward
- Transposed Convolution and Fully Connected Layer as a Convolution
- Pooling: Max Pooling and Other pooling options
UNIT 2: CNN Architectures and Transfer Learning
- CNN Architectures and LeNet Case Study
- Case Study: AlexNet
- Case Study: ZFNet and VGGNet
- Case Study: GoogLeNet
- Case Study: ResNet
- GPU vs CPU
- Transfer Learning Principles and Practice
- Hands-on Keras Demo: SVHN Transfer learning from MNIST dataset
- Transfer learning Visualization (run package, occlusion experiment)
- Hands-on demo - t-SNE
UNIT 3: Object Detection
- CNNs at Work - Object Detection with region proposals
- CNNs at Work - Object Detection with YOLO and SSD
- Hands-on demo - Bounding box regressor
- Need to do a semantic segmentation project
UNIT 4: CNNs at Work - Semantic Segmentation
- CNNs at Work - Semantic Segmentation
- Semantic Segmentation process
- U-Net Architecture for Semantic Segmentation
- Hands-on demo - Semantic Segmentation using U-Net
- Other variants of Convolutions
- Inception and MobileNet models
UNIT 5: CNNs at Work - Siamese Network for Metric Learning
- Metric Learning
- Siamese Network as metric learning
- How to train a Neural Network in a Siamese way
- Hands-on demo - Siamese Network
MODULE 8: NATURAL LANGUAGE PROCESSING
Unit 1: Introduction to Statistical NLP Techniques
- Introduction to NLP
- Preprocessing in NLP: Tokenization, stop words, normalization, stemming and lemmatization
- Preprocessing in NLP: Bag of words, TF-IDF as features
- Language models: probabilistic models, n-gram model and channel model
- Hands on NLTK
Unit 2: Word Embedding
- Word2vec
- GloVe
- POS Tagger
- Named Entity Recognition(NER)
- POS with NLTK
- TF-IDF with NLTK
Unit 3: Sequential Models
- Introduction to sequential models
- Introduction to RNN
- Intro to LSTM
- LSTM forward pass
- LSTM backprop through time
- Hands on Keras LSTM
Unit 4: Applications
- Sentiment Analysis
- Sentence generation
- Machine translation
- Advanced LSTM structures
- Keras - machine translation
- ChatBot
Module 1: Generative AI and its Industry Applications
- Generative AI Principles
- Types of Generative Models
- Applications of Generative Models
- Machine Learning Algorithms with GenAI
- Applications of Generative AI
- Generative AI: Advantages and Disadvantages
- Ethical Considerations
Module 2: NLP and Deep Learning
- Natural Language Processing (NLP) Essentials
- Text Classification
- Text Preprocessing
- Basic NLP Tasks
- Deep Learning for NLP
- Neural Networks
- Backpropagation
- RNN
- LSTM
- Deep Learning Applications in NLP
Module 3: Autoencoders and GANs
- Basic Autoencoders
- Variational Autoencoders (VAEs)
- Applications in Data Compression and Generation
- Basic GAN Architecture
- Training GANs
- Variants of GANs
- DCGAN
- CycleGAN
Module 4: Language Models and Transformer-based Generative Models
- Exploring Language Models
- Types of Language Models
- Applications of Language Models
- The Transformer Architecture: Attention Mechanism
- Advanced Transformer Models
- GPT
- BERT
- Applications of Transformer-based Models
Module 5: Prompt Engineering
- Prompt Engineering Principles
- What is Prompt Engineering?
- Importance and Applications
- Prompt Design Strategies
- Types of Prompting
- Crafting Effective Prompts
- Parameter Tuning
Module 6: Generative AI with LLMs
- LLMs and Generative AI Project Lifecycle
- LLM Pre-Training and Scaling
- Fine-Tuning LLMs with Specific Instructions
- Efficient Fine-Tuning of Parameters
- Reinforcement Learning from Human Feedback
Module 7: LLMs for Search, Prediction, and Generation
- Search Query Completion
- Next Word Prediction
- Word Embeddings
- Transformers
- Generating Text
- Stacking Attention Layers
Module 8: LangChain for LLM Application Development
- Understanding Retrieval-Augmented Generation (RAG)
- Document Loading and Splitting
- Vector Stores and Embeddings
- Retrieval
- Question Answering with Chatbots
- Building RAG Models using LangChain
Module 9: Interacting with Data Using LangChain and RAG
- Understanding Retrieval-Augmented Generation (RAG)
- Document Loading and Splitting
- Vector Stores and Embeddings
- Retrieval
- Question Answering with Chatbots
- Building RAG Models using LangChain
Module 10: Generative AI on Cloud
- Cloud Computing Foundations
- AWS S3
- Amazon EC2 Trn1n
- Amazon EC2 Inf2
- Amazon SageMaker
- Amazon CodeWhisperer
- Amazon Bedrock
- Azure OpenAI
Module 11: Working with ChatGPT
- Introduction to ChatGPT
- Leveraging ChatGPT for Productivity
- Mastering Excel through ChatGPT
- Becoming a Data Scientist using ChatGPT
- Data Analysis in Power BI with ChatGPT
- Creating a Content Marketing Plan
- Social Media Marketing using ChatGPT
- Keyword Search and SEO using ChatGPT
- Generating Content using ChatGPT
- Implementing ChatGPT for Customer Service
- ChatGPT for Developers
- ChatGPT for Creating Programs
- ChatGPT for Debugging
- ChatGPT for Integrating New Features
- ChatGPT for Testing
- Documenting the Code with ChatGPT
Module 12: Python with Generative AI
- Python Code Generation with Generative AI
- Gen AI Tools for Coding
- Advanced Code Optimization with ChatGPT Gen AI Tool
- Coding with ChatGPT
- Building an Application in Python with ChatGPT
Module 13: Evaluating LLM Performance
- LLM Performance Comparison
- Perplexity
- BLEU Score
- Human Evaluation
- Choosing the Right Metrics
- Interpreting the Results
Module 14: Industry Case Studies and In-class Project
- In-class Project: AI-Powered Text and Image Generator
- Case Study: Generative AI for Personalized Learning
- Case Study: Generative AI for Creative Content Generation
- Case Study: Generative AI for Business
Module 15: Bonus Module - Machine Learning with Generative AI
- Artificial Intelligence Essentials
- Disciplines of AI
- Types of AI
- Machine Learning Fundamentals
- Predictive ML Models
- ML Algorithms: Deep Dive
- Supervised Learning
- Unsupervised Learning
- Semi-Supervised Learning
- Reinforcement Learning
Module 16: Bonus Module - Generative AI Tools
- Hugging Face Transformers
- OpenAI GPT-3 API
- Google Cloud AI Platform
- Midjourney
- DALL-E 2
- Google Bard