Unit 1 | Week (1 - 4)
Planning and Thinking Skills for Architecting Data Science Solutions
10 V's of Data, Understanding Classification, Segmentation, Regression and Optimization (The
general tasks of a Data Scientist)
Understanding Statistical (Discriminative and Generative), Non-Parametric (Instance Based and
Iterative) Models Graphically
The Latest Trends: Sub-Space, Spectral, Kernel and Neural Networks
Unit 2 | Week (5 - 8)
Foundation Courses
Data Analytics in Excel - foundation to dashboarding
Visualization using Tableau
Python / R Programming - coding structures, data handling, control structures, etc.
Unit 3 | Week (9 - 12)
Statistical modeling & EDA for Predictive Analytics
Analytics Problem Solving - CRISP-DM Framework for business problem solving
Probabilities, joint and conditional probabilities, simulations and estimations. Introduction to
Gaussian mixtures and anomaly detection
Data types, basic probabilities, Probability distributions (Discrete and Continuous) - Bernoulli,
Binomial, Multinomial and Poisson distribution
Describing the relationship between attributes: Covariance; Correlation; Chi-Square
Special emphasis on Normal distribution; Central Limit Theorem
Inferential statistics: t, F and chi-square tests
Inferential statistics: How to learn about the population from a sample and vice versa; Sampling
distributions; Confidence Intervals, Hypothesis Testing.
Case Study - Uber Supply Gap - summarize and visualize your solutions using Uber supply data.
Unit 4 | Week (13 - 16)
Data Pre-Processing
Introduction to R/Python, Binning, Standardization, Normalization
Type Conversion, Merging
Normal Curves, Central Tendency and Outlier Detection
Dimensionality Reduction: PCA, SVD approaches
Handling Missing Values (K-NN, MI, Clustering, etc.)
Unit 5 | Week (17 - 20)
Data Visualization in R / Python
Data Exploration - Histograms, Bar Chart, Box Plot, Line Graph, Scatter Plot
Data Storytelling - The Science, ggplot, Bubble Charts with Multiple Dimensions, Gauge Charts,
Treemap, Heat Map and Motion Charts
Linear Regression
Approach: Model Estimation, MLE & Error Function, Optimization through Gradient Descent for
finding parameters
Constructing a Linear Regression, Diagnostics
Interpretation and Applications
Case Study 1 - Help a digital media company understand why their viewership is falling and
propose recommendations to increase viewership
Case Study 2 - Create a model to understand the factors that influence car prices in the US.
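As a rough illustration of the Linear Regression approach listed above (an error function minimized by gradient descent), the sketch below fits a one-variable line in plain Python. The function name, learning rate, iteration count and toy data are assumptions made for illustration and are not course material. Under the usual assumption of Gaussian noise, minimizing mean squared error gives the same parameters as maximum likelihood estimation.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, n_iters=5000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iters):
        # Residuals of the current fit.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear_regression(xs, ys)
print(f"slope={w:.3f}, intercept={b:.3f}")
```

On noise-free data like this the estimates should approach the generating slope and intercept; on real data the learning rate usually needs tuning and the features are typically standardized first.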
Decision Trees
Rule Based Knowledge: Logic of Rules, Evaluating Rules, Rule Induction and Association
Rules.
Construction of Decision Trees through Simplified Examples; Choosing the "Best" attribute at
each Non-Leaf node; Entropy; Information Gain, Gini Index, Chi Square, Regression Trees.
Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical
Variables; other Measures of Randomness
Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as Rules
Oblique Decision Trees
Case Study - Predict whether a customer will default on a loan
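To make the attribute-selection measures in the Decision Trees topics concrete, here is a minimal sketch of entropy, Gini index and information gain for categorical attributes. The helper names and the four-row loan table are hypothetical and chosen only so the numbers are easy to check by hand.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one categorical attribute."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the child nodes after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: did the customer default?
rows = [
    {"employed": "yes", "owns_home": "yes"},
    {"employed": "yes", "owns_home": "no"},
    {"employed": "no", "owns_home": "yes"},
    {"employed": "no", "owns_home": "no"},
]
labels = ["no", "no", "yes", "yes"]

gains = {a: information_gain(rows, labels, a) for a in ("employed", "owns_home")}
best_attribute = max(gains, key=gains.get)
print(gains, best_attribute)
```

In this toy table "employed" separates the classes perfectly, so it would be chosen as the "best" attribute at the root, while "owns_home" yields no gain.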
Instance based learning
K-NN method, Wilson editing and triangulation
K-NN in collaborative filtering, digit recognition
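The K-NN method named above can be sketched in a few lines: classify a query point by majority vote among its k nearest training points. The function name, the choice of Euclidean distance and the toy points are assumptions for illustration; Wilson editing and triangulation are not shown.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Two well-separated toy clusters (illustrative only).
train_points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (0.5, 0.5)))
print(knn_predict(train_points, train_labels, (5.5, 4.5)))
```

Because K-NN stores the training set and defers all work to prediction time, it is a typical example of instance-based learning.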
Ensembles
Methods of Ensembling (Stacking, Mixture of Experts)
Bagging and Random forest (Logic, Practical Applications)
AdaBoost
Gradient Boosting Machines
Unit 6 | Week (21 - 24)
Discriminative Statistical Models: Logistic Regression
Why Linear Regression Fails and Logit Function
Approach: Model Estimation, MLE & Error Function, Optimization through Gradient Descent for
finding parameters
Constructing Logistic Regression, Diagnostics
Interpretation and Applications
Case Study 1 - Predict employee attrition in a large organization.
Case Study 2 - Predict whether the customers will buy a life insurance policy using a large
insurer's past customer data.
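For the Logistic Regression topics above (logit function, error function, gradient descent), the following is a minimal one-feature sketch in plain Python. The function names, hyperparameters and toy data are assumptions for illustration only; the loss being minimized is the average negative log-likelihood (log-loss).

```python
from math import exp


def sigmoid(z):
    """Inverse of the logit function: maps a real score to a probability."""
    return 1.0 / (1.0 + exp(-z))


def fit_logistic_regression(xs, ys, learning_rate=0.1, n_iters=5000):
    """Fit P(y=1 | x) = sigmoid(w * x + b) by gradient descent on log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iters):
        # For log-loss, the gradient reduces to (predicted - actual) terms.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data with overlapping classes (illustrative only).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
w, b = fit_logistic_regression(xs, ys)
p_low = sigmoid(w * 1.0 + b)   # predicted probability at x = 1.0
p_high = sigmoid(w * 4.0 + b)  # predicted probability at x = 4.0
print(f"w={w:.3f}, b={b:.3f}, p(1.0)={p_low:.3f}, p(4.0)={p_high:.3f}")
```

Unlike a straight-line fit, the sigmoid keeps predictions between 0 and 1, which is one reason linear regression is a poor fit for binary outcomes.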
Time Series
Regression on Time
Modeling Seasonality as Deviation
Statistician's Approach: Components of a Time Series and Estimation Methods
Smoothing: Moving Average, Weighted Moving Average and Exponential Moving Average
Holt-Winters Method
Box-Jenkins and ARIMA
Case Study - Forecast gold prices using the past 30 years of data.
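The smoothing methods in the Time Series topics can be illustrated with a short sketch of a simple moving average and simple exponential smoothing. The function names, the smoothing factor alpha and the sample series are assumptions for illustration; Holt-Winters and ARIMA add trend, seasonality and autoregressive terms that are not shown here.

```python
def moving_average(series, window):
    """Simple moving average; returns one value per full window."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    smoothed = [series[0]]  # initialize with the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Illustrative series, not real price data.
print(moving_average([1, 2, 3, 4, 5], window=3))
print(exponential_smoothing([10, 20, 30], alpha=0.5))
```

A larger alpha weights recent observations more heavily, so the smoothed series reacts faster to changes but filters out less noise.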