Machine Learning for Analytics
MasterTrack® Certificate
Advance your career with graduate level skills in data analytics, data
science, statistics and machine learning.
Program overview
Duration: 5 months to complete
Total tuition: $4000, in 4 installments of $1000, or pay all at once to save $200
Location: 100% online + live session classes
There are 4 Courses in this MasterTrack®
Certificate
Courses and projects are subject to change.
Course 1 Syllabus
Statistical thinking for machine learning
This is a 6-week course that provides foundational knowledge in statistical thinking and
introduces you to thinking critically about data analytics. You’ll begin working on a case
study developed at the University of Chicago in which you’ll use a proprietary dataset to
draw real-world insights using statistical techniques.
Week 1
Introduction to the course. Working with data
In this week you will start honing a skill that you will need throughout your entire
career in machine learning: if your goal is to uncover relationships in data and build
models from data, then you need to be able to understand and prepare the data that
you want to use in the first place.

Learning objectives:
● Explore the contents of unfamiliar dataframes in Python
● Subset dataframes so that you manipulate only the variables of interest
● Identify different data types
● Create new vectors in a dataframe by combining existing ones
● Capture new data for analysis
Week 2
Introduction to Probability and Distributions
During this second week, you start learning techniques that will help you understand
your data, and in particular the probability of getting certain values in your data.

When you start creating models using statistics and machine learning, you'll use these
very same techniques to tell whether the variables that your models are using have a
statistically significant relationship with what you're trying to predict.

Learning objectives:
● Recognize basic distributions, including normal and binomial distributions
● Calculate basic probabilities
● Calculate the probability density functions and cumulative distribution functions for
values on binomial and normal distributions
● Explain what the central limit theorem is and what its implications are
● Simulate common distributions using Python
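To give a taste of the kind of work this week involves, here is a minimal sketch (illustrative only, not course material) that simulates normal and binomial draws and demonstrates the central limit theorem using only Python's standard library:

```python
import random
import statistics

random.seed(42)  # reproducible draws

# Simulate a normal distribution: N(mean=10, sd=2)
normal_draws = [random.gauss(10, 2) for _ in range(10_000)]

# Simulate a binomial distribution (20 trials, p = 0.3)
# by summing Bernoulli outcomes
binomial_draws = [
    sum(1 for _ in range(20) if random.random() < 0.3)
    for _ in range(10_000)
]

# Central limit theorem in action: means of many uniform
# samples cluster tightly around the population mean (0.5)
sample_means = [
    statistics.mean(random.random() for _ in range(30))
    for _ in range(2_000)
]

print(round(statistics.mean(normal_draws)))    # close to 10
print(round(statistics.mean(binomial_draws)))  # close to 20 * 0.3 = 6
print(round(statistics.mean(sample_means), 1)) # close to 0.5
```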
Week 3
Descriptive analytics
Understanding the data that you’re dealing with is crucial to building models. This
week shows you how to calculate measurements of central tendency, describe
qualitatively and quantitatively the shape of your data, and identify and address
outliers.

Learning objectives:
● Explain the principles and uses of different measurements of location and dispersion
● Calculate measurements of location and dispersion for a given dataset
● Visualize your data for the purpose of understanding and describing it
● Read a visualization and interpret the underlying data
● Use measures of dispersion to identify possible outliers in your data
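As an illustrative sketch (not course material), measures of location and dispersion can be computed with Python's standard library, and the common 1.5 * IQR rule used to flag possible outliers:

```python
import statistics

data = [12, 14, 14, 15, 16, 17, 18, 19, 21, 95]  # 95 looks suspicious

# Measures of central tendency and dispersion
mean = statistics.mean(data)
median = statistics.median(data)
stdev = statistics.stdev(data)

# Quartiles via statistics.quantiles (Python 3.8+)
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Flag values outside 1.5 * IQR of the quartiles as possible outliers
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < lower or x > upper]

print(median)    # 16.5
print(outliers)  # [95]
```

Note how the median (16.5) is barely affected by the extreme value, while the mean is pulled well above it; that gap is itself a useful signal about the shape of the data.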
Week 4
Statistical Tests and Causations
This module shows you how to test the relationship between variables and say, with a
certain level of confidence, that there is a statistically significant relationship.

You will be able to identify which property values actually matter and which will wind
up just being noise in your model.

Learning objectives:
● Explain the assumptions and methods of hypothesis testing
● Conduct t-tests, z-tests, ANOVA, and chi-squared tests to identify possible
statistical significance
● Explain basic requirements for making causal claims
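As an illustration of the hypothesis-testing workflow (the sample data here are made up), a one-sample z-test can be computed from scratch, with the two-sided p-value obtained from the standard normal distribution via `math.erfc`:

```python
import math
import statistics

def z_test(sample, pop_mean):
    """One-sample z-test: is the sample mean significantly
    different from pop_mean? Returns (z, two-sided p-value).
    Uses the sample standard deviation, so it is an
    approximation appropriate for larger samples."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    z = (statistics.mean(sample) - pop_mean) / se
    # Two-sided p-value: P(|Z| > z) for a standard normal
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical assessed values (in $1000s) vs. a claimed mean of 200
sample = [212, 205, 198, 220, 215, 209, 201, 218, 207, 213,
          199, 210, 216, 204, 211, 208, 214, 206, 217, 202]
z, p = z_test(sample, 200)
print(z > 0, p < 0.05)  # True True: the mean is significantly above 200
```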
Week 5
Bivariate OLS
During this week, you learn your first modeling technique, bivariate OLS, which you
can use to make simple linear regressions--if certain conditions are met.

This module emphasizes the requirements that need to be met for you to use this
modeling technique, and the measurements that you can use to determine the quality
of your model.

Learning objectives:
● Explain the requirements for your estimator to be unbiased
● Create a linear regression using a single dependent and independent variable
● Calculate the p-values of your estimators
● Measure the goodness of fit of your regression
● Explain the principles, assumptions, and calculations of OLS
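The closed-form bivariate OLS estimator that this week builds toward fits in a few lines of Python. This sketch uses made-up numbers purely for illustration:

```python
def bivariate_ols(x, y):
    """Fit y = a + b*x by ordinary least squares:
    b = cov(x, y) / var(x); a = mean(y) - b * mean(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    b = cov / var
    a = my - b * mx
    return a, b

# Hypothetical example: square footage (100s) vs. price ($1000s)
sqft  = [10, 12, 15, 18, 20, 25]
price = [150, 172, 205, 238, 260, 315]
a, b = bivariate_ols(sqft, price)
print(round(a, 1), round(b, 1))  # 40.0 11.0, i.e. price = 40 + 11 * sqft
```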
Week 6
Final Project: Critical Analytics and Linear Modeling
Throughout, you will work on a proprietary dataset developed from public and private data about
property values in Chicago, including data from the Cook County Assessor’s office.
As you will learn during this course, there are high stakes involved in predicting property values in
Cook County. Billions are generated from property taxes each year, but this has not always been
done in the most accurate and equitable way. In the first project, you will use the skills and tools
that you have developed to create a model that predicts home values in Cook County.
Course 2 Syllabus
Advanced Statistical Thinking for Machine Learning
In this 6-week course, you will learn to work with more sophisticated datasets and interpret
them using more advanced statistical techniques, such as multivariate OLS, transforming
variables, and advanced binary classification.
Week 1
Multivariate OLS
This module introduces you to the course and shows you how to create multivariate
linear regressions.

You learn the assumptions that go into creating them, the role of omitted variable
bias, how to specify them, and how to measure their goodness of fit and conduct
hypothesis testing.

Learning objectives:
● Describe the assumptions of multivariate OLS
● Specify multivariate regressions
● Explain the role omitted variable bias may play in your model
● Measure the goodness of fit of your multivariate regression
● Conduct single and joint hypothesis tests of the variables in your model
Week 2
Variable transformation
This module focuses on variable transformations. It shows you how to create and
interpret OLS regressions that incorporate curvature.

Learning objectives:
● Describe the effects of affine transformations on variables in your model
● Create curvilinear regressions using log and polynomial transformations of variables
● Interpret the effects of log and polynomial variables in your model
● Properly incorporate dummy variables into your model
● Create interactions in your model
Week 3
Binary Classification
This week turns from regression to classification. You will learn to create binary
classifiers, explore three different types of classifier (LPM, probit, and logit), and
learn how to interpret them.

Learning objectives:
● Describe the assumptions and principles of linear classification using LPMs,
probit, and logit
● Create classifiers using LPM, probit, and logit
● Interpret the PEA and APE of your classifier
● Critically analyze uses of binary classifiers
Week 4
Missing data and Outliers
In week four you look at different kinds of missing data and outliers.

Most real datasets that you use are going to be incomplete or have less data than you
would want, have noise against the signal, or have unusual values. You will learn to
manage these and take into account their influence on your models.

Learning objectives:
● Identify and manage outliers
● Explain the effects of endogenous missingness
● Explain the effects of exogenous missingness
● Calculate standard errors in the presence of heteroskedasticity
Week 5
Natural experiments
This week introduces you to experiment design and natural experiments,
difference-in-differences (DiD) designs in particular.

You see how to measure the effect of some treatment even in the absence of an RCT,
and where it can be beneficial to model these experiments as regressions.

Learning objectives:
● Design natural experiments using DiD
● Critique studies that use natural experiments to investigate the effects of treatment
● Evaluate the effects of treatment in a DiD study using regression
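The core difference-in-differences arithmetic is simple enough to sketch directly (the numbers here are hypothetical, not course data):

```python
# Mean outcomes for treated and control groups, before and after
# a hypothetical policy change
treated_pre, treated_post = 50.0, 62.0
control_pre, control_post = 48.0, 53.0

# DiD estimate: the treated group's change minus the control
# group's change, which nets out the shared time trend
did = (treated_post - treated_pre) - (control_post - control_pre)
print(did)  # 7.0: treatment raised the outcome by ~7 beyond the trend
```

In practice the same estimate comes from a regression with treatment, period, and interaction dummies, which is what makes standard errors and additional controls available.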
Week 6
Final Project: Advanced modeling and Classification
Building on your work with property value prediction you will now receive additional data on
properties from multiple sources, utilize more complex data preparation techniques, and apply
more advanced regression procedures.
This time, you will face a new challenge: how to model the odds of individuals appealing the
assessed value of their homes. This will entail trying to understand whether there are statistically
significant relationships between appeals and factors like education, wealth, and race. This
challenge has been one of the most hotly contested political issues in Chicago in recent years.
Course 3 Syllabus
Introduction to Machine Learning
This course will introduce you to Machine Learning as a discipline and build on the
statistical techniques that you have learned. You’ll also understand machine learning as a
discipline with its own mode of thinking where practitioners train models to create
predictions that are used in a growing number of analytical applications.
This and the next Machine Learning course take the same approach: teaching you how an
algorithm works under the hood--often by showing you how to code it from
scratch--and then introducing you to tools that apply it effectively and efficiently.
Week 1
Learning and machine learning
This week introduces you to the principles and process of machine learning. You learn
about the different broad categories of machine learning, the machine-learning
pipeline, and how to evaluate models on new data.

You also learn how to create a k-nearest neighbors model more or less from scratch
and how to derive the closed-form solution to OLS.

Learning objectives:
● Distinguish between supervised and unsupervised machine learning
● Divide data into train and test sets
● Create and evaluate a k-nearest neighbors model
● Describe the roles of loss and risk in training models
● Derive and apply the closed-form solution to OLS
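In the spirit of the course's code-it-from-scratch approach, a minimal k-nearest neighbors classifier might look like the following sketch (illustrative only, not the course's implementation):

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among the k closest
    training points (Euclidean distance)."""
    dists = sorted(
        (math.dist(x, query), label)
        for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy data: two well-separated clusters
train_X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_y = ["low", "low", "low", "high", "high", "high"]
print(knn_predict(train_X, train_y, (1.5, 1.5)))  # low
print(knn_predict(train_X, train_y, (8.5, 8.5)))  # high
```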
Week 2
Statistical Approaches to Regression
In this module, you will learn how to use gradient descent to minimize loss functions
for models that lack closed-form solutions.

You also study the bias-variance tradeoff and explore some techniques for using
cross-validation to get the best sense of how your model will perform on new data.

Learning objectives:
● Use gradient descent to find the parameters for models without closed-form
solutions
● Use cross-validation to evaluate your models
● Explain the bias-variance tradeoff
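As an illustrative sketch of gradient descent (not the course's implementation), the following fits a simple linear model by repeatedly stepping down the gradient of the mean squared error instead of using the closed-form solution:

```python
def gradient_descent(x, y, lr=0.01, steps=5000):
    """Fit y = a + b*x by minimizing mean squared error with
    gradient descent rather than the closed-form OLS solution."""
    a, b = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        # Gradients of MSE with respect to a and b
        grad_a = sum(2 * ((a + b * xi) - yi) for xi, yi in zip(x, y)) / n
        grad_b = sum(2 * ((a + b * xi) - yi) * xi for xi, yi in zip(x, y)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]  # exactly y = 1 + 2x
a, b = gradient_descent(x, y)
print(round(a, 2), round(b, 2))  # 1.0 2.0
```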
Week 3
Regularization and Classification
This module is split into two parts. In the first, you will learn about two types of
regularization, why they may improve your models, and how to implement them in
linear regressions.

In the second, you will return to logistic regression. You will learn how to derive its
parameters using a new kind of gradient descent, and then explore using logistic
regression to create a classifier that uses natural language from Tweets to predict
whether a Twitter account is a bot or a legitimate user.

Learning objectives:
● Describe the differences and uses of L1 and L2 regularization
● Explain how regularization helps prevent overfitting
● Create, train, and evaluate logistic models
● Implement regularization methods in linear regressions
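To give a flavor of L2 (ridge) regularization, here is a minimal sketch with made-up numbers: for centered univariate data, the ridge penalty simply adds to the denominator of the OLS slope, shrinking the estimate toward zero.

```python
def ridge_slope(x, y, lam):
    """Slope of a regression on centered data with an L2 penalty:
    b = sum(x*y) / (sum(x^2) + lam).
    lam = 0 recovers plain OLS; larger lam shrinks b toward 0."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    xc = [xi - mx for xi in x]
    yc = [yi - my for yi in y]
    sxy = sum(a * b for a, b in zip(xc, yc))
    sxx = sum(a * a for a in xc)
    return sxy / (sxx + lam)

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]  # slope 2 without regularization
print(ridge_slope(x, y, 0))   # 2.0
print(ridge_slope(x, y, 10))  # 1.0  (heavily shrunk)
```

That shrinkage is exactly why regularization can reduce overfitting: it trades a little bias for less sensitivity to noise in the training data.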
Week 4
Multiclass Classification and SVMs
In the first part of this module, you will learn how to use softmax for multiclass
classification and how to implement it using gradient descent. Then, you will look at a
new kind of classifier, support vector machines, and learn how to implement both
linear SVMs and kernel SVMs.

Learning objectives:
● Use the one-vs-all technique for multiclass classification
● Train a linear SVM using stochastic gradient descent
● Describe the differences between linear and kernel SVMs
● Train a softmax classifier using stochastic gradient descent
● Tune the hyperparameters for SVMs
Week 5
Density Estimation
In this module, you learn a new approach to solving machine-learning problems:
creating generative models.

This approach entails estimating the distribution of the data in the first place. Here,
you focus on using this technique for classification, but it can also be used for
unsupervised learning and even for generating new data from the training data.

Learning objectives:
● Distinguish between generative and discriminative methods
● Implement quadratic discriminant functions
● Implement linear discriminant functions
● Implement naive Bayes classifiers
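A naive Bayes classifier of the kind this week covers can be sketched from scratch (toy data, illustrative only): estimate per-class Gaussian parameters, then pick the class with the highest log posterior.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate per-class, per-feature means and variances,
    plus class priors, for a Gaussian naive Bayes model."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        vars_ = [
            sum((v - m) ** 2 for v in col) / n + 1e-9  # smoothed
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, vars_)
    return model

def predict_nb(model, row):
    """Pick the class with the highest log posterior, assuming
    features are independent given the class (the 'naive' part)."""
    def log_post(prior, means, vars_):
        lp = math.log(prior)
        for v, m, s2 in zip(row, means, vars_):
            lp += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return lp
    return max(model, key=lambda c: log_post(*model[c]))

# Toy data: (size, age) -> hypothetical property class
X = [(1.0, 30), (1.2, 28), (0.9, 35), (3.0, 5), (3.2, 8), (2.8, 6)]
y = ["old_small", "old_small", "old_small",
     "new_large", "new_large", "new_large"]
model = fit_gaussian_nb(X, y)
print(predict_nb(model, (1.1, 32)))  # old_small
print(predict_nb(model, (3.1, 7)))   # new_large
```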
Week 6
Final Project: Machine-Learning Models for Prediction
This project focuses on combining statistical techniques with new approaches from machine
learning.
Your goal in this project is to make predictions. You will work from a larger, enhanced dataset that
now contains new information, such as the number of stores within a certain radius of a property.
You are expected to test and fine-tune your models to optimize their performance on new data so
that you can predict whether a homeowner will appeal and estimate property values.
Course 4 Syllabus
Advanced applications
In this final course, you will learn additional machine learning techniques, including neural
networks. You will explore how to fine-tune the models that you have been creating, more
advanced ways to manipulate your data, and more sophisticated approaches, such as
ensemble methods.
Week 1
Decision Trees and Ensembles
In this module, you learn about decision trees and methods to create models that
combine the predictions of other models.

You will learn how to create a decision tree from scratch in Python, and then how to
use methods to prevent overfitting. Then, you will understand how to create
ensembles of weak learners that can perform as well as a strong learner.

Learning objectives:
● Describe the fundamental mechanics underlying decision trees
● Implement methods to prevent decision trees from overfitting
● Train ensembles of weak learners to perform as well as--or better than--strong
learners
● Distinguish among different approaches to training ensembles
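The core mechanic underlying decision trees, choosing the split that most reduces impurity, can be sketched as a single "decision stump" (toy data, illustrative only):

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_stump(X, y):
    """Find the single (feature, threshold) split minimizing the
    weighted Gini impurity of the two resulting leaves -- the
    core step a decision tree repeats recursively."""
    n = len(y)
    best = (None, None, float("inf"))
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best

# Toy data: the first feature separates the classes perfectly
X = [(2, 1), (3, 2), (4, 1), (7, 9), (6, 8), (8, 9)]
y = [0, 0, 0, 1, 1, 1]
feature, threshold, impurity = best_stump(X, y)
print(feature, threshold, impurity)  # 0 4 0.0: a perfect split
```

An ensemble like a random forest repeats this search over many bootstrapped samples and feature subsets, then averages the resulting trees' votes.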
Week 2
Neural Networks
This week introduces you to the hottest technique in machine learning today: neural
networks. You will look at the most basic and fundamental piece of a neural network,
the perceptron.

Additionally, you will look at convolutional neural networks: how they work, where
they may be most usefully implemented, and how to design their architecture.

Learning objectives:
● Describe the basic mechanics of perceptrons, multi-layered perceptrons, and
convolutional neural networks
● Identify applications for MLPs and CNNs
● Design effective architectures for MLPs and CNNs
● Implement MLPs and CNNs using Keras
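The perceptron's mistake-driven update rule can be sketched from scratch (an illustrative toy example, not course material):

```python
def train_perceptron(X, y, epochs=20, lr=0.1):
    """Train a single perceptron, the basic unit this week builds on.
    Labels are +1 / -1; weights update only when a point is
    misclassified."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for row, target in zip(X, y):
            activation = sum(wi * xi for wi, xi in zip(w, row)) + b
            pred = 1 if activation >= 0 else -1
            if pred != target:  # mistake-driven update
                w = [wi + lr * target * xi for wi, xi in zip(w, row)]
                b += lr * target
    return w, b

def predict(w, b, row):
    return 1 if sum(wi * xi for wi, xi in zip(w, row)) + b >= 0 else -1

# Linearly separable toy data (an AND-like gate)
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, -1, -1, 1]
w, b = train_perceptron(X, y)
print([predict(w, b, row) for row in X])  # [-1, -1, -1, 1]
```

Stacking layers of these units and replacing the hard threshold with differentiable activations is what turns the perceptron into a multi-layered perceptron trainable by gradient descent.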
Week 3
Unsupervised learning
We switch gears in this module from supervised learning to unsupervised learning. In
unsupervised learning, we're trying to uncover hidden patterns and relationships
within the data.

You will learn how to implement four important unsupervised-learning techniques:
K-Means clustering, spectral clustering, hierarchical clustering, and Gaussian mixture
models.

Learning objectives:
● Differentiate between supervised and unsupervised learning
● Describe the mechanics and properties of different clustering techniques
● Identify where K-Means clustering, hierarchical clustering, spectral clustering, and
GMMs may be most usefully applied
● Choose the best unsupervised modeling technique and optimize its hyperparameters
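A minimal from-scratch K-Means loop (illustrative only, not the course's implementation) alternates between assigning points to their nearest centroid and moving each centroid to the mean of its points:

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain K-Means: alternate between assigning each point to
    its nearest centroid and moving each centroid to the mean
    of its assigned points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute centroids; keep the old one if a cluster is empty
        centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Two well-separated toy clusters
points = [(1, 1), (1, 2), (2, 1), (9, 9), (9, 10), (10, 9)]
centroids, clusters = kmeans(points, 2)
sizes = sorted(len(c) for c in clusters)
print(sizes)  # [3, 3]: each cluster recovers one group
```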
Week 4
Dimensionality reduction
In this module, you’ll learn techniques for dimensionality reduction: projecting data
from high-dimensional space into low-dimensional space.

You will also learn about kernel density estimation, a technique to visualize a
variable's distribution and estimate the probability density function for different
values.

Learning objectives:
● Use dimensionality reduction for feature selection
● Implement PCA to reduce the number of features in your data
● Visualize high-dimensional data in two-dimensional space using t-SNE
● Implement kernel density estimation to estimate probability density functions
Week 5
The Practice of Machine Learning
In this module, you're introduced to some tools that will better equip you to address
machine-learning tasks. You’ll learn some techniques to assist you with feature
selection, as well as how to identify and address imbalanced or biased datasets.

You are also introduced to semi-supervised learning, which combines techniques from
unsupervised and supervised learning.

Learning objectives:
● Apply hard negative mining to address imbalanced training data
● Implement wrapper and filter methods for feature selection
● Implement self-training for semi-supervised learning
● Describe the basic approaches of semi-supervised learning
Week 6
Final Project: Beat the Assessor
In this project you will have the opportunity to apply the advanced modeling techniques that you
have learned during the program to the same dataset that the data scientists at the Cook County
Assessor use to create their models. You will be given a technique to measure how effective your
predictions are, and you will be tasked with creating a modeling technique whose performance, by
that metric, exceeds the Cook County Assessor’s.
Program features
Live Sessions
You will be able to attend live sessions with instructors and get a chance to collaborate
with a cohort of peers to help build your professional network. Live sessions are
optional, and recordings are made available if you are unable to attend.
Hands-on projects
You will have the opportunity to work on industry-relevant projects to build your portfolio
and enhance your experience.
Instructors
Austin L. Wright
Assistant professor at the Harris School for Public Policy.
Greg Shakhnarovich
Associate Professor at the Toyota Technological Institute
at Chicago and Associate Professor Part-Time in the
Department of Computer Science.
Gregory Bernstein
Instructor, Master of Science in Analytics and Data
Scientist, Kinexon Sports & Media
Shree Bharadwaj
Instructor, Master of Science in Analytics.
Earn credit towards a Master of Science in Analytics degree
Upon completing your MasterTrack® program you can apply to the Master of Science in
Analytics from the University of Chicago. If you are admitted to the part-time Master's
program, your MasterTrack® Certificate will count towards your degree.
About the University of Chicago
The University of Chicago is an urban research university that has driven new ways of
thinking since 1890. Its commitment to free and open inquiry draws inspired scholars to its
global campuses, where ideas are born that challenge and change the world. The University
of Chicago is ranked #3 in the United States, and that high quality and rigor is a trademark
of its MScA program. In all it does, it is driven to dig deeper, push further, and ask bigger
questions—and to leverage its knowledge to enrich all human life.
You can learn more about MasterTrack® Certificates here.