Python for Data Science
Designing and Writing High-Performing Python Code
The demand for specialists who can manipulate and analyze large
volumes of data is growing rapidly worldwide. As the field
progresses, it is crucial for practitioners to understand how to
design and write high-performing Python code to perform the
tasks required by organizations across industries.
Python for Data Science
In a world where data is considered a commodity, data science
practitioners need a deeper understanding of how to design and write
Python code, and of how to apply that code in creative and relevant
ways to solve real-world problems.
Program Details
Tuition: USD $2,200
Program Format: Remote learning with live, interactive sessions
Duration: Eight weeks
Language: English
Instructor: Brian Craft, BA (Cum Laude), MSc; Lecturer for Python for
Analytics, University of Chicago
About the Program
Our eight-week Python for Data Science program introduces the basic
concepts of Python as a programming language. This highly technical
program is project-based at its core: you will work through many
practical examples before creating and running your own Python
projects.
You will learn to:
• Understand the Python language.
• Perform advanced data analysis and manipulation.
• Write production-level Python code.
• Train and evaluate machine learning models.
• Design and optimize Python code for performance and speed.
• Write Python code to efficiently process large data sets.
• Prepare machine learning models for production use.
Who Should Attend?
This program is designed for technical professionals and others
who:
• Have a rudimentary knowledge of Python and machine learning.
• Are eager to learn about data science but have not found the
proper structure.
• Are looking to transition to Python.
• Are business intelligence analysts with a strong foundation in
the theory of data analysis and manipulation, but have limited
Python exposure.
• Have a quantitative mindset but lack a technical toolkit.
• Are data analysts working predominantly in Excel.
Connect with Expert Instructors
Courageous thinkers and passionate teachers, our instructors are an
active community of scholars. Propelled by rigorous debate and
cross-disciplinary collaboration, they produce ideas that matter and
enrich human life.
Meet the Instructor
Brian Craft is a seasoned data scientist with years of
industry experience. In his role as a data scientist at
Conagra Brands, he focuses on scaling their machine
learning capabilities and develops models to understand
consumer purchase behavior and identify emerging
ingredient and flavor trends.
Brian Craft, BA (Cum Laude), MSc; Lecturer for Python
for Analytics, University of Chicago
Why the University of Chicago?
Becoming a member of the University of Chicago community means
gaining access to world-class instructors and a cohort of curious,
diverse individuals.
Through a firm grounding in core principles and a rigorous approach to
problem-solving, our teaching method—the Chicago Approach—will
give you the tools you need to make sense of complex data and turn
ideas into impact. Program participants will receive a certificate of
completion and join a global network of thought leaders.
Approach to Remote Learning
Our remote learning programs are crafted to support your specific
professional development goals. Programs combine e-learning with
live, interactive sessions to strengthen your skill set while maximizing
your time. We couple academic theory and business knowledge with
practical, real-world application.
Through remote learning sessions, you will have an opportunity to
interact with University of Chicago instructors and your peers.
Program Outline
The Python for Data Science program covers the following topics:
Module 1: Foundational Python Functionality
• Variable Declaration, Mathematical and Logical Operations,
Datatypes, and Containers
• Conditionals (If, Elif, Else)
• Iteration (For and While Loops)
• Comprehensions
• Exceptions and Error Handling
Module 2: Classes and User-Defined Functions
• User-Defined Functions (UDFs)
• Enhancing UDFs
• Introduction to Classes
• Developing a Linear Regression Class
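As a rough illustration of what the last Module 2 topic could involve, the sketch below fits a one-feature linear regression by ordinary least squares using only the standard library. It is not taken from the course materials: the class name SimpleLinearRegression, its fit and predict methods, and the sample data are all assumptions made for this example.

```python
from statistics import mean


class SimpleLinearRegression:
    """Hypothetical example class: y = intercept + slope * x, fit by least squares."""

    def __init__(self):
        self.slope = None
        self.intercept = None

    def fit(self, x, y):
        if len(x) != len(y) or len(x) < 2:
            raise ValueError("x and y must have equal length of at least 2")
        x_bar, y_bar = mean(x), mean(y)
        # Sum of squared deviations of x, and of cross-deviations of x and y.
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        if sxx == 0:
            raise ValueError("x must contain at least two distinct values")
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        self.slope = sxy / sxx
        self.intercept = y_bar - self.slope * x_bar
        return self  # returning self allows fit(...).predict(...) chaining

    def predict(self, x):
        if self.slope is None:
            raise RuntimeError("call fit() before predict()")
        return [self.intercept + self.slope * xi for xi in x]


# Made-up data lying exactly on y = 2x + 1.
model = SimpleLinearRegression().fit([1, 2, 3, 4], [3, 5, 7, 9])
print(model.slope, model.intercept)
print(model.predict([5]))
```

Keeping the fitted slope and intercept as attributes means a trained model can be reused for many predictions, which is one common reason to use a class here rather than a single function.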
Module 3: Basic Data Analysis and Manipulation
• Loading External Data Using Pandas
• Manipulating DataFrames and NumPy Arrays
• Pandas and NumPy in Use
• Joining and Concatenating DataFrames Using Pandas
• Visualizing Data Using Seaborn
Module 4: Advanced Data Analysis and Manipulation
• Broadcasting
• Matrix and Elementwise Operations for Efficient Data Manipulation
• Advanced Pandas Concepts for DataFrame Manipulation
• Distance Metrics, Distance Matrices and Distributions Using SciPy
Module 5: Training and Evaluation of Machine Learning
(ML) Models
• Data Normalization Using Sklearn
• Feature Extraction Using Sklearn
• Training Prediction, Classification and Unsupervised ML Models
with Sklearn
• Model Evaluation for Prediction, Classification and Unsupervised
Learning
• Cross Validation
Module 6: Parallelizing Model Training and Programs Using
Multiprocessing
• Parallelization Overview
• The n_jobs Parameter and a Random Forest in Sklearn
• Parallelized Grid Search in Sklearn
• Multiprocessing in Python
• Developing a Random Forest using Sklearn Decision Tree and
Multiprocessing Module
Module 7: Parallelizing Web Scrapers Using Multithreading
• Multithreading vs. Multiprocessing
• Multithreading in Python
• Web Scraping with the Requests Module
• Parallelized Web Scraping Using Multithreading
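To give a sense of why multithreading suits web scraping, the sketch below compares sequential and threaded execution of an I/O-bound task. It is an illustration under stated assumptions, not course material: the fetch function is a hypothetical stand-in that sleeps instead of making a real network request, the URLs are placeholders that are never contacted, and the standard library's concurrent.futures is used here as one of several ways to run threads in Python.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Placeholder URLs used only as labels; nothing is downloaded.
URLS = [f"https://example.com/page/{i}" for i in range(8)]


def fetch(url):
    """Hypothetical stand-in for a network request."""
    time.sleep(0.1)  # simulates time spent waiting on the network
    return url, len(url)  # fake "result" so the two runs can be compared


def scrape_sequential(urls):
    # One request at a time: total time is roughly the sum of all waits.
    return [fetch(u) for u in urls]


def scrape_threaded(urls, max_workers=8):
    # Threads wait on I/O concurrently; pool.map keeps results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


start = time.perf_counter()
sequential = scrape_sequential(URLS)
t_seq = time.perf_counter() - start

start = time.perf_counter()
threaded = scrape_threaded(URLS)
t_thr = time.perf_counter() - start

print(f"sequential: {t_seq:.2f}s, threaded: {t_thr:.2f}s")
```

Because each simulated request spends its time waiting rather than computing, the threaded version finishes in roughly the time of a single wait. CPU-bound work generally does not benefit from threads in the same way in standard CPython, which is where multiprocessing tends to be used instead.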
Module 8: Model Deployment
• Training and Saving Sklearn Models
• What APIs Are and How to Create a Basic Flask API
• Developing a Flask API to Make Predictions Using a Saved
ML Model
• Batch Scoring Using Multiprocessing
Program outline may be subject to change based on academic adjustments.
Learn more
To schedule an appointment with admissions, contact
admissions@online.professional.uchicago.edu.
Visit online.professional.uchicago.edu to learn more.