Data Analyst Nanodegree Syllabus
Discover Insights from Data with Python, R, SQL, and Tableau
Before You Start
Prerequisites: To succeed in this program, we recommend that you have experience working with data in
SQL and/or a spreadsheet tool like Microsoft Excel. You should also have a good understanding of
descriptive statistics, including how to calculate and interpret measures of center (mean, median, mode)
and measures of spread (standard deviation, 5-number summary), and how to build bar charts, histograms,
boxplots, and scatterplots.
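As a quick self-check on the descriptive statistics listed above, here is a minimal sketch using only Python's standard library. The sample values are invented for illustration, and the quartile convention (the "inclusive" method) is one assumption among several in common use; spreadsheets and other tools may report slightly different quartiles.

```python
import statistics

# Hypothetical sample, invented for illustration only.
data = [2, 4, 4, 5, 7, 9, 11]

# Measures of center
mean = statistics.mean(data)      # 42 / 7 = 6
median = statistics.median(data)  # middle value of the sorted data = 5
mode = statistics.mode(data)      # most frequent value = 4

# Measure of spread: sample standard deviation (divides by n - 1)
stdev = statistics.stdev(data)    # sqrt(60 / 6) = sqrt(10)

# Five-number summary: minimum, Q1, median, Q3, maximum.
# Quartile conventions differ between tools; "inclusive" is assumed here.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
five_number_summary = (min(data), q1, q2, q3, max(data))

print("mean:", mean, "median:", median, "mode:", mode)
print("standard deviation:", round(stdev, 3))
print("five-number summary:", five_number_summary)
```

If each of these quantities is familiar and you can say what it tells you about the data, you likely meet the statistics prerequisite.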
Educational Objectives: Learn to organize data, uncover patterns and insights, draw meaningful
conclusions, and clearly communicate critical findings. Learn to use Python, R, SQL, and Tableau. Gain all the
skills necessary to get a job as a data analyst.
Program Design
Length of Program*: The program is divided into two terms of three months each (approx. 13 weeks). We
expect students to work 10 hours/week on average. Estimated time commitment is 130 hours per term.
Textbooks required: None
Instructional Tools Available: Video lectures, personalized project reviews, live chat help, dedicated mentor
*The length is an estimate of the total hours the average student may take to complete all required
coursework, including lecture and project time. Actual hours may vary.
TERM 1: DATA ANALYSIS WITH PYTHON AND SQL
Intro Project: Explore Weather Trends (5 hrs)
This project will introduce you to SQL and show you how to download data from a database. You’ll analyze
local and global temperature data and compare the temperature trends where you live to overall global
temperature trends.
Project: Explore US Bikeshare Data (40 hrs)
You will use Python to answer interesting questions about bikeshare trip data collected from three US cities.
You will write code to collect the data, compute descriptive statistics, and create an interactive experience in
the terminal that presents the answers to your questions.
Supporting Lesson Content: Introduction to Python Programming
Lesson Title Learning Outcomes
WHY PYTHON PROGRAMMING
➔ Learn why we program
➔ Prepare for the course ahead with a detailed topic overview
➔ Understand how programming in Python is unique
DATA TYPES AND OPERATORS
➔ Understand how data types and operators are the building blocks of programming in Python
➔ Use the following data types: integers, floats, booleans, strings, lists, tuples, sets, dictionaries
➔ Use the following operators: arithmetic, assignment, comparison, logical, membership, identity
CONTROL FLOW
➔ Implement decision making in your code with conditionals
➔ Repeat code with for and while loops
➔ Exit a loop with break and skip an iteration of a loop with continue
➔ Use helpful built-in functions like zip and enumerate
➔ Construct lists in a natural way with list comprehensions
FUNCTIONS
➔ Write your own functions to encapsulate a series of commands
➔ Understand variable scope, i.e., which parts of a program variables can be referenced from
➔ Make functions easier to use with proper documentation
➔ Use lambda expressions, iterators, and generators
SCRIPTING
➔ Write and run scripts locally on your computer
➔ Work with raw input from users
➔ Read and write files, handle errors, and import local scripts
➔ Use modules from the Python standard library and from third-party libraries
➔ Use online resources to help solve problems
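To give a feel for several of the outcomes listed above (zip, enumerate, break and continue, list comprehensions, lambda expressions, documented functions, and generators), here is a minimal sketch. It is not taken from the course materials; the city names and trip counts are hypothetical placeholders.

```python
def running_total(values):
    """Yield the cumulative sum of `values`, one item at a time.

    Because this function uses `yield`, calling it returns a generator.
    """
    total = 0  # local variable: visible only inside this function
    for value in values:
        total += value
        yield total


# Hypothetical placeholder data, invented for illustration.
cities = ["avon", "brookfield", "camden"]
trip_counts = [3, 0, 5]

# zip pairs each city with its count; the list comprehension keeps
# only the cities that recorded at least one trip.
active = [city.title() for city, count in zip(cities, trip_counts) if count > 0]

# enumerate supplies the index; continue skips short names and
# break stops at the first name with 8 or more characters.
first_long_name = None
for index, city in enumerate(cities):
    if len(city) < 8:
        continue
    first_long_name = (index, city)
    break

# A lambda expression used as a sort key (highest count first).
by_count = sorted(zip(cities, trip_counts), key=lambda pair: pair[1], reverse=True)

# Consume the generator into a list.
totals = list(running_total(trip_counts))

print(active)           # ['Avon', 'Camden']
print(first_long_name)  # (1, 'brookfield')
print(by_count)         # [('camden', 5), ('avon', 3), ('brookfield', 0)]
print(totals)           # [3, 3, 8]
```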
Project: Investigate a Dataset (40 hrs)
In this project, you’ll choose one of Udacity's curated datasets and investigate it using NumPy and pandas.
You’ll complete the entire data analysis process, starting by posing a question and finishing by sharing your
findings.
Supporting Lesson Content: Introduction to Data Analysis
Lesson Title Learning Outcomes
Data Analysis in Python
DATA ANALYSIS PROCESS
➔ Learn about the key steps of the data analysis process
➔ Investigate multiple datasets using Python and pandas
PANDAS AND NUMPY: CASE STUDY 1
➔ Perform the entire data analysis process on a dataset
➔ Learn to use NumPy and pandas to wrangle, explore, analyze, and visualize data
PANDAS AND NUMPY: CASE STUDY 2
➔ Perform the entire data analysis process on a dataset
➔ Learn more about using NumPy and pandas to wrangle, explore, analyze, and visualize data
Introduction to SQL
Basic SQL
➔ Write common SQL commands including SELECT, FROM, and WHERE, as well as corresponding logical operators
SQL Joins
➔ Write JOINs in SQL to combine data from multiple sources and answer more complex business questions
SQL Aggregations
➔ Write common aggregations in SQL including COUNT, SUM, MIN, and MAX
➔ Write CASE and DATE functions, as well as work with NULLs
Advanced SQL Queries
➔ Edit a database using CREATE TABLE, INSERT INTO, UPDATE, and other statements
➔ Use window functions and subqueries to add steps to a query
➔ Use documentation to learn new functions and complete complex tasks
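The SQL outcomes above can be tried out without installing a database server. The sketch below is an illustration, not course material: it assumes SQLite through Python's built-in sqlite3 module, and the accounts and orders tables and their contents are hypothetical. It combines CREATE TABLE, INSERT INTO, SELECT, WHERE, a JOIN, the COUNT and SUM aggregations, and a CASE expression.

```python
import sqlite3

# An in-memory SQLite database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")

# Hypothetical tables and rows, invented for illustration.
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER, total REAL);
INSERT INTO accounts VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 30.0), (3, 2, 45.0);
""")

# JOIN the two tables, aggregate per account, and label each account
# with a CASE expression based on its total revenue.
query = """
SELECT a.name,
       COUNT(*) AS order_count,
       SUM(o.total) AS revenue,
       CASE WHEN SUM(o.total) >= 100 THEN 'large' ELSE 'small' END AS tier
FROM accounts AS a
JOIN orders AS o ON o.account_id = a.id
WHERE o.total > 0
GROUP BY a.name
ORDER BY revenue DESC;
"""
rows = conn.execute(query).fetchall()
conn.close()

for row in rows:
    print(row)
# ('Acme', 2, 150.0, 'large')
# ('Globex', 1, 45.0, 'small')
```

SQL dialects differ in details such as date functions, so queries written for one database may need small changes in another.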
Project: Analyze Experiment Results (45 hrs)
In this project, you will be given a dataset collected from an experiment. You’ll use statistical
techniques to answer questions about the data and present your conclusions and recommendations
in a report.
Supporting Lesson Content: Practical Statistics
Lesson Title Learning Outcomes
STANDARDIZING
➔ Convert distributions into the standard normal distribution using the Z-score
➔ Compute proportions using standardized distributions
NORMAL DISTRIBUTION
➔ Use normal distributions to compute probabilities
➔ Use the Z-table to look up the proportions of observations above, below, or in between values
SAMPLING DISTRIBUTIONS
➔ Apply the concepts of probability and normalization to sample data sets
ESTIMATION
➔ Estimate population parameters from sample statistics using confidence intervals
HYPOTHESIS TESTING
➔ Use critical values to make decisions on whether or not a treatment has changed the value of a population parameter
T-TESTS
➔ Test the effect of a treatment or compare the difference in means for two groups when we have small sample sizes
REGRESSION
➔ Build a linear regression model to understand the relationship between independent and dependent variables
➔ Use linear regression results to make a prediction
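Three of the ideas above (standardizing with a Z-score, estimating a mean with a confidence interval, and fitting a simple linear regression) can be sketched in a few lines of standard-library Python. All numbers below are invented for illustration. The interval assumes the population standard deviation is known, and the regression is an ordinary least-squares fit computed by hand; the lessons may present these topics differently.

```python
from math import sqrt
from statistics import NormalDist, mean

# --- Standardizing: Z-score and the proportion below a value ---
mu, sigma = 100.0, 15.0                 # hypothetical population mean and SD
x = 130.0
z = (x - mu) / sigma                    # (130 - 100) / 15 = 2.0
proportion_below = NormalDist().cdf(z)  # about 0.977, as a Z-table would show

# --- Estimation: 95% confidence interval for a mean (sigma assumed known) ---
sample = [98.0, 102.0, 101.0, 97.0, 103.0, 99.0, 100.0, 104.0, 96.0]
sample_mean = mean(sample)                    # 900 / 9 = 100
standard_error = sigma / sqrt(len(sample))    # 15 / 3 = 5
z_star = NormalDist().inv_cdf(0.975)          # about 1.96
ci = (sample_mean - z_star * standard_error,
      sample_mean + z_star * standard_error)  # about (90.2, 109.8)

# --- Regression: least-squares line through made-up points ---
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
x_bar, y_bar = mean(xs), mean(ys)
slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
         / sum((a - x_bar) ** 2 for a in xs))  # 10 / 5 = 2
intercept = y_bar - slope * x_bar              # 6 - 2 * 2.5 = 1
prediction = intercept + slope * 5.0           # predicted y at x = 5 is 11

print(round(z, 2), round(proportion_below, 4))
print(tuple(round(bound, 1) for bound in ci))
print(slope, intercept, prediction)
```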
TERM 2: ADVANCED DATA ANALYSIS
Intro Project: Test a Perceptual Phenomenon (10 hrs)
In this project, you’ll use descriptive statistics and a statistical test to analyze the Stroop effect, a classic
result of experimental psychology. Communicate your understanding of the data and use statistical
inference to draw a conclusion based on the results.
Supporting Lesson Content: Practical Statistics
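The project description does not say which statistical test to use. One plausible choice, assumed here purely for illustration, is a paired t-test on each participant's times under the two conditions. The times below are made up, and the critical value is the standard two-tailed t-table entry for 4 degrees of freedom at the 0.05 level.

```python
from math import sqrt
from statistics import mean, stdev

# Made-up completion times in seconds for five hypothetical participants.
congruent = [12.0, 14.0, 11.0, 15.0, 13.0]
incongruent = [20.0, 21.0, 17.0, 24.0, 18.0]

# Paired design: work with each participant's difference.
differences = [b - a for a, b in zip(congruent, incongruent)]  # [8, 7, 6, 9, 5]
n = len(differences)

mean_diff = mean(differences)               # 7.0
std_error = stdev(differences) / sqrt(n)    # sqrt(2.5) / sqrt(5), about 0.707
t_statistic = mean_diff / std_error         # about 9.9
degrees_of_freedom = n - 1                  # 4

# Two-tailed critical value for alpha = 0.05 and 4 degrees of freedom,
# taken from a standard t-table.
t_critical = 2.776
reject_null = abs(t_statistic) > t_critical

print(round(t_statistic, 2), degrees_of_freedom, reject_null)
```

With these invented numbers the t statistic far exceeds the critical value, so the null hypothesis of no difference between conditions would be rejected.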
Project: Wrangle and Analyze Data (50 hrs)
Real-world data rarely comes clean. Using Python, you'll gather data from a variety of sources, assess its
quality and tidiness, then clean it. You'll document your wrangling efforts in a Jupyter Notebook, plus
showcase them through analyses and visualizations using Python and SQL.
Supporting Lesson Content: Data Wrangling
Lesson Title Learning Outcomes
INTRO TO DATA WRANGLING
➔ Identify each step of the data wrangling process (gathering, assessing, and cleaning)
➔ Wrangle a CSV file downloaded from Kaggle using fundamental gathering, assessing, and cleaning code
GATHERING DATA
➔ Gather data from multiple sources, including gathering files, programmatically downloading files, web-scraping data, and accessing data from APIs
➔ Import data of various file formats into pandas, including flat files (e.g. TSV), HTML files, TXT files, and JSON files
➔ Store gathered data in a PostgreSQL database
ASSESSING DATA
➔ Assess data visually and programmatically using pandas
➔ Distinguish between dirty data (content or “quality” issues) and messy data (structural or “tidiness” issues)
➔ Identify data quality issues and categorize them using metrics: validity, accuracy, completeness, consistency, and uniformity
CLEANING DATA
➔ Identify each step of the data cleaning process (defining, coding, and testing)
➔ Clean data using Python and pandas
➔ Test cleaning code visually and programmatically using Python
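The gather, assess, and clean steps above, including the define, code, and test cycle for cleaning, can be illustrated on a tiny scale. The lessons use pandas; the sketch below is an assumption-laden stand-in that uses only the standard library so it runs anywhere, and the CSV content and column names are invented.

```python
import csv
import io

# Gather: made-up CSV text stands in for a downloaded file.
raw = io.StringIO(
    "name,height_cm\n"
    "Ann,170\n"
    "ann,170\n"
    "Bob,\n"
    "Cy,1.80 m\n"
)
rows = list(csv.DictReader(raw))

# Assess (programmatically): look for completeness and validity issues.
missing = [r for r in rows if r["height_cm"] == ""]
non_numeric = [r for r in rows
               if r["height_cm"] and not r["height_cm"].isdigit()]

# Define: normalize the capitalization of names, drop rows whose height
# is not a whole number of centimeters, and drop duplicate names.
# Code:
clean = []
seen = set()
for r in rows:
    name = r["name"].strip().title()
    height = r["height_cm"]
    if not height.isdigit() or name in seen:
        continue
    seen.add(name)
    clean.append({"name": name, "height_cm": int(height)})

# Test: confirm programmatically that the cleaning did what was defined.
assert all(isinstance(r["height_cm"], int) for r in clean)
assert len({r["name"] for r in clean}) == len(clean)

print(len(missing), len(non_numeric), clean)
```

Dropping invalid rows is only one possible cleaning decision; converting "1.80 m" to 180 would be another, and which is appropriate depends on the analysis.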
Project: Explore and Summarize Data (50 hrs)
In this project, you’ll use R and apply exploratory data analysis techniques to explore a selected data set for
distributions, outliers, and anomalies.
Supporting Lesson Content: Data Analysis with R
Lesson Title Learning Outcomes
WHAT IS EDA?
➔ Define and identify the importance of exploratory data analysis (EDA)
R BASICS
➔ Install RStudio and packages
➔ Write basic R scripts to inspect datasets
EXPLORE ONE VARIABLE
➔ Quantify and visualize individual variables within a dataset
➔ Create histograms and boxplots
➔ Transform variables
➔ Examine and identify tradeoffs in visualizations
EXPLORE TWO VARIABLES
➔ Properly apply relevant techniques for exploring the relationship between any two variables in a data set
➔ Create scatter plots
➔ Calculate correlations
➔ Investigate conditional means
EXPLORE MANY VARIABLES
➔ Reshape data frames and use aesthetics like color and shape to uncover information
DIAMONDS AND PRICE PREDICTIONS
➔ Use predictive modeling to determine a good price for a diamond
Project: Create a Tableau Story (20 hrs)
In this project, you’ll create a data visualization, using Tableau, from a data set that tells a story or highlights
trends or patterns in the data. Your work should be a reflection of the theory and practice of data
visualization, harnessing visual encodings and design principles for effective communication.
Supporting Lesson Content: Data Visualization with Tableau
Lesson Title Learning Outcomes
DATA VISUALIZATION FUNDAMENTALS
➔ Understand the importance of data visualization
➔ Know how different data types are encoded in visualizations
DESIGN PRINCIPLES
➔ Select the most effective chart or graph based on the data being displayed
➔ Use color, shape, size, and other elements effectively
CREATING VISUALIZATIONS WITH TABLEAU
➔ Become proficient in basic Tableau functionality, including charts, filters, hierarchies, etc.
➔ Create calculated fields in Tableau
TELLING STORIES WITH TABLEAU
➔ Create Tableau dashboards and stories to effectively communicate data