Chapter 1 Lecture Slides

This document discusses different types of graphs that can be used to visualize data distributions, including pie charts, bar graphs, histograms, and stem plots. It covers categorical variables that can be shown with pie charts or bar graphs, and quantitative variables that can be depicted using histograms or stem plots. The document provides examples and explanations of how to interpret these various graph types and analyze distributions.

Uploaded by

Daneen Baig

Chapter 1

Picturing Distributions with Graphs

Lecture Slides

© 2021 W. H. Freeman and Company


In Chapter 1 we cover …
Individuals and variables
Categorical variables: pie charts and bar
graphs
Quantitative variables: histograms
Interpreting histograms
Quantitative variables: stemplots
Time plots
Statistics
Statistics is the science of data. The first step in
dealing with data is to organize your thinking
about the data.
Individual: an object described by a set of data
Variable: a characteristic of the individual
When planning a study …
… or simply exploring data from someone else’s
work, ask yourself these questions:
Who? What individuals do the data describe?
What? How many variables do the data contain, and what are
their exact definitions? In what unit of measurement is
each variable recorded?
Where? The context of the data collection is always important.
When? The timing of the data collection is part of that context.
Why? Were the data collected to describe just those
individuals or to represent a larger group?
Types of Variables
A categorical variable places individuals into
one of several groups or categories.
A quantitative variable takes numerical values
for which arithmetic operations make sense
(usually recorded in a unit of measurement).

Most data tables follow this format—the data here appear in a spreadsheet program.
Exploratory Data Analysis
An exploratory data analysis is the process of
using statistical tools and ideas to examine data
in order to describe their main features.
EXPLORING DATA
Begin by examining each variable by itself.
Then move on to studying the relationships
among the variables.
Begin with a graph or graphs. Then add
numerical summaries of specific aspects of
the data.
Distribution of a Variable
To examine a single variable, we usually want to display its
distribution.

DISTRIBUTION OF A VARIABLE
The distribution of a variable tells us what values it takes and how
often it takes these values.

The values of a categorical variable are labels for the categories. The
distribution of a categorical variable lists the categories and gives
either the count or the percent of individuals who fall in each
category.
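
As a minimal sketch of this definition, the snippet below tabulates the count and percent for each category. The list `majors` is hypothetical data made up for illustration, and `categorical_distribution` is an illustrative helper name, not something from the slides.

```python
from collections import Counter

def categorical_distribution(values):
    """Return {category: (count, percent)} for a list of category labels."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}

# Hypothetical data: intended majors of eight students.
majors = ["Business", "Engineering", "Business", "Education",
          "Engineering", "Business", "Education", "Business"]

dist = categorical_distribution(majors)
for category, (count, percent) in sorted(dist.items()):
    print(f"{category:<12} count={count}  percent={percent:.1f}")
```

Either column (counts or percents) describes the same distribution; percents are easier to compare across groups of different sizes.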
Displaying Categorical Data
The distribution of a categorical variable lists the
categories and gives either the count or the percent of
individuals who fall into each category.
Pie charts show the distribution of a categorical
variable as a “pie” where the sizes of the slices
reflect the counts or percents for the categories.
Bar graphs represent each category as a bar whose
height shows the category count or percent.
Pie Chart
EXAMPLE: What do the 1.5 million full-time first-year students
plan to study? Here are data on the percents of post-secondary
first-year students who plan to major in several discipline areas.
Field of Study                 Percent of Students
Biological sciences            15.5
Business                       13.8
Health professions             11.7
Engineering                    11.5
Social sciences                11
Arts and humanities            8.8
Math and computer science      6.2
Education                      4.4
Physical sciences              2.7
Other majors and undeclared    13.1
Total                          98.7 (rounding error)

[Pie chart: PERCENT OF STUDENTS, one slice per field of study]
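
A pie chart only makes sense when the slices make up a whole, so it is worth re-adding the percents before drawing one. The sketch below uses the ten percents from the table in this example; the variable names are illustrative.

```python
# Percents from the field-of-study table in this example.
percents = {
    "Biological sciences": 15.5,
    "Business": 13.8,
    "Health professions": 11.7,
    "Engineering": 11.5,
    "Social sciences": 11.0,
    "Arts and humanities": 8.8,
    "Math and computer science": 6.2,
    "Education": 4.4,
    "Physical sciences": 2.7,
    "Other majors and undeclared": 13.1,
}

# Round to one decimal place to hide floating-point noise in the sum.
total = round(sum(percents.values()), 1)
gap = round(100 - total, 1)
print(f"Total = {total}, short of 100 by {gap}")
```

The total matches the 98.7 reported in the table, which the slide attributes to rounding error.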
Pie Charts or Bar Graphs
EXAMPLE (cont’d): Here are data on the percents of post-
secondary first-year students who plan to major in several
discipline areas, now alphabetized by field of study.
Field of Study                 Percent of Students
Biological sciences            15.5
Business                       13.8
Health professions             11.7
Engineering                    11.5
Social sciences                11
Arts and humanities            8.8
Math and computer science      6.2
Education                      4.4
Physical sciences              2.7
Other majors and undeclared    13.1

[Bar graph: vertical axis "Percent of students who plan to major" (0 to 18), horizontal axis "Field of Study"]
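
The bar graph for this example can be approximated in plain text. The sketch below sorts the fields alphabetically and draws one row of `#` characters per field; drawing the bars horizontally and using 2 characters per percentage point are arbitrary choices made here for readability, not part of the slides.

```python
# Percents from the field-of-study table in this example.
percents = {
    "Biological sciences": 15.5,
    "Business": 13.8,
    "Health professions": 11.7,
    "Engineering": 11.5,
    "Social sciences": 11.0,
    "Arts and humanities": 8.8,
    "Math and computer science": 6.2,
    "Education": 4.4,
    "Physical sciences": 2.7,
    "Other majors and undeclared": 13.1,
}

def text_bar_graph(data, scale=2):
    """Return one text row per category, in alphabetical order.

    Bar length is round(value * scale) characters.
    """
    width = max(len(name) for name in data)
    rows = []
    for name in sorted(data):
        bar = "#" * round(data[name] * scale)
        rows.append(f"{name:<{width}} | {bar} {data[name]}")
    return rows

bar_rows = text_bar_graph(percents)
print("\n".join(bar_rows))
```

Unlike pie slices, the bars can be put in any order (alphabetical here) without changing what the graph says.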
Bar Graphs Only
EXAMPLE: What sources do Americans aged 12–34 years
use to keep up to date and learn about music?

Brand                     Percent of 12-34s Who Have Listened to Each Brand
Pandora                   36
Spotify                   46
iHeartRadio               14
Apple Music               20
Amazon Music              10
SoundCloud                23
Google Play All Access    8

[Bar graph: vertical axis "Percent of 12-34 YO’s Who Have Listened to Each Brand" (0 to 50), horizontal axis "Audio Brand"; bars ordered Spotify, Pandora, SoundCloud, Apple Music, iHeartRadio, Amazon Music, Google Play All Access]

Note: For bar graphs, percents don’t necessarily add to 100.
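
A quick check of this note, using the audio-brand percents from the example: one person can have listened to several brands, so the categories overlap and the percents are not parts of a single whole. The variable names below are illustrative.

```python
# Percent of 12-34 year-olds who have listened to each brand (from the example).
listened = {
    "Pandora": 36,
    "Spotify": 46,
    "iHeartRadio": 14,
    "Apple Music": 20,
    "Amazon Music": 10,
    "SoundCloud": 23,
    "Google Play All Access": 8,
}

total = sum(listened.values())
print(f"Sum of percents: {total}")  # well above 100

# Order the bars from tallest to shortest.
ordered = sorted(listened.items(), key=lambda item: item[1], reverse=True)
for brand, pct in ordered:
    print(f"{brand:<24} {'#' * pct} {pct}")
```

Because the percents sum to more than 100, a pie chart would be inappropriate here, while a bar graph still displays the comparison correctly.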


Quantitative Data
The distribution of a quantitative variable tells us
what values the variable takes on and how often it
takes on those values.
Histograms show the distribution of a
quantitative variable by using bars where the
height of each bar represents the number of
individuals who take on a value within a
particular class.
Stemplots separate each observation into a
stem and a leaf that are then plotted to display
the distribution, while maintaining the original
values of the variable.
Histograms (1 of 2)
Are appropriate for quantitative variables that
take on many values and/or for large datasets.
Divide the range of values into classes of equal
width.
Count how many observations fall into each
class (counts may be converted to percents).
Draw a picture representing the
distribution: the height of each bar equals the
number (or percent) of observations in its
class.
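
The steps above can be sketched as follows. The data in `rates` are hypothetical values made up for illustration (they are not the FGR data), and the choice of classes 10 units wide starting at 60 is an assumption; `histogram_counts` is an illustrative helper name.

```python
def histogram_counts(data, start, width, n_classes):
    """Count observations in n_classes equal-width classes.

    Class i covers start + i*width <= x < start + (i+1)*width.
    """
    counts = [0] * n_classes
    for x in data:
        i = int((x - start) // width)
        if not 0 <= i < n_classes:
            raise ValueError(f"{x} falls outside the classes")
        counts[i] += 1
    return counts

# Hypothetical data: graduation rates (percent) for 12 made-up schools.
rates = [62, 68, 71, 74, 75, 77, 78, 80, 81, 83, 86, 91]

counts = histogram_counts(rates, start=60, width=10, n_classes=4)
for i, n in enumerate(counts):
    low = 60 + 10 * i
    pct = 100 * n / len(rates)
    print(f"{low} to <{low + 10}: {'#' * n}  count={n}  percent={pct:.1f}")
```

Each observation lands in exactly one class because the classes share boundaries without overlapping (the lower bound is included, the upper bound is not).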
Histograms (2 of 2)
EXAMPLE: Freshman Graduation Rate, or FGR,
Data for 2016–2017
Interpreting Histograms
EXAMINING A HISTOGRAM
In any graph of data, look for the overall pattern
and for striking deviations from that pattern.
You can describe the overall pattern by its
shape, center, and variability. You will
sometimes see variability referred to as spread.
An important kind of deviation is an outlier, an
individual that falls outside the overall pattern.
Overall Shape of Distributions
SYMMETRIC AND SKEWED DISTRIBUTIONS
A distribution is symmetric if the right and left sides of the
graph are approximately mirror images of each other.
A distribution is skewed to the right (right-skewed) if the
right side of the graph (containing the half of the
observations with larger values) is much longer than the
left side.
A distribution is skewed to the left (left-skewed) if the left
side of the graph is much longer than the right side.

[Figure: three example histograms labeled Symmetric, Right-skewed, and Left-skewed]


Stemplots (Stem-and-Leaf Plots) (1 of 2)
To make a stemplot:
1. Separate each observation into a stem (consisting of all
but the final, or rightmost, digit) and a leaf (the final
digit). Stems may have as many digits as needed, but
each leaf contains only a single digit.
2. Write the stems in a vertical column with the smallest at
the top, and draw a vertical line at the right of this
column. Be sure to include all the stems needed to span
the data, even when some stems have no leaves.
3. Write each leaf in the row to the right of its stem, in
increasing order out from the stem.
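
The three steps above can be sketched in a few lines. This version assumes non-negative whole-number observations, so that the stem is the value with its last digit removed; the list `data` is hypothetical and `stemplot` is an illustrative helper name.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows for non-negative integers.

    Stem = all digits but the last; leaf = the last digit.
    """
    leaves = defaultdict(list)
    for v in sorted(values):          # sorting puts leaves in increasing order
        stem, leaf = divmod(v, 10)    # step 1: split into stem and leaf
        leaves[stem].append(leaf)
    rows = []
    # Step 2: list every stem from smallest to largest, even empty ones.
    for stem in range(min(leaves), max(leaves) + 1):
        digits = "".join(str(d) for d in leaves.get(stem, []))
        rows.append(f"{stem:>2} | {digits}")   # step 3: leaves to the right
    return rows

# Hypothetical data; note that no value falls in the 30s.
data = [15, 8, 22, 11, 41, 20, 15]
rows = stemplot(data)
print("\n".join(rows))
```

The stem 3 still appears with no leaves, which keeps the gap in the data visible.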
Stemplots (Stem-and-Leaf Plots) (2 of 2)
If there are very few stems (that is, when the data cover
only a very small range of values), then we may want to
create more stems by splitting the original stems.
EXAMPLE: If all the data values were between 150 and
179, then we might choose to use the following stems:

15
15
16
16
17
17

Leaves 0–4 would go on each upper stem (first “15”), and
leaves 5–9 would go on each lower stem (second “15”).
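
Splitting stems can be sketched as below: each stem is written twice, with leaves 0–4 on the first copy and leaves 5–9 on the second. The list `heights` is hypothetical data between 150 and 179, and `split_stemplot` is an illustrative helper name.

```python
def split_stemplot(values):
    """Return stemplot rows with each stem split in two.

    The first copy of a stem holds leaves 0-4, the second holds leaves 5-9.
    Assumes non-negative integers.
    """
    lo, hi = min(values) // 10, max(values) // 10
    # Create both halves of every stem up front so empty rows still appear.
    rows = {(stem, half): [] for stem in range(lo, hi + 1) for half in (0, 1)}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[(stem, 0 if leaf <= 4 else 1)].append(leaf)
    return [
        f"{stem} | {''.join(str(d) for d in rows[(stem, half)])}"
        for (stem, half) in sorted(rows)
    ]

# Hypothetical data: ten values between 150 and 179.
heights = [152, 158, 161, 163, 164, 166, 169, 170, 175, 178]
split_rows = split_stemplot(heights)
print("\n".join(split_rows))
```

With only three original stems the shape would be hard to see; six split stems spread the same observations over more rows.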
Stemplots
• EXAMPLE: Percent of state population aged 18–34 who
are minorities

Stems | Leaves
0 | 889
1 | 1556789
2 | 0223333336778
3 | 011145799
4 | 01112588
5 | 1112234
6 | 177
7 | 5

Key: 3|9 means 39 percent
Stems = 10’s
Leaves = 1’s
Time Plots (1 of 3)
A time plot shows behavior over time.
Time is always on the horizontal axis, and the
variable being measured is on the vertical axis.
Look for an overall pattern (trend) and for deviations
from this trend. Connecting the data points by lines
may emphasize this trend.
Look for patterns that repeat at known regular
intervals (seasonal variations).
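
One rough way to look for a trend and for seasonal variation without drawing the plot is to average the series two ways. The sketch below uses hypothetical quarterly values made up for illustration; averaging by year and by quarter is a simple device chosen here, not a method given in the slides.

```python
# Hypothetical quarterly measurements over three years (made-up values).
series = {
    2019: [10, 14, 18, 12],
    2020: [12, 16, 20, 14],
    2021: [14, 18, 22, 16],
}

# Trend: does the yearly average rise or fall over time?
yearly_mean = {year: sum(v) / len(v) for year, v in series.items()}

# Seasonal variation: does the same quarter tend to be high (or low) every year?
quarter_mean = [
    sum(values[q] for values in series.values()) / len(series)
    for q in range(4)
]

for year, mean in yearly_mean.items():
    print(f"{year}: yearly mean = {mean}")
for q, mean in enumerate(quarter_mean, start=1):
    print(f"Q{q}: mean across years = {mean}")
```

In this made-up series the yearly means rise steadily (an upward trend), and the third quarter is highest in every year (a pattern that repeats at a regular interval).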
Time Plots (2 of 3)
Time Plot of Average Gauge Height,
Everglades National Park Monitoring Station
Time Plots (3 of 3)
Time Plot of the Mean Atmospheric CO2 Concentration (ppm)
