Engineering Statistics
Chapter 1
The Where, Why, and How of Data Collection

Chapter Goals
After completing this chapter, you should be able to:
  Describe key data collection methods
  Know key definitions:
    Population vs. Sample
    Primary vs. Secondary data types
    Qualitative vs. Quantitative data
    Time Series vs. Cross-Sectional data
  Explain the difference between descriptive and inferential statistics
  Describe different sampling methods.

Tools of Statistics
  Descriptive statistics
    Collecting, presenting, and describing data.
  Inferential statistics
    Drawing conclusions and/or making decisions concerning a population based only on sample data.

Descriptive Statistics
  Collect data
    e.g. Survey, Observation, Experiments
  Present data
    e.g. Charts and graphs
  Characterize data
    e.g. Sample mean = $\frac{\sum x_i}{n}$
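
As a minimal sketch of the "characterize data" step, the sample mean can be computed as the sum of the observations divided by the sample size n. The function name `sample_mean` and the five data values are illustrative assumptions, not part of the slides.

```python
def sample_mean(values):
    """Return the sample mean: the sum of the observations divided by n."""
    n = len(values)
    if n == 0:
        raise ValueError("sample mean is undefined for an empty sample")
    return sum(values) / n


# Hypothetical sample of five measurements (illustrative values only).
weights = [118, 122, 120, 125, 115]
print(sample_mean(weights))  # 600 / 5 = 120.0
```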

Data Sources
  Primary: Data Collection
    Observation
    Survey
    Experimentation
  Secondary: Data Compilation
    Print or Electronic

Survey Design Steps
  Define the issue
    what are the purpose and objectives of the survey?
  Define the population of interest
  Formulate survey questions
    make questions clear and unambiguous
    use universally-accepted definitions
    limit the number of questions

Survey Design Steps (continued)
  Pre-test the survey
    pilot test with a small group of participants
    assess clarity and length
  Determine the sample size and sampling method
  Select sample and administer the survey.

Types of Questions
  Closed-end Questions
    Select from a short list of defined choices
    Example: Major: __business __engineering __science __other
  Open-end Questions
    Respondents are free to respond with any value, words, or statement
    Example: What did you like best about this course?
  Demographic Questions
    Questions about the respondents' personal characteristics
    Example: Gender: __Female __Male

Populations and Samples
  A Population is the set of all items or individuals of interest
    Examples:
      All likely voters in the next election
      All parts produced today
      All vehicles that pass a certain road
  A Sample is a subset of the population
    Examples:
      1000 voters selected at random for interview
      A few parts selected for destructive testing
      Only the passenger cars that pass by

Population vs. Sample
  [Figure: a population of lettered items, with a sample made up of a few of those items]

Why Sample?
  Less time consuming than a census
  Less costly to administer than a census
  It is possible to obtain statistical results of a sufficiently high precision based on samples.

Sampling Techniques
  Samples
    Non-Probability Samples
      Judgement
      Convenience
    Probability Samples
      Simple Random
      Systematic
      Stratified
      Cluster

Statistical Sampling
  Items of the sample are chosen based on known or calculable probabilities
  Probability Samples
    Simple Random
    Stratified
    Systematic
    Cluster

Simple Random Samples
  Every individual or item from the population has an equal chance of being selected
  Selection may be with replacement or without replacement
  Samples can be obtained from a table of random numbers or computer random number generators.
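
A simple random sample drawn with a computer random number generator might look like the sketch below. It uses Python's standard `random` module. The function name `simple_random_sample`, the population of 64 numbered items, and the sample size of 8 are assumptions chosen for illustration.

```python
import random


def simple_random_sample(population, n, replace=False, seed=None):
    """Draw n items so that every item has an equal chance of selection.

    replace=False: sampling without replacement (no item chosen twice).
    replace=True:  sampling with replacement (an item may repeat).
    """
    rng = random.Random(seed)  # seeded generator for reproducible draws
    if replace:
        return [rng.choice(population) for _ in range(n)]
    return rng.sample(population, n)


# Hypothetical population: items numbered 1 to 64.
population = list(range(1, 65))
without_replacement = simple_random_sample(population, 8, seed=1)
with_replacement = simple_random_sample(population, 8, replace=True, seed=1)
print(without_replacement)
print(with_replacement)
```

Sampling without replacement guarantees eight distinct items. Sampling with replacement may return the same item more than once.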

Stratified Samples
  Population divided into subgroups (called strata) according to some common characteristics
  Simple random sample selected from each subgroup
  Samples from subgroups are combined into one.
  [Figure: Population divided into 4 strata; Sample]

Systematic Samples
  Decide on sample size: n
  Divide frame of N individuals into groups of k individuals: k=N/n
  Randomly select one individual from the 1st group
  Select every kth individual thereafter.
  [Figure: N = 64, n = 8, k = 8; First Group]
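
The stratified and systematic procedures can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, the stratified version takes the same number of items from every stratum (the slides do not specify an allocation rule), and the four strata of 16 items each are made-up data. The systematic example reuses N = 64 and n = 8 from the slide, giving k = 8.

```python
import random


def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a simple random sample from each stratum and combine them.

    strata: dict mapping a stratum name to the list of its items.
    Assumption: equal allocation (same count from every stratum).
    """
    rng = random.Random(seed)
    combined = []
    for items in strata.values():
        combined.extend(rng.sample(items, n_per_stratum))
    return combined


def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first group of k, then every kth item."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the frame size")
    k = N // n                    # group size, k = N / n
    rng = random.Random(seed)
    start = rng.randrange(k)      # random position within the first group
    return [frame[start + i * k] for i in range(n)]


# Hypothetical frame of 64 numbered individuals, split into 4 strata of 16.
frame = list(range(1, 65))
strata = {f"stratum {s + 1}": frame[s * 16:(s + 1) * 16] for s in range(4)}

print(stratified_sample(strata, n_per_stratum=2, seed=0))
print(systematic_sample(frame, n=8, seed=0))
```

In the systematic sample, consecutive selections are always exactly k = 8 positions apart. Only the starting point within the first group is random.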

Cluster Samples
  Population is divided into several "clusters," each representative of the population
  A simple random sample of clusters is selected
    All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique
  [Figure: Population divided into 16 clusters. Randomly selected clusters for sample]

Key Definitions
  A population is the entire collection of things under consideration
    A parameter is a summary measure computed to describe a characteristic of the population
  A sample is a portion of the population selected for analysis
    A statistic is a summary measure computed to describe a characteristic of the sample
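
Cluster sampling, as described in this chapter, can be sketched as a simple random sample of whole clusters. The function name `cluster_sample`, the cluster contents, and the choice of 4 selected clusters are assumptions for illustration. Only the count of 16 clusters comes from the slide. This version keeps all items in each selected cluster.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random and return every item in them.

    clusters: list of lists, one inner list per cluster.
    Returns the chosen cluster indices and the combined items.
    """
    rng = random.Random(seed)
    chosen = rng.sample(range(len(clusters)), n_clusters)
    items = [item for idx in chosen for item in clusters[idx]]
    return chosen, items


# Hypothetical population: 16 clusters of 4 items each.
clusters = [[f"cluster{c}-item{i}" for i in range(4)] for c in range(16)]
chosen, items = cluster_sample(clusters, n_clusters=4, seed=0)
print(chosen)
print(len(items))  # 4 clusters x 4 items = 16
```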

Inferential Statistics
  Making statements about a population by examining sample results
    Sample statistics (known)
    Inference
    Population parameters (unknown, but can be estimated from sample evidence)
  [Figure: Sample and Population]

Inferential Statistics
  Drawing conclusions and/or making decisions concerning a population based on sample results.
  Estimation
    e.g.: Estimate the population mean weight using the sample mean weight
  Hypothesis Testing
    e.g.: Use sample evidence to test the claim that the population mean weight is 120 pounds
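
The estimation idea can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the population is simulated rather than real, and its size, centre of 120 pounds, and spread are invented. In practice the population mean (a parameter) is unknown, and only the sample mean (a statistic) is available.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population of 10,000 weights in pounds (simulated data).
population = [rng.gauss(120, 15) for _ in range(10_000)]

# Simple random sample of 100 individuals, without replacement.
sample = rng.sample(population, 100)

population_mean = statistics.mean(population)  # parameter (normally unknown)
sample_mean = statistics.mean(sample)          # statistic (known)

print(f"sample mean (estimate):      {sample_mean:.1f}")
print(f"population mean (parameter): {population_mean:.1f}")
```

The sample mean will usually land close to the population mean without matching it exactly, which is why inferential methods also quantify the uncertainty of the estimate.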

Data Types
  Data
    Qualitative (Categorical)
      Examples: Marital Status, Political Party, Eye Color (Defined categories)
    Quantitative (Numerical)
      Discrete
        Examples: Number of Children, Defects per hour (Counted items)
      Continuous
        Examples: Weight, Voltage (Measured characteristics)

Data Types
  Time Series Data
    Ordered data values observed over time
  Cross Section Data
    Data values observed at a fixed point in time
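
The difference between time series and cross section data can be sketched with the road-length figures from this chapter's Data Types table. The values are taken from that table. The variable and function names are hypothetical.

```python
# Road length (in km) by region and year, from the chapter's table.
years = [2003, 2004, 2005, 2006]
road_length_km = {
    "Kab. A": [435, 460, 475, 490],
    "Kab. B": [320, 345, 375, 395],
    "Kab. C": [400, 405, 410, 420],
    "Kab. D": [260, 270, 285, 290],
}


def time_series(region):
    """One region observed over time (a row of the table)."""
    return dict(zip(years, road_length_km[region]))


def cross_section(year):
    """All regions observed at one fixed point in time (a column)."""
    col = years.index(year)
    return {region: values[col] for region, values in road_length_km.items()}


print(time_series("Kab. A"))  # {2003: 435, 2004: 460, 2005: 475, 2006: 490}
print(cross_section(2005))    # {'Kab. A': 475, 'Kab. B': 375, 'Kab. C': 410, 'Kab. D': 285}
```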

Data Types
  Road length (in km)
              2003   2004   2005   2006
    Kab. A     435    460    475    490
    Kab. B     320    345    375    395
    Kab. C     400    405    410    420
    Kab. D     260    270    285    290
  Time Series Data: a single row (one Kab. observed across 2003 to 2006)
  Cross Section Data: a single column (all Kab. observed in one year)

Chapter Summary
  Reviewed key data collection methods
  Introduced key definitions:
    Population vs. Sample
    Primary vs. Secondary data types
    Qualitative vs. Quantitative data
    Time Series vs. Cross-Sectional data
  Examined descriptive vs. inferential statistics
  Described different sampling techniques
  Reviewed data types.

Thank You