Chapter 3 - Numerical Descriptive Measures
Microsoft Excel
6th Global Edition
Chapter 3
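Sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$
where n = sample size and $X_1, \ldots, X_n$ = observed values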
Copyright ©2011 Pearson Education 3-4
Measures of Central Tendency:
The Mean DCOVA
(continued)
[Figure: two dot plots on an axis from 11 to 20]
Data 11, 12, 13, 14, 15: Mean = $\frac{11 + 12 + 13 + 14 + 15}{5} = \frac{65}{5} = 13$
Data 11, 12, 13, 14, 20: Mean = $\frac{11 + 12 + 13 + 14 + 20}{5} = \frac{70}{5} = 14$
[Figure: two dot plots on an axis from 11 to 20; Median = 13 in both]
Median position = $\frac{n + 1}{2}$ position in the ordered data
If the number of values is odd, the median is the middle number
Note that $\frac{n + 1}{2}$ is not the value of the median, only the position of
the median in the ranked data
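The position rule can be sketched in a few lines of Python. This is a minimal illustration rather than anything from the slides: the function name `median_by_position` is hypothetical, and the even-n branch (averaging the two middle values) is the usual convention, assumed here because only the odd case is stated above.

```python
def median_by_position(values):
    """Return the median using the (n + 1) / 2 position rule on ranked data."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ranked = sorted(values)
    n = len(ranked)
    position = (n + 1) / 2  # 1-based position in the ranked data, not the median itself
    if n % 2 == 1:
        # Odd n: the position is a whole number, so take that ranked value.
        return ranked[int(position) - 1]
    # Even n (assumed convention): the position falls halfway between two
    # ranks, so average the two middle values.
    lower = ranked[int(position) - 1]
    upper = ranked[int(position)]
    return (lower + upper) / 2


print(median_by_position([11, 12, 13, 14, 15]))  # 13
print(median_by_position([11, 12, 13, 14, 20]))  # 13
```

Replacing 15 with 20 moves the mean from 13 to 14 but leaves the median at 13, because only the rank of the middle value matters.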
[Figure: two dot plots. On an axis from 0 to 14, Mode = 9; on an axis from 0 to 6, No Mode]
Measures of Central Tendency:
Review Example
DCOVA
House Prices:
$2,000,000
$ 500,000
$ 300,000
$ 100,000
$ 100,000
Sum $3,000,000
Mean: ($3,000,000/5) = $600,000
Median: middle value of ranked data = $300,000
Mode: most frequent value = $100,000
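The review example can be checked with Python's standard `statistics` module. This is a small sketch; the variable names are illustrative and not from the slides.

```python
import statistics

# House prices from the review example, in dollars.
house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean_price = statistics.mean(house_prices)      # sum / n
median_price = statistics.median(house_prices)  # middle value of the ranked data
mode_price = statistics.mode(house_prices)      # most frequent value

print(mean_price)    # 600000
print(median_price)  # 300000
print(mode_price)    # 100000
```

The single $2,000,000 house pulls the mean to twice the median, which is why the median is often preferred for skewed data such as prices.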
Geometric mean: $\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$
Arithmetic mean rate of return: $\bar{X} = \frac{(-.5) + (1)}{2} = .25 = 25\%$ (misleading result)
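A short sketch shows why the 25% figure misleads. It assumes the two yearly returns are -50% and +100% (the values that produce the 25% arithmetic mean), and it uses the standard geometric mean rate of return, $[(1 + R_1)(1 + R_2) \cdots (1 + R_n)]^{1/n} - 1$, which is not written out in this section. The function name is hypothetical.

```python
import math


def geometric_mean_return(rates):
    """Geometric mean rate of return for a list of per-period rates.

    Assumed formula: [(1 + R1)(1 + R2)...(1 + Rn)]**(1/n) - 1.
    """
    if not rates:
        raise ValueError("at least one rate is required")
    growth = math.prod(1 + r for r in rates)  # total growth factor over all periods
    return growth ** (1 / len(rates)) - 1


# Assumed yearly returns: a 50% loss followed by a 100% gain.
rates = [-0.5, 1.0]

arithmetic_mean = sum(rates) / len(rates)
geometric_mean = geometric_mean_return(rates)

print(arithmetic_mean)  # 0.25
print(geometric_mean)   # 0.0
```

An investment that halves and then doubles ends where it started, so the 0% geometric mean describes the outcome while the 25% arithmetic mean does not.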
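Arithmetic mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
Median: Middle value in the ordered array
Mode: Most frequently observed value
Geometric mean: $\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$, rate of change of a variable over time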
Example:
[Figure: dot plot on an axis from 0 to 14]
Range = 13 - 1 = 12
[Figure: two dot plots on an axis from 7 to 12; Range = 12 - 7 = 5 in both]
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
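The effect of a single outlier on the range can be reproduced directly. This is a minimal sketch using the two data sets above; the helper name `value_range` is illustrative.

```python
def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


without_outlier = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                   2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 5]
with_outlier = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 120]

print(value_range(without_outlier))  # 4
print(value_range(with_outlier))     # 119
```

Only the two extreme values enter the calculation, so changing one observation from 5 to 120 changes the range from 4 to 119 even though the other 24 values are identical.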
Sample variance: $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
Where $\bar{X}$ = arithmetic mean
n = sample size
Xi = ith value of the variable X
Measures of Variation:
The Sample Standard Deviation
DCOVA
Most commonly used measure of variation
Shows variation about the mean
Is the square root of the variance
Has the same units as the original data
Sample standard deviation: $S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
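The sample variance and standard deviation can be computed step by step as follows. This is a sketch under assumptions: the data set 11, 12, 13, 14, 15 is used only for illustration, and the function names are hypothetical.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std_dev(values):
    """Square root of the sample variance; same units as the data."""
    return math.sqrt(sample_variance(values))


data = [11, 12, 13, 14, 15]  # illustrative data, mean = 13

print(sample_variance(data))  # 2.5
print(sample_std_dev(data))

# With no variation, both measures are zero.
print(sample_variance([5, 5, 5, 5]))  # 0.0
```

The squared deviations are 4, 1, 0, 1, 4, which sum to 10, and dividing by n - 1 = 4 gives 2.5.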
If the values are all the same (no variation), all these
measures will be zero.
$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$
Measures of Variation:
Comparing Coefficients of Variation
DCOVA
Stock A:
Average price last year = $50
Standard deviation = $5
$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$
Stock B:
Average price last year = $100
Standard deviation = $5
$CV_B = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$
Both stocks have the same standard deviation, but stock B is less variable relative to its price
Measures of Variation:
Comparing Coefficients of Variation
(continued)
DCOVA
Stock A:
Average price last year = $50
Standard deviation = $5
$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$
Stock C:
Average price last year = $8
Standard deviation = $2
$CV_C = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$2}{\$8} \cdot 100\% = 25\%$
Stock C has a much smaller standard deviation but a much higher coefficient of variation
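The three stock comparisons reduce to one small function. The sketch below uses the prices and standard deviations given for stocks A, B, and C; the function name is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (S / mean) * 100%, returned as a percentage."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * std_dev / mean


cv_a = coefficient_of_variation(std_dev=5, mean=50)
cv_b = coefficient_of_variation(std_dev=5, mean=100)
cv_c = coefficient_of_variation(std_dev=2, mean=8)

print(cv_a)  # 10.0
print(cv_b)  # 5.0
print(cv_c)  # 25.0
```

Because the CV is unit free, it ranks the stocks by relative variability (C, then A, then B), which differs from the ranking by standard deviation alone.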
Locating Extreme Outliers:
Z-Score
DCOVA
To compute the Z-score of a data value, subtract the
mean and divide by the standard deviation.
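As a sketch of the calculation, the code below computes Z = (X - mean) / S for every value and flags the extreme ones. Two assumptions are made: the data set is the 25-value list with the 120 outlier from the range discussion, and the cutoff of 3.0 in absolute value is a common convention that this section does not state.

```python
import statistics


def z_scores(values):
    """Z-score of each value: subtract the mean, divide by the sample std dev."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (n - 1)
    if std_dev == 0:
        raise ValueError("Z-scores are undefined when there is no variation")
    return [(x - mean) / std_dev for x in values]


data = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 120]

THRESHOLD = 3.0  # assumed cutoff for an "extreme" value

extreme = [x for x, z in zip(data, z_scores(data)) if abs(z) > THRESHOLD]
print(extreme)  # [120]
```

The value 120 lies more than four standard deviations above the mean, while every other value is within one standard deviation of it.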
Skewness statistic: < 0 (left-skewed), 0 (symmetric), > 0 (right-skewed)
Shape of a Distribution
(Kurtosis)
DCOVA
Kurtosis statistic: < 0 (flatter than bell-shaped), 0 (bell-shaped), > 0 (sharper peak than bell-shaped)
General Descriptive Stats Using
Microsoft Excel Functions DCOVA
3. Select Descriptive
Statistics and click OK.
[Worksheet data: house prices $2,000,000; $500,000; $300,000; $100,000; $100,000]
[Figure: ranked data divided into four segments by Q1, Q2, and Q3]
The first quartile, Q1, is the value for which 25% of the
observations are smaller and 75% are larger
Q2 is the same as the median (50% of the observations
are smaller and 50% are larger)
Only 25% of the observations are greater than the third
quartile
(n = 9)
Q1 is in the (9+1)/4 = 2.5 position of the ranked data,
so use the value halfway between the 2nd and 3rd values,
so Q1 = 12.5
(n = 9)
Q1 is in the (9+1)/4 = 2.5 position of the ranked data,
so Q1 = (12+13)/2 = 12.5
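The position rule for quartiles can be sketched as below. Several things here are assumptions rather than content from this section: the nine-value data set (chosen so that the 2nd and 3rd ranked values are 12 and 13), the generalization to Q2 and Q3 at positions 2(n+1)/4 and 3(n+1)/4, and the rule of rounding to the nearest rank when the position is neither whole nor a half. The function name is hypothetical.

```python
import math


def quartile(values, q):
    """Quartile q (1, 2, or 3) located at position q * (n + 1) / 4 in ranked data."""
    if q not in (1, 2, 3):
        raise ValueError("q must be 1, 2, or 3")
    ranked = sorted(values)
    n = len(ranked)
    if n < 2:
        raise ValueError("at least two values are required")
    position = q * (n + 1) / 4  # 1-based position
    lower = math.floor(position)
    fraction = position - lower
    if fraction == 0:
        # Whole-number position: take that ranked value.
        return ranked[lower - 1]
    if fraction == 0.5:
        # Halfway position: average the two neighbouring ranked values.
        return (ranked[lower - 1] + ranked[lower]) / 2
    # Assumed rule: otherwise round the position to the nearest rank.
    nearest = math.floor(position + 0.5)
    return ranked[nearest - 1]


# Hypothetical ranked data with n = 9.
data = [11, 12, 13, 16, 16, 17, 18, 21, 22]

print(quartile(data, 1))  # 12.5
print(quartile(data, 2))  # 16
print(quartile(data, 3))  # 19.5
```

Statistical packages use several different quartile conventions, so results from other tools may differ slightly from this position rule.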
Measures like Q1, Q3, and IQR that are not influenced
by outliers are called resistant measures
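Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
25% of the observations fall in each of the four intervals between these values
Interquartile range = 57 – 30 = 27
[Figure: three box plots, each marked Q1, Q2, Q3, with the comparison symbols >, ≈, and <]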
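Data: 0 2 2 2 3 3 4 5 5 9 27
Five-number summary: Xsmallest = 0, Q1 = 2, Q2 = 3, Q3 = 5, Xlargest = 27
[Figure: box plot of the data, with the right whisker extending to 27]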
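The five-number summary for this data set can be reproduced with the standard library. The sketch uses `statistics.quantiles` with `method="exclusive"`, which places quartile q at position q(n + 1)/4 and interpolates linearly; with n = 11 the positions are whole numbers (3, 6, 9), so no interpolation occurs. The variable names are illustrative.

```python
import statistics

data = [0, 2, 2, 2, 3, 3, 4, 5, 5, 9, 27]

# Quartiles at positions (n + 1)/4, 2(n + 1)/4, 3(n + 1)/4 of the ranked data.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

five_number_summary = (min(data), q1, q2, q3, max(data))
interquartile_range = q3 - q1

print(five_number_summary)   # (0, 2.0, 3.0, 5.0, 27)
print(interquartile_range)   # 3.0
```

The distance from Q3 to the largest value (22) is far greater than the distance from the smallest value to Q1 (2), which is the pattern expected for right-skewed data.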
Population mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$
Where μ = population mean
N = population size
Xi = ith value of the variable X
Numerical Descriptive Measures
For A Population: The Variance σ²
DCOVA
Average of squared deviations of values from
the mean
Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$
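The population formulas differ from the sample formulas only in the divisor (N instead of n - 1). The sketch below makes that visible, treating the illustrative values 11, 12, 13, 14, 15 as a complete population; the data and function names are assumptions.

```python
import math
import statistics


def population_variance(values):
    """Average of squared deviations from the population mean (divide by N)."""
    size = len(values)
    if size == 0:
        raise ValueError("at least one value is required")
    mu = sum(values) / size
    return sum((x - mu) ** 2 for x in values) / size


def population_std_dev(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


population = [11, 12, 13, 14, 15]  # illustrative data, treated as a full population

print(population_variance(population))   # 2.0
print(population_std_dev(population))
print(statistics.variance(population))   # 2.5, the sample version divides by n - 1
```

The same squared deviations (sum 10) give 2.0 when divided by N = 5 and 2.5 when divided by n - 1 = 4.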
[Figure: bell-shaped curve with approximately 68% of the data within μ ± 1σ]
The Empirical Rule
DCOVA
Approximately 95% of the data in a bell-shaped distribution lies within two standard deviations of the mean, or µ ± 2σ
[Figure: bell-shaped curves showing approximately 95% of the data within μ ± 2σ and approximately 99.7% within μ ± 3σ]
Using the Empirical Rule
DCOVA
Suppose that the variable Math SAT scores is bell-
shaped with a mean of 500 and a standard deviation
of 90. Then,
68% of all test takers scored between 410 and 590
(500 ± 90).
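The same arithmetic extends to two and three standard deviations. The sketch below tabulates the empirical-rule intervals for the Math SAT example (mean 500, standard deviation 90); the function name and the dictionary layout are illustrative choices.

```python
def empirical_rule_intervals(mean, std_dev):
    """Return {k: (low, high, approx_percent)} for k = 1, 2, 3 standard deviations."""
    coverage = {1: 68.0, 2: 95.0, 3: 99.7}  # approximate percentages for bell-shaped data
    return {
        k: (mean - k * std_dev, mean + k * std_dev, percent)
        for k, percent in coverage.items()
    }


intervals = empirical_rule_intervals(mean=500, std_dev=90)

for k, (low, high, percent) in intervals.items():
    print(f"about {percent}% of scores lie between {low} and {high} (mean ± {k} sd)")
```

For this example the intervals are 410 to 590, 320 to 680, and 230 to 770. The percentages are approximations that apply only when the distribution is bell-shaped.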
At least within
Sample covariance: $\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
Only concerned with the strength of the relationship
No causal effect is implied
Interpreting Covariance
DCOVA
Covariance between two variables:
cov(X,Y) > 0: X and Y tend to move in the same direction
cov(X,Y) < 0: X and Y tend to move in opposite directions
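A minimal sketch of the sample covariance calculation follows. The paired data are hypothetical values chosen for easy arithmetic, and the function name is illustrative.

```python
def sample_covariance(xs, ys):
    """Sum of cross-products of deviations from the means, divided by n - 1."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("at least two pairs are required")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(sample_covariance(xs, ys))                   # 1.5, positive: same direction
print(sample_covariance(xs, [-y for y in ys]))     # -1.5, negative: opposite directions
```

The sign shows the direction of the relationship, but the magnitude depends on the units of X and Y, so it cannot be read as a measure of relative strength on its own.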
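$r = \frac{\text{cov}(X, Y)}{S_X S_Y}$
where
$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
$S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
$S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$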
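The coefficient of correlation can be assembled from the three quantities in the formula. This is a sketch with hypothetical paired data and an illustrative function name; it is not the Excel procedure referred to elsewhere in the chapter.

```python
import math


def sample_correlation(xs, ys):
    """r = cov(X, Y) / (S_X * S_Y), using n - 1 in every divisor."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("at least two pairs are required")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when either variable has no variation")
    return cov / (s_x * s_y)


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = sample_correlation(xs, ys)
print(round(r, 4))  # 0.7746
```

Dividing the covariance (1.5) by the product of the standard deviations removes the units, so r always falls between -1 and 1.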
Features of the
Coefficient of Correlation
DCOVA
The population coefficient of correlation is referred to as ρ.
The sample coefficient of correlation is referred to as r.
Both ρ and r have the following features:
Unit free
Ranges between –1 and 1
The closer to –1, the stronger the negative linear relationship
The closer to 1, the stronger the positive linear relationship
The closer to 0, the weaker the linear relationship
[Figure: scatter plots of Y against X illustrating r = -1, r = -.6, r = +1, r = +.3, and r = 0]
The Coefficient of Correlation Using
Microsoft Excel Function
DCOVA
r = .733
[Figure: Scatter Plot of Test Scores; Test #1 Score on the horizontal axis and Test #2 Score on the vertical axis, both running from 70 to 100]
There is a relatively strong positive linear relationship between test score #1 and test score #2.
Students who scored high on the first test tended to score high on the second test.