CH 6 - CH 8 (Notes)

The document summarizes key concepts from chapters 6-8 of a statistics textbook. It introduces the normal distribution and discusses its properties including that it is symmetrical and bell-shaped. It explains how to calculate the percentage of observations that fall within a given range using the normal distribution table by converting data to z-scores. The document provides examples of finding areas under the normal curve, such as the area to the left or right of a given z-score, or between two z-scores. It also discusses how to find the z-score for a given area.

Uploaded by

Chilombo Kelvin

MA150 STATISTICS NOTES
CHAPTER 6 - CHAPTER 8

Chapter 6
The Normal Distribution

CH6.1 Introducing Normally Distributed Variables

What is a distribution?

Distribution of a Data Set: The distribution of a data set is a table, graph, or formula
that provides the values of the observations and how often they occur.

We have seen distributions in the form of


 Frequency distribution
 Relative-frequency distribution
 Frequency histograms
 Relative-frequency histogram
 Dot-plots
 Stem-and-leaf diagrams
 Pie charts
 Bar graphs

For now, we consider relative frequency histograms.


One major aspect of a distribution is the shape.

THE NORMAL DISTRIBUTION


The utility of this distribution exceeds that of all others.

[Figure: relative-frequency histogram of normal data; horizontal axis from -4 to 4, vertical axis "Relative Frequency" from 0.0 to 0.4]

[Figure: normal curve with mean = 0, sigma = 1; horizontal axis from -4 to 4]

A variable is said to be a normally distributed variable or to have a normal
distribution if its distribution has the shape of a normal curve.

Terminology:
 If a variable of a population is normally distributed and is the only variable
under consideration, we say that the population is normally distributed or that it
is a normally distributed population.

More commonly:
 If a variable’s distribution is shaped roughly like a normal curve, we say that
the variable is an approximately normally distributed variable or that it has
approximately a normal distribution.

The normal curve has 2 parameters, μ and σ . The mean of the normal distribution is μ
and its standard deviation is σ .

Each of the above normal curves has the same “shape”.

We see that each normal curve

 Is bell-shaped.
 Is centered at μ.
 Is close to the x-axis outside the range from μ − 3σ to μ + 3σ.

Exercise: Guess μ (the mean) of the following distributions.

What’s your best guess for σ (the standard deviation)?

Key Fact: For a normally distributed variable, the percentage of all possible
observations that lie within any specified range equals the corresponding area
under its associated normal curve, expressed as a percentage. The result holds
approximately for a variable that is approximately normally distributed.

How do we find areas under the normal curve?


The areas are tabulated. See Table II.

For any normal distribution, the mean and standard deviation completely determine
the curve. To avoid needing a different table for each normal curve (i.e. each mean
and standard deviation), we standardize our normally distributed variable.

The standardized version (z) of a normally distributed variable (x),

$$z = \frac{x - \mu}{\sigma},$$

has a standard normal distribution with μ = 0, σ = 1.

Goal: To learn to calculate probabilities of events concerning normal random
variables.
For example: If last week’s test had scores normally distributed with a mean of 80 and
a standard deviation of 6, what percentage of scores were greater than 90?

Plan:
1. Learn to calculate specified areas under a Standard Normal Curve with
μ=0 , σ=1

2. To calculate specified areas under any normal curve, convert to z scores. Since the z
score has a Standard Normal Distribution, we can use the Normal Table to calculate
area and translate the information to the general normal variable.
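
The two-step plan can be sketched in Python for the test-score question above (mean 80, standard deviation 6, scores greater than 90). This is only a sketch: it assumes the standard-library `statistics.NormalDist` as a stand-in for Table II, and the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 80, 6      # test scores from the example above
x = 90

# Convert the x-value to a z-score.
z = (x - mu) / sigma   # (90 - 80) / 6

# cdf(z) is the area to the left of z under the standard normal curve,
# so the area to the right is 1 - cdf(z).
standard_normal = NormalDist(0, 1)
area_right = 1 - standard_normal.cdf(z)

print(f"z = {z:.2f}")
print(f"percentage of scores above 90: {100 * area_right:.2f}%")
```

Because the z-score is not rounded to two decimals here, the result can differ slightly from a table lookup.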

Standard Normal Curve


Finding percentages for a normally distributed variable from areas under the standard
normal curve

CH 6.2- Areas Under the Standard Normal Curve
Basic Properties of the Standard Normal Curve

 The total area under the curve is equal to 1.


 The standard normal curve extends indefinitely in both directions, approaching,
but never touching, the horizontal axis as it does so.
 The standard normal curve is symmetric about zero, the mean. That is, the part
of the curve to the left of the zero point is the mirror image of that part of the
curve to the right of it.
 Almost all (99.7%) the area under the standard normal curve lies between -3
and +3.

Finding the Area to the left of a specified value

Example: To find the area under the standard normal curve that lies to the left of
1.23:

1. Draw a diagram as shown above.


2. Find the z-score 1.23 in Table II: look for 1.2 in the left-hand column and
then move to the right until you reach the column marked 0.03. Together, the
row and column give z = 1.23.
3. At that position, find the number 0.8907. This is the area under the curve to
the left of 1.23. We can say 89.07% of the area lies to the left of 1.23.
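
As a cross-check on the table lookup, the same area can be computed numerically. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` in place of Table II:

```python
from statistics import NormalDist

# Area under the standard normal curve (mean 0, standard deviation 1)
# to the left of z = 1.23.
area_left = NormalDist().cdf(1.23)

print(round(area_left, 4))
```

Rounded to four decimals, this agrees with the tabled value 0.8907.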

Finding the Area to the Right of a Specified Value

Example: To determine the area under the standard normal curve that lies to the
right of 0.76.

1. Look up in Table II the value of z = 0.76.
2. Look for 0.7 in the left-hand column and then move to the right until you
encounter the column marked 0.06. The tabled value is 0.7764, which means
the area to the left of 0.76 is 0.7764.
3. Subtract the area of 0.7764 from 1.0 to get 0.2236, the area to the right of
z = 0.76. We can say that 22.36% of the area under the curve lies to the right of
z = 0.76.

Finding the Area between Two Specified Numbers

Example: Find the area between Z = – 0.51 and Z = 1.87.

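We want the area under the curve between the two z-values.

We start out with the area to the left of the larger z-value, but it’s too much.

We correct by subtracting the area to the left of the smaller z-value.
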
Example: Find the area between -0.51 and 1.87

The area to the left of z = 1.87 is 0.9693

The area to the left of z = -0.51 is 0.3050

The area between -0.51 and 1.87 is 0.9693 – 0.3050 = 0.6643
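
The "area to the right" and "area between" calculations can be written as two small helper functions. This is a sketch under the assumption that `statistics.NormalDist` replaces the table; the function names are made up for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_right(z):
    """Area to the right of z: one lookup subtracted from 1."""
    return 1 - Z.cdf(z)

def area_between(z1, z2):
    """Area between z1 and z2: two lookups and a subtraction."""
    return Z.cdf(z2) - Z.cdf(z1)

print(round(area_right(0.76), 4))
print(round(area_between(-0.51, 1.87), 4))
```

Table II entries are rounded to four decimals, so a table-based answer such as 0.6643 may differ from the computed value in the last digit.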

Summary: how to find areas given z-scores:

 Area to the left of z: a direct lookup.
 Area to the right of z: 1 – (direct lookup).
 Area between z1 and z2: two direct lookups and a subtraction.

Important Areas

       -z to +z    Area under curve between -z and +z    Percentage of total area
(a)    -1 to +1    0.6826                                68.26%
(b)    -2 to +2    0.9544                                95.44%
(c)    -3 to +3    0.9974                                99.74%

 Almost 100% (99.74%) of the curve lies between -3 and +3 standard deviations.

 Total area to the left of -3 and to right of +3 is equal to 0.0026 or 0.26%.

 The area to the left of -3 is equal to 0.0013 or 0.13% and the area to the right of
+3 is equal to 0.0013 or 0.13%.

Finding the z-value Having a Specified Area to Its Left.


Find the Z score such that the area to the left of the z-score is 0.7157

The z-score such that the area to the left of the z-score is 0.7157 is z=0.57

Example: To determine the z-value for which the area under the standard normal
curve to the left is 0.04.

1. Draw a diagram.
2. Look up in the body of Table II the value of 0.04. (We are looking for area).
Notice that the closest value in the table to 0.04 is 0.0401. Use this value.
3. Look in the column beside 0.0401 to find the corresponding z value. Here we
find that z = -1.75.

Example: To find the Z-score such that the area to the right of the z-score is 0.3021.

1. Find the area to the left of the z-score. It is 1-0.3021 = 0.6979.
2. Look up in the body of Table II to find the closest number to 0.6979, which is
0.6985.
3. Find the z-Score corresponding to this number. It is 0.52.
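
The reverse lookup (from an area back to a z-score) can be sketched with the inverse of the cumulative area function. The helper names below are illustrative, and `statistics.NormalDist.inv_cdf` is assumed as a substitute for searching the body of Table II.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal curve

def z_for_left_area(area):
    """z-score that has the given area to its left."""
    return Z.inv_cdf(area)

def z_for_right_area(area):
    """z-score that has the given area to its right."""
    return Z.inv_cdf(1 - area)

print(round(z_for_left_area(0.7157), 2))
print(round(z_for_left_area(0.04), 2))
print(round(z_for_right_area(0.3021), 2))
```

Rounded to two decimals, these reproduce the three table answers above: 0.57, -1.75 and 0.52.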

The z_α notation (the area to the right)

z_α is used to denote the z-score having an area of α to its right under the
standard normal curve.

Example: Find the value of z_{0.25}.

We are looking for the z-value such that the area to the right of the z-value is 0.25.
This means that the area to the left of the z-value is 0.75. The closest area in the
body of Table II is 0.7486, so z_{0.25} ≈ 0.67.

Exercise: Draw and find z_{0.05}.

Example: Find the z-scores that separate the middle 80% of the area under the normal
curve from the 20% in the tails.

z1 is the z-score such that the area to the left is 0.1, so z1 = -1.28.

z2 is its opposite. z2 = 1.28
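
The middle-80% example can be sketched in code. The functions `z_alpha` and `middle_area_bounds` are hypothetical names introduced here, and `statistics.NormalDist` again stands in for Table II.

```python
from statistics import NormalDist

def z_alpha(alpha):
    """z-score with area alpha to its right under the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha)

def middle_area_bounds(middle):
    """z-scores that enclose the given middle area, with equal tails."""
    tail = (1 - middle) / 2      # area in each tail
    z = z_alpha(tail)            # positive z-score cutting off the right tail
    return -z, z                 # symmetry gives the left bound

z1, z2 = middle_area_bounds(0.80)
print(round(z1, 2), round(z2, 2))
```

The symmetry of the curve is what lets the left bound be written as the negative of the right bound.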

Exercise: Find 2 z-scores that divide the area under the standard normal curve into a
middle 0.95 area and two outside 0.025 areas. Draw a picture first.

CH 6.3- Working with Normally Distributed Variables

Determining the Percentage or Probability for a Normally Distributed Variable

 For a general normal random variable X with mean μ and standard deviation σ,
the variable Z = (X − μ)/σ has a standard normal probability distribution.
 We can use this relationship to perform calculations for X.
 Values of X ↔ Values of Z
 If x is a value for X, then z = (x − μ)/σ is a value for Z.
 This is a very useful relationship. Because of this relationship,

P(X < x) = P(Z < z)

 To find P(X < x) for a general normal random variable, we could calculate

P(Z < z) for the standard normal random variable.

 This relationship lets us compute all the different types of probabilities.

Probabilities for X are directly related to probabilities for Z using Z = (X − μ)/σ.

For example: if μ = 3 and σ = 2, then a value of x = 4 for X corresponds to a value of

$$z = \frac{4 - 3}{2} = 0.5$$

for Z.

Therefore, P(X<4)=P(Z<0.5).
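
The equality P(X < 4) = P(Z < 0.5) can be checked numerically. A minimal sketch, assuming `statistics.NormalDist`; the helper name `prob_less_than` is made up for illustration.

```python
from statistics import NormalDist

def prob_less_than(x, mu, sigma):
    """P(X < x) for a normal X, computed through the z-score."""
    z = (x - mu) / sigma
    return NormalDist().cdf(z)   # area to the left of z on the standard normal curve

p_via_z = prob_less_than(4, mu=3, sigma=2)     # P(Z < 0.5)
p_direct = NormalDist(mu=3, sigma=2).cdf(4)    # same probability without standardizing

print(round(p_via_z, 4), round(p_direct, 4))
```

Both routes give the same number, which is why a single standard normal table is enough for every normal curve.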

To compute a percentage or probability for a normally distributed variable:

 Sketch the normal curve associated with the variable.
 Shade the desired area to be computed for X.
 Convert all values of X to z-scores using z = (x − μ)/σ.
 Use Table II to solve the problem for the standard normal Z.
 The answer will be the same for the general normal X.

Exercise: For a general normal random variable X with

 μ=3
 σ=2

calculate P(X < 6)

Exercise: For a general normal random variable X with

 μ = –2
 σ=4

calculate P(X > –3)

Exercise: For a general normal random variable X with

 μ=6
 σ=4

calculate P(4 < X < 11)

Exercise: IQs are normally distributed with a mean of 100 and a standard deviation of
16. What percentage of people have IQs between 115 and 140?

The 68.26 – 95.44 – 99.74 rule.

Any normally distributed variable has the following properties.

1. 68.26% of all possible observations lie within one standard deviation to either
side of the mean, that is, between μ − σ and μ + σ.
2. 95.44% of all possible observations lie within two standard deviations to either
side of the mean, that is, between μ − 2σ and μ + 2σ.
3. 99.74% of all possible observations lie within three standard deviations to either
side of the mean, that is, between μ − 3σ and μ + 3σ.
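
The rule can be checked numerically for any mean and standard deviation. The sketch below assumes `statistics.NormalDist`; `within_k_sd` is an illustrative name.

```python
from statistics import NormalDist

def within_k_sd(k, mu=0.0, sigma=1.0):
    """Fraction of observations between mu - k*sigma and mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(k, round(100 * within_k_sd(k), 2))
```

The computed percentages differ from 68.26, 95.44 and 99.74 by about 0.01 because those figures come from subtracting table entries that are already rounded to four decimals. The result does not depend on the particular μ and σ passed in.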

Exercise: Consider IQs with a mean of 100 and a standard deviation of 16. Show the
68.26 – 95.44 – 99.74 rule for this variable.

Finding the observations for a specified percentage

To determine the observations corresponding to a specified percentage or
probability for a normally distributed variable:

1. Sketch the normal curve associated with the variable.
2. Shade the region of interest.
3. Use Table II to obtain the z-scores bounding the region found in Step 2.
4. Obtain the x-values having the z-scores found in Step 3.
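
Steps 3 and 4 amount to finding z and then solving z = (x − μ)/σ for x, that is, x = μ + zσ. The sketch below uses the earlier test-score variable (mean 80, standard deviation 6); the area 0.75 is an arbitrary value chosen for illustration, and `statistics.NormalDist` replaces Table II.

```python
from statistics import NormalDist

def x_for_left_area(area, mu, sigma):
    """x-value of a normal variable with the given area to its left."""
    z = NormalDist().inv_cdf(area)   # Step 3: z-score bounding the region
    return mu + z * sigma            # Step 4: translate the z-score back to x

x75 = x_for_left_area(0.75, mu=80, sigma=6)
print(round(x75, 2))
```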

Exercise: Obtain the 90th percentile for IQs.

What is the 90th percentile?

1. Sketch.

2. Shade.

3. Table lookup to find z-scores.

4. Translate z-scores to x-scores.

Exercise: For a general normal random variable X with

 μ = –2
 σ=4
find the value x such that P(X > x) = 0.2

CH 6.4- Assessing Normality: Normal Probability Plots

 Many real world variables have bell shaped histograms, so we would say that
they should or could have normal probability distributions
 We need methods to assess whether this is a good assumption or not
 The main method used to assess whether sample data is approximately normal
is the normal probability plot.
 This plot graphs the observed data, ranked in ascending order, against the
“expected” Z-score of that rank
 The chart compares
o The lowest observed value with where it is expected to be (according to
the normal)
o The second lowest observed value with where it is expected to be
(according to the normal)
.
.
.
o The highest observed value with where it is expected to be (according to
the normal)
 The expected lowest value, the expected second lowest value, etc. are not easy
to derive
 Technology should be used to construct these graphs
 If the sample data was taken from a normal random variable, then this plot
should be approximately linear
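
To make the idea of an "expected" z-score for each rank concrete, here is a rough sketch. It assumes Blom's approximation, (i − 0.375)/(n + 0.25), for the plotting positions; statistical software may use a slightly different formula. The data are the ten sleep values from the Gosset example in Chapter 8 of these notes, and the correlation is used only as an informal measure of how linear the plot is.

```python
from statistics import NormalDist

def normal_scores(n):
    """Approximate expected z-scores for ranks 1..n (Blom's approximation, an assumption)."""
    Z = NormalDist()
    return [Z.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

data = [1.9, 0.8, 1.1, 0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4]
observed = sorted(data)                   # observed values in ascending order
expected = normal_scores(len(observed))   # expected z-score for each rank

for x, z in zip(observed, expected):
    print(f"{x:5.1f}  {z:6.2f}")          # one point of the probability plot per line

r = correlation(observed, expected)
print("correlation:", round(r, 3))        # close to 1 suggests a roughly linear plot
```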

Example: Normal probability plots of variables that look normally distributed.

[Figure: Probability Plot of Other pain (Normal - 95% CI). Vertical axis: Percent; horizontal axis: Other pain. Mean 4.44, StDev 2.128, N 25, AD 0.362, P-Value 0.416.]

[Figure: Probability Plot of Other Infection (Normal - 95% CI). Vertical axis: Percent; horizontal axis: Other Infection. Mean 3.96, StDev 3.323, N 25, AD 0.477, P-Value 0.216.]

Example: Both of these show that this particular data set is far from having a normal
distribution

We can assess whether sample data is approximately normal by using the normal
probability plot.

Chapter 7
The Sampling Distribution of Sample Mean
CH 7.1-Distribution of the Sample Mean
Often the population is too large to perform a census … so we take a sample
 How do the results of the sample apply to the population?
 What’s the relationship between the sample mean and the population mean?
 What’s the relationship between the sample standard deviation and the
population standard deviation?
This is statistical inference

We want to use the sample mean x̄ to estimate the population mean μ.

Example: If we want to estimate the heights of eight year old girls, we can proceed as
follows:
 Randomly select 100 eight year old girls
 Compute the sample mean of the 100 heights
 Use that as our estimate
This is using the sample mean to estimate the population mean

However, if we take a series of different random samples of size 100:
 Sample 1 – we compute sample mean x̄1
 Sample 2 – we compute sample mean x̄2
 Sample 3 – we compute sample mean x̄3
 Etc.
Each time we take a sample, we may get a different result.

The sample mean x̄ is a random variable!

Because the sample mean is a random variable


 The sample mean has a mean
 The sample mean has a standard deviation
 The sample mean has a probability distribution
This is called the sampling distribution of the sample mean.

Let’s look at what this means:

http://opl.apa.org/contributions/Rice/rvls_sim/stat_sim/sampling_dist/index.html
CH 7.2-The Mean and Standard Deviation of the Sample Mean

Moral:
1. The center (mean) of the distribution of x̄ stays the same, whatever the sample size.
2. As the sample size increases, the standard deviation of the distribution of x̄
decreases.

Results:
Let x be a random variable with mean μ and standard deviation σ.

 If we take a sample of size n, the sampling distribution of x̄ has mean

$$\mu_{\bar{x}} = \mu$$

and
 standard deviation

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

(this is referred to as the standard error of the mean or just the standard error)
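
A small simulation can illustrate both results. This sketch makes its own assumptions: the parent population is exponential with mean 1 and standard deviation 1 (chosen only because it is clearly not normal), the sample size is 25, and 5000 samples are drawn with a fixed seed.

```python
import random
from statistics import fmean, pstdev

random.seed(1)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of one random sample of size n from the illustrative parent population."""
    return fmean(random.expovariate(1.0) for _ in range(n))

n = 25
means = [sample_mean(n) for _ in range(5000)]

print("mean of the sample means:", round(fmean(means), 3))   # should be near mu = 1
print("sd of the sample means:  ", round(pstdev(means), 3))  # should be near sigma / sqrt(n)
print("sigma / sqrt(n):         ", 1 / n ** 0.5)
```

The simulated values will not match μ and σ/√n exactly, but they should be close, and they get closer as more samples are drawn.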

CH 7.3: The Sampling Distribution of Sample Mean

GIVEN                                          RESULT
Parent distribution           Sample size      Distribution of the sample mean
Mean    Standard deviation    n                Mean μ_x̄    Standard deviation σ_x̄
µ       σ                     n                µ            σ/√n

If x is normally distributed, then x̄ is normally distributed with

$$\mu_{\bar{x}} = \mu \quad\text{and}\quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}.$$

Central Limit Theorem (can you tell this is a BIG deal?)
If x is not normally distributed and

 n ≥ 30, then x̄ is approximately normally distributed with

$$\mu_{\bar{x}} = \mu \quad\text{and}\quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

 n < 30, then we cannot make a statement about the shape of the distribution
of x̄.

Chapter 7 results in table form

GIVEN                                         RESULT
Parent distribution           Sample size     Distribution of the sample mean
Shape         Mean  Std Dev   n               Shape                   Mean μ_x̄   Std Dev σ_x̄
Normal        µ     σ         All n           Normal                  µ          σ/√n
Not normal    µ     σ         n < 30          Undetermined            µ          σ/√n
Not normal    µ     σ         n ≥ 30          Approximately normal    µ          σ/√n
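
The Chapter 7 results can be sketched as a small lookup function plus a probability calculation for x̄. The function names and the numbers in the usage lines (a non-normal parent with mean 20, standard deviation 6, sample size 36) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def sampling_distribution(parent_is_normal, mu, sigma, n):
    """Shape, mean and standard deviation of the sample mean, following the Chapter 7 results."""
    if parent_is_normal:
        shape = "normal"
    elif n >= 30:
        shape = "approximately normal"   # Central Limit Theorem
    else:
        shape = "undetermined"
    return shape, mu, sigma / sqrt(n)

def prob_mean_less_than(a, mu, sigma, n):
    """P(x-bar < a); only meaningful when the shape is normal or approximately normal."""
    return NormalDist(mu, sigma / sqrt(n)).cdf(a)

# Hypothetical parent: not normal, mean 20, standard deviation 6, sample size 36.
shape, mean, sd = sampling_distribution(False, 20, 6, 36)
print(shape, mean, sd)
print(round(prob_mean_less_than(21.5, 20, 6, 36), 4))
```

Note that the probability uses the standard error σ/√n, not σ, as the standard deviation.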

Exercise: Suppose that IQ scores are normally distributed with a
mean of 100 and a standard deviation of 16. A sample of size 4 is taken.
 Find the sampling distribution of x̄.

Parent distribution         Sample size    Distribution of the sample mean
Shape     Mean   Std Dev    n              Shape    Mean μ_x̄    Std Dev σ_x̄
Normal    100    16         4              ----     ----        ----

 Find the probability that x̄ is less than 105.

Exercise:
Consider a random variable x that has a mean of 50 and a standard deviation of 8.
If a sample of size 64 is taken, what are the mean and standard deviation of x̄?

 What is the shape of the distribution of x̄?

Parent distribution         Sample size    Distribution of the sample mean
Shape     Mean   Std Dev    n              Shape    Mean μ_x̄    Std Dev σ_x̄

 Find the probability that x̄ is greater than 52.

Exercise:
Consider a random variable x that has a mean of 50 and a standard deviation of 8.
If a sample of size 10 is taken:
 What are the mean and standard deviation of x̄?

 What is the shape of the distribution of x̄?

Parent distribution         Sample size    Distribution of the sample mean
Shape     Mean   Std Dev    n              Shape    Mean μ_x̄    Std Dev σ_x̄

 Find the probability that x̄ is greater than 52.

Chapter 8
Confidence Intervals for One Population Mean
CH 8.1-Estimating a Population Mean

Recall the empirical rule.

For a standard normal variable with μ = 0, σ = 1:
(-1, 1) has .6826 of the area
(-2, 2) has .9544 of the area
(-3, 3) has .9974 of the area

From Chapter 7, we know that for large sample sizes (n ≥ 30), x̄ has an
approximately normal distribution with mean μ and standard deviation

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

(μ − 1σ_x̄, μ + 1σ_x̄) has .6826 of the area
(μ − 2σ_x̄, μ + 2σ_x̄) has .9544 of the area
(μ − 3σ_x̄, μ + 3σ_x̄) has .9974 of the area

A point estimate of a parameter is the value of a statistic used to estimate
the parameter.

For example, a point estimate of μ is x̄.

A confidence interval is an estimate of intervals like the .6826, .9544, .9974 intervals.

For example, suppose we know that x has a normal distribution but the
mean μ is unknown. The standard deviation is known to be 2 (σ = 2).
A sample of size 4 is taken and the sample mean is calculated.

To estimate μ we use x̄.

Standard deviation of x̄:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2}{\sqrt{4}} = 1$$

To estimate the intervals above, we can use x̄ in the place of μ. We can
construct intervals like those above:

(x̄ − 1·1, x̄ + 1·1) = (x̄ − 1, x̄ + 1) has a .6826 confidence level

(x̄ − 2·1, x̄ + 2·1) = (x̄ − 2, x̄ + 2) has a .9544 confidence level

(x̄ − 3·1, x̄ + 3·1) = (x̄ − 3, x̄ + 3) has a .9974 confidence level

A confidence-interval estimate of a parameter consists of an interval of numbers
obtained from a point estimate of the parameter and a percentage that specifies how
confident we are that the parameter lies in the interval. The confidence percentage is
called the confidence level.

Example: Prices of New Mobile Homes (in $1000), n=36

46.8 47.4 38.2 35.9 28.3 51.9 28.9 35.5 36.9


42.9 41.2 34.6 51.9 50.2 38.1 43.3 43.0 49.4
41.6 46.1 52.4 42.7 34.9 30.3 32.7 35.0 42.8
36.7 45.7 40.7 34.5 55.7 55.8 39.6 53.5 56.9

Point estimate of μ: x̄ = 42.28 (in $1000)

Given: σ = 7.2 (in $1000)

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{7.2}{\sqrt{36}} = 1.2$$

μ is unknown, so we estimate it with x̄.
The 0.9544 confidence interval for μ is

(x̄ − 2·1.2, x̄ + 2·1.2) = (x̄ − 2.4, x̄ + 2.4)

When x̄ = 42.28,
(42.28 - 2.4, 42.28 + 2.4) = (39.88, 44.68)
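
The mobile-home calculation can be reproduced from the raw data. This is a sketch using only the 36 prices and σ = 7.2 given in the example; the variable names are illustrative.

```python
from math import sqrt
from statistics import fmean

# Prices of new mobile homes (in $1000), copied from the example.
prices = [
    46.8, 47.4, 38.2, 35.9, 28.3, 51.9, 28.9, 35.5, 36.9,
    42.9, 41.2, 34.6, 51.9, 50.2, 38.1, 43.3, 43.0, 49.4,
    41.6, 46.1, 52.4, 42.7, 34.9, 30.3, 32.7, 35.0, 42.8,
    36.7, 45.7, 40.7, 34.5, 55.7, 55.8, 39.6, 53.5, 56.9,
]
sigma = 7.2                      # known population standard deviation

n = len(prices)
xbar = fmean(prices)             # point estimate of mu
se = sigma / sqrt(n)             # standard deviation of the sample mean

# 95.44% interval: two standard errors either side of the sample mean.
lower, upper = xbar - 2 * se, xbar + 2 * se

print(n, round(xbar, 2), round(se, 2))
print(round(lower, 2), round(upper, 2))
```

Rounded to two decimals, the output matches the hand calculation: x̄ = 42.28 and the interval (39.88, 44.68).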

If we took another sample, we would get a different value of x̄. The confidence
interval would shift just like the value of x̄.

Interpretation of a confidence interval:


1. We can be 95.44% confident that the population mean price of mobile homes is
somewhere between 39.88 and 44.68.

2. If we repeat the experiment a large number of times (many samples of size 36) and
each time we construct a 95.44% confidence interval, then we expect that 95.44% of
the time the confidence interval contains the population mean.

Exercise: Suppose x̄ = 45.1. What would the 95.44% confidence interval for μ
be?

Interpret the interval!

So far we can calculate confidence intervals for confidence levels equal to 68.26%, 95.44%,
or 99.74%. Next, we extend this to any confidence level.

CH 8.2-Confidence Intervals for One Population Mean When σ is Known
Notation:
Confidence level = 1- α
So α = 1 – confidence level.

Exercise: Complete the table.

Confidence level    α         α/2       Picture    z_{α/2}
0.6826              0.3174    0.1587
0.90                0.1       0.05
0.95                0.05      0.025

The one-sample z-Interval Procedure for a Population Mean

Assumptions or conditions required to use this procedure:

1. Normal population or large sample (n ≥ 30), so that x̄ is (at least
approximately) normally distributed
2. σ known

Step 1 For a confidence level of 1 − α, use Table II to find z_{α/2}.
Step 2 The confidence interval for μ is from

$$\bar{x} - z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}} \quad\text{to}\quad \bar{x} + z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}$$

where z_{α/2} is found in Step 1, n is the sample size, and x̄ is computed from the
sample data.

Step 3 Interpret the confidence interval

Note: The C.I. is exact for normal populations and is approximately correct for large
samples from non-normal populations.

When can we use the z-interval procedure?

 n greater than or equal to 30 or if n less than 30 we need to check that the data
appears to be normal.

 When σ is known.

Fundamental Principle of Data Analysis

Always graph your data. Only use a procedure that is appropriate for your data.

Example: Consider the ages of 50 randomly selected people with a population
standard deviation of 12.1 years and a sample mean of 36.4 years; find the 95%
confidence interval for the population mean of their ages.

First question: Can we use a z-interval?


Consider n and σ .

Step 1 For a confidence level of 1 − α, use Table II to find z_{α/2}.
Confidence level = 95%

C.L. = .95 → α = .05 → α/2 = .025 → z_{α/2} = z_{0.025} = 1.96

Step 2 The confidence interval for μ is from

$$\bar{x} - z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}} \quad\text{to}\quad \bar{x} + z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}$$

x̄ = 36.4
n = 50
σ = 12.1
z_{α/2} = z_{0.025} = 1.96

$$36.4 - 1.96\cdot\frac{12.1}{\sqrt{50}} \quad\text{to}\quad 36.4 + 1.96\cdot\frac{12.1}{\sqrt{50}}$$

36.4 – 3.35 to 36.4 + 3.35

or

(33.0, 39.8)

Step 3:
We are 95% confident that the population mean falls in the interval (33.0, 39.8)

If we repeat the experiment a large number of times (many samples of size 50) and
each time we construct a 95% confidence interval, then we would expect that 95% of
the time the confidence interval contains the population mean.
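
The one-sample z-interval procedure can be sketched as a short function and applied to the ages example (x̄ = 36.4, σ = 12.1, n = 50, 95% confidence). The function name `z_interval` is illustrative, and `statistics.NormalDist` is assumed in place of Table II for finding z_{α/2}.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for mu when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}: area alpha/2 to its right
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

lower, upper = z_interval(xbar=36.4, sigma=12.1, n=50, confidence=0.95)
print(round(lower, 1), round(upper, 1))
```

The function only does the arithmetic. Checking the assumptions (σ known, normal population or large sample) is still a separate step.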

Example: A sample of size 49 is taken from a class. The sample average height is 64
inches. The population standard deviation for heights is known to be 1.6 inches. What is the
point estimate of μ? Calculate the 90% confidence interval for μ. Why is the technique
you used valid?

CH 8.3 - Margin of Error
The margin of error for the estimate of μ is

$$E = z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}$$

The confidence interval has the form
x̄ − E to x̄ + E

We note that the margin of error is equal to half the length of the confidence interval.

Confidence Level and Interval Width:

For a fixed sample size, increasing the confidence level increases the width
of the interval, and vice-versa.


C.L. = .90 → α = .10 → α/2 = .05 → z_{α/2} = z_{0.05} = 1.645

C.L. = .99 → α = .01 → α/2 = .005 → z_{α/2} = z_{0.005} = 2.58

Since the margin of error is E = z_{α/2}·σ/√n, increasing z_{α/2} increases the width of
the interval.

Sample Size and Interval Width:

For a fixed confidence level, increasing the sample size decreases the width of
the interval.

Example: Comparing margins of error by changing the sample size.

Suppose the C.L. = .99 and σ = 1.5. Let’s see what happens to the margin of error
as n increases from 10 to 100.

$$E = z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}$$ (half the width of the interval)

C.L. = .99 → α = .01 → α/2 = .005 → z_{α/2} = z_{0.005} = 2.58

For n = 10

$$E = 2.58\cdot\frac{1.5}{\sqrt{10}} = 2.58\cdot\frac{1.5}{3.16} = 1.22$$

For n = 100

$$E = 2.58\cdot\frac{1.5}{\sqrt{100}} = 2.58\cdot\frac{1.5}{10} = 0.387$$
So,

Increasing the sample size decreases the margin of error and hence decreases the
width of the interval.
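
The effect of sample size on the margin of error can be sketched as follows, using the same σ = 1.5 and 99% confidence level. `margin_of_error` is an illustrative name, and `statistics.NormalDist` supplies z_{α/2} instead of the table.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, confidence):
    """E = z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)

for n in (10, 100):
    print(n, round(margin_of_error(1.5, n, 0.99), 3))
```

The computed critical value is not rounded to 2.58, so the results can differ from the hand calculation in the third decimal place. Multiplying n by 100/10 = 10 divides E by √10, about 3.16.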

When designing an experiment, we often have a required level of precision or margin
of error. We have to choose a sample size so that we end up with a small enough
margin of error. Because every data point costs money or time, we don’t want a sample
that is too large.

Given
 Known: σ

 Margin of error: E

 Confidence level: 1 - α

Find
 The sample size

Since

$$E = z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}},$$

solving for n gives

$$n = \left(\frac{z_{\alpha/2}\cdot\sigma}{E}\right)^2$$

Round up to the next whole number.
Example: Age of Civilian Labor Force
Determine the sample size required to ensure that we can be 95% confident that μ
is within 0.5 years of the estimate x̄. σ is known to be 12.2 years.

Given
 Known: σ = 12.2 years

 Margin of error: E = 0.5

 Confidence level: 1 − α = 0.95

Find
 The sample size

$$n = \left(\frac{z_{\alpha/2}\cdot\sigma}{E}\right)^2$$

C.L. = .95 → α = .05 → α/2 = .025 → z_{α/2} = z_{0.025} = 1.96

$$n = \left(\frac{1.96\cdot 12.2}{0.5}\right)^2 = 2287.13$$

so n = 2288 (always round up).
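
The sample-size formula, including the round-up step, can be sketched in a few lines. `required_sample_size` is a hypothetical helper name, and `statistics.NormalDist` is assumed for z_{α/2}.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence):
    """Smallest n whose margin of error is at most `margin` when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    return ceil((z * sigma / margin) ** 2)               # always round up

print(required_sample_size(sigma=12.2, margin=0.5, confidence=0.95))
```

Using `ceil` rather than ordinary rounding matters: rounding down would give a margin of error slightly larger than the one required.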

CH 8.4- Confidence Intervals for One Population Mean When σ is Unknown

When σ is known, the standardized version of x̄,

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}},$$

has a standard normal distribution.

If σ is unknown, then what?

Estimate σ by the sample standard deviation, s, and calculate

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

This is referred to as the studentized version of x̄.

t does not have a normal distribution; it has a t-distribution with n − 1 degrees of
freedom.

Basic Properties of the t-Curves

 The total area under a t-curve equals 1.

 A t-curve extends indefinitely in both directions, approaching but never
touching the horizontal axis.

 A t-curve is symmetric about 0.

 As the number of degrees of freedom increases, the t-curve tends to the
standard normal curve.

http://www.bilkent.edu.tr/~ktarik/econ222/TDist.html

http://media.pearsoncmg.com/ph/esm/statistics_datasets/sullivan_funstats2e/sfs2e_table.pdf

Example:
For 15 degrees of freedom, find t_{0.05}, the 95th percentile of the t-distribution.

For 28 degrees of freedom, find t_{0.01}, the 99th percentile of the t-distribution.

Obtaining the confidence interval for μ when σ is not known.

Assumptions:
1. Normal population or large sample
2. σ unknown

Step 1: For a confidence level of 1 − α, use Table V to find t_{α/2} with degrees of
freedom df = n − 1, where n is the sample size.

Step 2: The confidence interval for μ is from

$$\bar{x} - t_{\alpha/2}\cdot\frac{s}{\sqrt{n}} \quad\text{to}\quad \bar{x} + t_{\alpha/2}\cdot\frac{s}{\sqrt{n}}$$

where s and x̄ are calculated from the sample data.

Step 3: Interpret the confidence interval.

Confidence Interval Example:


A sociologist develops a test to measure attitudes about public transportation, and 30
randomly selected subjects are given the test. Their mean score is 76.2 and their
standard deviation is 21.4. Construct the 95% confidence interval for the mean score
of all such subjects.

Is σ known? How do you know?


Do you have the necessary assumptions?
List your known quantities:
n = 30
x̄ = 76.2
s = 21.4
Confidence level = 95%

C.L. = .95 → α = .05 → α/2 = .025 (df = 29)

t_{0.025} = 2.045

The confidence interval has the form

$$\bar{x} - t_{\alpha/2}\cdot\frac{s}{\sqrt{n}} \quad\text{to}\quad \bar{x} + t_{\alpha/2}\cdot\frac{s}{\sqrt{n}}$$

$$76.2 - 2.045\cdot\frac{21.4}{\sqrt{30}} \quad\text{to}\quad 76.2 + 2.045\cdot\frac{21.4}{\sqrt{30}}$$

76.2 – 7.99 to 76.2 + 7.99

68.21 to 84.19

The 95% confidence interval is (68.2, 84.2).

Interpretation: We are 95% confident that the population mean of the test scores is
between 68.2 and 84.2.

If we repeat the experiment a large number of times (many samples of size 30) and
each time we construct a 95% confidence interval, then we expect that 95% of the
time the confidence interval will contain the population mean.
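
The arithmetic of the t-interval can be sketched as a function. Python's standard library has no t-distribution, so this sketch assumes the critical value t_{α/2} is read from a t-table (here 2.045 for df = 29, as in the example) and passed in as a parameter; `t_interval` is an illustrative name.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Interval xbar ± t_crit * s / sqrt(n).

    t_crit must be looked up in a t-table with n - 1 degrees of freedom.
    """
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Public-transportation attitude test: n = 30, x-bar = 76.2, s = 21.4, df = 29.
lower, upper = t_interval(xbar=76.2, s=21.4, n=30, t_crit=2.045)
print(round(lower, 2), round(upper, 2))
```

The only differences from the z-interval are that s replaces σ and a t critical value replaces z_{α/2}.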

Example: In 1908, W. S. Gosset published the article “The Probable Error of the
Mean” (Biometrika, Vol 6, pp. 1-25). In this pioneering paper, written under the
pseudonym “Student,” Gosset introduced what later became known as Student’s
t-distribution. Gosset used the following data set, which gives the additional sleep
obtained by a sample of 10 patients using laevohyoscyamine hydrobromide.

1.9 0.8 1.1 0.1 -0.1


4.4 5.5 1.6 4.6 3.4

Preliminary data analyses indicate that it is reasonable to assume the data was
generated by a normal process.

[Figure: Probability Plot of C1 (Normal - 95% CI). Vertical axis: Percent; horizontal axis: C1. Mean 2.33, StDev 2.002, N 10, AD 0.357, P-Value 0.378.]

Find a 95% confidence interval for the additional sleep that would be obtained on
average for all people using laevohyoscyamine hydrobromide.

Was the drug effective in increasing sleep?
