
Chapter 3 Sampling & Estimation Theory

Chapter Three covers Sampling and Estimation Theory, detailing the importance of sampling as a method to gather data representative of a population. It discusses various sampling techniques, including probability and non-probability methods, along with their advantages, limitations, and the concept of sampling errors. The chapter also introduces estimation theory, explaining point and interval estimation, and how to construct confidence intervals for population parameters.


Chapter Three
Sampling & Estimation Theory
Prepared By: Nahom M.
April 5, 2011
Lecture Outline

• Introduction
• Advantages and disadvantages of sample survey
• Types of sampling techniques
• Sampling and non-sampling errors
• Examples and exercises
• Estimation theory
3.1 Introduction
• A sample is a representative part of a population.
• It is a subset of the population that is selected for further statistical investigation or analysis.
• Sampling is the process of selecting samples from the population.
• A proper procedure should be adopted for evaluating the sampling plan in order to select representative units of the population.
Advantages of sample survey
• It saves time, money, effort, and labor.
• It is useful when the population is infinitely large. If taking a census is practically impossible, sampling is the only option.
• It is useful when the available data is limited.
• It minimizes destruction when examining a unit damages or destroys it.
• It can be supervised more accurately, and the data can be collected more carefully.
Limitations or drawbacks of sample survey
• It gives unreliable data if it is not designed and executed carefully, which leads to sampling error.
• A sample survey is not useful when information is needed about each and every unit of the population.
A good sample possesses two characteristics:
i. Representativeness of the population, and
ii. Adequacy in size (sample size)

• Sample size is the number of units in the sample.
• Sample size should be neither too small nor too large, but optimum and proportional to the population size.
The size of the sample for a study is determined on the basis of the following factors:
• The size of the population
• The availability of resources
• The degree of accuracy required
• The homogeneity or heterogeneity of the population
• The nature of the study
• The sampling technique adopted
• The nature of the respondents
3.2. Types of Sampling Techniques
A sampling technique is the method of selecting a sample from the population.
Sampling methods can be broadly classified into two:
1. Probability or Random Sampling
2. Non-probability or Non-random Sampling
1. Probability or Random Sampling
A random sampling method is a method of selecting a sample such that each item within the population has an equal chance of being selected.
Types of Probability or Random Sampling:
i. Simple Random Sampling
ii. Stratified Random Sampling
iii. Systematic Random Sampling and
iv. Cluster Random Sampling
i. Simple Random Sampling
This is a very simple method of drawing a sample from a given population.
Simple random sampling can be used when
- The population has been numbered, or can be numbered at a low cost,
- The sample elements are easily accessible, and
- The population is close to homogeneous.
In this kind of sampling two methods are used for drawing samples: (1) using slips of paper, if the population is small, and (2) using a table of random numbers or a random number generator, if the population is large.
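The random number generator method can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample`, the population of 500 numbered units, and the seed are assumptions made for the example, not part of the lecture.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw `sample_size` distinct unit numbers from 1..population_size."""
    rng = random.Random(seed)  # a seeded generator makes the draw repeatable
    # random.sample selects without replacement, so no unit is chosen twice
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# Hypothetical numbered population of 500 units, sample of 10 units
print(simple_random_sample(500, 10, seed=42))
```

Passing a seed is a convenience for checking results; leaving it out gives a fresh draw each time.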
ii. Stratified Random Sampling Method
In this sampling technique the population is divided into non-overlapping, relatively homogeneous sub-groups called strata. After stratification, samples are selected from each stratum using the simple random sampling method.
It is used when the population under study is heterogeneous.
• The sample size from each stratum is determined in proportion to the stratum population.
• To determine the number of sample items that must be selected from the ith stratum, denoted by $n_i$, we can use the following formula:
$n_i = \frac{n N_i}{N}$
Where n: Total sample size
N: Total population size
$N_i$: Stratum population size
$n_i$: Stratum sample size
Example
Suppose we want to estimate the average salary of 5000 government-employed teachers in AA, where 2000 of them are certificate holders, 1500 are college diploma holders, 1000 are first degree holders, 360 are second degree holders and 140 are third degree holders. From a sample of 750 teachers, how large is each stratum sample?
Sol: n = 750, N = 5000, $N_1$ = 2000, $N_2$ = 1500, $N_3$ = 1000, $N_4$ = 360, $N_5$ = 140
$N = N_1 + N_2 + N_3 + N_4 + N_5 = 5000$
Then,
$n_1 = \frac{n N_1}{N} = \frac{750 \times 2000}{5000} = 300$, $n_2 = \frac{n N_2}{N} = \frac{750 \times 1500}{5000} = 225$
$n_3 = \frac{750 \times 1000}{5000} = 150$, $n_4 = \frac{750 \times 360}{5000} = 54$, $n_5 = \frac{750 \times 140}{5000} = 21$
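The proportional allocation above can be checked with a short Python sketch. The function name `proportional_allocation` and the dictionary labels are illustrative assumptions; the figures are those of the teachers example.

```python
def proportional_allocation(total_sample, strata_sizes):
    """Return n_i = n * N_i / N for each stratum."""
    population = sum(strata_sizes.values())  # N, the total population size
    return {name: total_sample * size / population
            for name, size in strata_sizes.items()}

# Stratum population sizes N_i from the example
teachers = {
    "certificate": 2000,
    "college diploma": 1500,
    "first degree": 1000,
    "second degree": 360,
    "third degree": 140,
}
allocation = proportional_allocation(750, teachers)
print(allocation)
print(sum(allocation.values()))  # the stratum samples add up to n
```

In this example every $n_i$ is a whole number. When it is not, the results have to be rounded, and the sketch leaves that choice to the reader.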
iii. Systematic Random Sampling Method
This method of sampling is a common method of selecting a sample when a complete list of the population is available and the list is arranged wholly at random. The sample is selected as follows:
• To get a systematic random sample of size n from a population of size N, draw a random number i from 1 to K, where K is the largest integer less than or equal to $\frac{N}{n}$, and then select i, i + K, i + 2K, i + 3K, … from the arranged list. K is called the constant of coding (size of the interval for selection).
• In general, we can list the items selected for inclusion in the sample from the arranged list using the following formula:
$A_i = A_1 + (i - 1)K$
Where: $A_1$ is the random starting point or the first sample item, and $A_i$ is the ith item in the sample.
Example:
From the files of 32 cases of the federal high court, only 5 cases are to be selected at random for review. If the fourth file was the first selected file using systematic random sampling, indicate the remaining four elements of the sample.
Sol: N = 32, n = 5, K = the largest integer less than or equal to N/n = 32/5 = 6.4, so K = 6
$A_1$ = 4, so the 1st selected file is the fourth element.
$A_2 = A_1 + (2 - 1)K = A_1 + K$ = 4 + 6 = 10. The 10th file is the second element.
$A_3 = A_1 + (3 - 1)K = A_1 + 2K$ = 4 + 2(6) = 16. The 16th file is the third element.
$A_4 = A_1 + (4 - 1)K = A_1 + 3K$ = 4 + 3(6) = 22. The 22nd file is the fourth element.
$A_5 = A_1 + (5 - 1)K = A_1 + 4K$ = 4 + 4(6) = 28. The 28th file is the fifth element.
Example 2: If the 4th and 12th elements of a systematic sample are 70 and 126 (in the population) respectively, which item of the population is the first element of this systematic sample?
Sol: $A_4$ = 70, $A_{12}$ = 126, $A_1$ = ?
$A_4 = 70 = A_1 + 3K$
$A_{12} = 126 = A_1 + 11K$
Subtracting the first equation from the second:
56 = 8K, so K = 7
Then $A_4 = A_1 + 3K$
70 = $A_1$ + 3(7)
so $A_1$ = 70 – 21 = 49
The 49th item of the population in the arranged list is the random starting point for the systematic sample.
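Both systematic sampling examples can be reproduced with a short Python sketch. The function names `sampling_interval` and `systematic_sample` are illustrative assumptions; the arithmetic follows $A_i = A_1 + (i - 1)K$ as given above.

```python
import random

def sampling_interval(population_size, sample_size):
    """K: the largest integer less than or equal to N / n."""
    return population_size // sample_size

def systematic_sample(start, interval, sample_size):
    """List A_i = A_1 + (i - 1) * K for i = 1..n."""
    return [start + (i - 1) * interval for i in range(1, sample_size + 1)]

# Example 1: N = 32 files, n = 5, first selected file is the 4th
k = sampling_interval(32, 5)
files = systematic_sample(4, k, 5)
print(k, files)  # 6 [4, 10, 16, 22, 28]

# In practice the starting point is drawn at random from 1..K
random_start = random.randint(1, k)

# Example 2: A_4 = 70 and A_12 = 126 give K and then A_1
k2 = (126 - 70) // (12 - 4)
a1 = 70 - 3 * k2
print(k2, a1)  # 7 49
```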
iv. Cluster Random Sampling
• In cluster random sampling, the total population is divided into a number of relatively small, non-overlapping divisions called clusters, and some of these clusters are randomly selected for inclusion in the overall sample. When necessary, each of these selected first-stage sample units can be further subdivided into second-stage units, and from these subdivisions a sample is selected randomly. Further stages may be taken if required.
• In cluster random sampling, we can also select members of the chosen subdivisions randomly instead of taking all members of those subdivisions.
Example:
Suppose one wants to estimate the average annual income of families in AA. First subdivide AA into a number of subdivisions, say 300 kebeles. These 300 kebeles can be considered as clusters. We can randomly select some of them, say 40 kebeles, and either take all members of these selected kebeles or subdivide the selected kebeles into smaller units, say zones within the kebeles. We then randomly select zones and either take all members of the selected zones or select the required number of families from them. We may also use stratified random sampling together with cluster random sampling, based on profession, occupation, education and so on.
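A two-stage version of the kebele example can be sketched in Python as follows. Everything in the data is invented for illustration: the function name `two_stage_cluster_sample`, the 50 families per kebele, and the choice of 10 families per selected kebele are assumptions, not figures from the lecture.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick members within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # first-stage units
    sample = {}
    for name in chosen:
        members = clusters[name]
        take = min(n_per_cluster, len(members))  # never ask for more than exist
        sample[name] = rng.sample(members, take)  # second-stage units
    return sample

# Hypothetical frame: 300 kebeles, each listing 50 families
kebeles = {
    f"kebele_{k}": [f"family_{k}_{j}" for j in range(1, 51)]
    for k in range(1, 301)
}
selected = two_stage_cluster_sample(kebeles, n_clusters=40, n_per_cluster=10, seed=7)
print(len(selected), sum(len(f) for f in selected.values()))  # 40 400
```

Taking all members of the chosen kebeles instead corresponds to setting `n_per_cluster` to at least the largest kebele size.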
2. Non-Random (Non-Probability) Sampling Methods
These sampling techniques do not involve a random selection process, because chance or probability is not used to select items for the sample.
There are many non-random sampling techniques. Some of them are
i. Judgment sampling
ii. Convenience sampling and
iii. Quota sampling
i. Judgment Sampling
In judgment sampling, personal judgment plays a significant role in the selection of the sample, for practical reasons. The investigator relies on good perception and an appropriate strategy, and selects the sample units deliberately. Because the selection reflects a personal view, the sample is satisfactory only with regard to one's own research needs. Statistical investigators use judgment samples to gain needed information. One important use of such sampling is test marketing, especially for new products.
ii. Convenience Sampling
A convenience sample is a sample obtained by selecting the population units that are readily available or convenient for the investigator, with regard to saving time, money and labor while still obtaining the required information.
iii. Quota Sampling
• In quota sampling, the various segments of a population have the same percentage of representation in the sample as they have in the population. However, the elements in the sample are not selected randomly but on the basis of judgment.
• It is a combination of the judgment and stratified sampling methods, so it enjoys the merits of both.

Sampling Error
• Sampling error is the difference between a sample statistic and the population parameter it estimates. It arises because a sample represents only a portion of the population. As a result, the information contained in the sample may lead to incorrect inferences about the parent population.
• Sampling errors may arise either from chance or from the use of biased sampling techniques.
• Sampling errors caused by bias can be avoided by using the appropriate sampling technique(s). Those caused by chance can be reduced, for example by taking a larger sample, but not eliminated.
Non-sampling Errors
All errors other than sampling errors are non-
sampling errors. These include: missing data,
recording errors, input processing errors,
analysis errors, defective questionnaires,
response errors, non-response errors,
measurement errors, sampling from wrong
population, etc.
Exercises
1. A stratified sample is going to be selected from four
fields of study in Addis Ababa Institute of
Technology. The number of students in Chemical,
Mechanical, Electrical and Civil is 200, 360, 400
and 480 respectively. If the ratio of population size
to that of sample size is 40, how large a sample must
be taken from each of the four fields of study?
2. If the 3rd and 5th items of a systematic sample are
21 and 37 (in the population) and if there are 8
items in the sample, then a. Give the remaining
items in the sample. b. Find the total number of
items in the population.
3.3 Estimation theory
Estimation refers to any procedure where sample information is used to estimate or predict the value of a population parameter.
A parameter is a characteristic or measure obtained from a population.
A statistic is a characteristic or measure obtained from a sample.
An estimator is a sample statistic that is used in estimating a population parameter.
An estimate is the value determined from the estimator as an estimate of the population parameter.
There are two ways of estimation:
1) Point Estimation and
2) Interval Estimation
1. Point Estimation
• A single-valued estimate.
• A single element chosen from a sampling distribution.
• Conveys little information about the actual value of the population parameter and about the accuracy of the estimate.
For example, a population mean ($\mu$) is estimated by a sample mean ($\bar{x}$), and a population standard deviation ($\sigma_x$) is estimated by a sample standard deviation ($S_x$).
2. Interval Estimation (Confidence Interval)
Point estimation produces a single value as an estimate of a population parameter. The estimate may or may not be close to the actual parameter value; thus, the estimate might be incorrect.
• An interval estimate describes a range of values within which a parameter might lie.
• It is an interval or range of values believed to include the unknown population parameter.
• Associated with the interval there is a measure of confidence that the interval does indeed contain the parameter of interest.
• Because of these, interval estimates are more desirable than point estimates.
• A confidence interval or interval estimate has two components:
– A range or interval of values
– An associated level of confidence
3.4 Confidence Interval Estimation of a Population Mean $\mu$
• A confidence interval estimate of $\mu$ is an interval estimate, together with a statement of how confident (e.g. 90%, 95%) we are that the interval is correct.
• Based on whether the population standard deviation $\sigma$ is known or not, we use different methods in constructing a confidence interval for $\mu$.
3.4.1 Confidence Interval Estimation of $\mu$ when $\sigma$ is known
• The confidence interval estimate of $\mu$ when $\sigma$ is known is given by:
$\bar{x} \pm Z \frac{\sigma}{\sqrt{n}}$, where $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
• Z is called a standard normal random variable, with mean 0 and standard deviation 1: $Z \sim N(0, 1^2)$
• The z value is determined based on the desired level of confidence.
• The quantity $Z \frac{\sigma}{\sqrt{n}}$ is often called the margin of error or the sampling error.

• If the population distribution is normal, the sampling distribution of the mean is normal, with mean $\mu$ and standard deviation $\sigma / \sqrt{n}$.
In notation: $\bar{X} \sim N(\mu, \sigma^2 / n)$
• If the sample size is sufficiently large, the sampling distribution of the mean is approximately normal regardless of the shape of the population distribution (Central Limit Theorem).
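The normal distribution probability density function:
$f(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}}$ for $-\infty < x < \infty$, where e = 2.718281... and $\pi$ = 3.14159265...
[Figure: Normal distribution with $\mu$ = 0, $\sigma$ = 1; f(x) plotted against x]
Which value of Z gives a probability of 0.95?
$P\left(\mu - Z \frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + Z \frac{\sigma}{\sqrt{n}}\right) = 0.95$
$P\left(\bar{x} - Z \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + Z \frac{\sigma}{\sqrt{n}}\right) = 0.95$
With Z = 1.96:
$P\left(\mu - 1.96 \frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + 1.96 \frac{\sigma}{\sqrt{n}}\right) = 0.95$
$P\left(\bar{x} - 1.96 \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96 \frac{\sigma}{\sqrt{n}}\right) = 0.95$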
Normal Probabilities (Empirical Rule)
• The probability that a normal random variable will be within 1 standard deviation from its mean (on either side) is 0.6826, or approximately 0.68.
• The probability that a normal random variable will be within 2 standard deviations from its mean is 0.9544, or approximately 0.95.
• The probability that a normal random variable will be within 3 standard deviations from its mean is 0.9974.
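These three probabilities can be recomputed with Python's standard library as a quick check. The helper name `prob_within` is an illustrative assumption. The slide values come from doubling four-decimal table entries (for example 2 × 0.3413), so they can differ from the computed values in the last digit.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def prob_within(k):
    """P(-k < Z < k) for a standard normal variable Z."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```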
95% Intervals around the Sample Mean
[Figure: Sampling distribution of the mean, with the central 95% lying between $\mu - 1.96 \frac{\sigma}{\sqrt{n}}$ and $\mu + 1.96 \frac{\sigma}{\sqrt{n}}$ and 2.5% in each tail; below it, intervals $\bar{x} \pm 1.96 \frac{\sigma}{\sqrt{n}}$ drawn around the means of repeated samples, with the intervals that miss $\mu$ marked *]
• Approximately 95% of the intervals $\bar{x} \pm 1.96 \frac{\sigma}{\sqrt{n}}$ around the sample mean can be expected to include the actual value of the population mean, $\mu$. (This happens when the sample mean falls within the 95% interval around the population mean.)
• *5% of such intervals around the sample mean can be expected not to include the actual value of the population mean. (This happens when the sample mean falls outside the 95% interval around the population mean.)
A $(1 - \alpha)100\%$ Confidence Interval for $\mu$
We define $z_{\alpha/2}$ as the z value that cuts off a right-tail area of $\alpha/2$ under the standard normal curve. $(1 - \alpha)$ is called the confidence coefficient, $\alpha$ is called the error probability, and $(1 - \alpha)100\%$ is called the confidence level.
$P(Z > z_{\alpha/2}) = \alpha/2$
$P(Z < -z_{\alpha/2}) = \alpha/2$
$P(-z_{\alpha/2} < Z < z_{\alpha/2}) = (1 - \alpha)$
$(1 - \alpha)100\%$ Confidence Interval: $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
[Figure: Standard normal distribution, f(z) against Z, with central area $(1 - \alpha)$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and area $\alpha/2$ in each tail]
Critical Values of z and Levels of Confidence
$(1 - \alpha)$    $\alpha/2$    $z_{\alpha/2}$
0.99       0.005    2.576
0.98       0.010    2.326
0.95       0.025    1.960
0.90       0.050    1.645
0.80       0.100    1.282
[Figure: Standard normal distribution, f(z) against Z, with central area $(1 - \alpha)$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and area $\alpha/2$ in each tail]
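The critical values in this table can be reproduced from the inverse of the standard normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `z_critical` is an illustrative assumption.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """z value that leaves a right-tail area of alpha/2, alpha = 1 - level."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.99, 0.98, 0.95, 0.90, 0.80):
    print(level, round(z_critical(level), 3))
```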


The Level of Confidence and the Width of the Confidence Interval
When sampling from the same population, using a fixed sample size, the higher the confidence level, the wider the confidence interval.
[Figure: two standard normal distribution curves, f(z) against Z, one shaded for an 80% interval and one for a 95% interval]
80% Confidence Interval: $\bar{x} \pm 1.28 \frac{\sigma}{\sqrt{n}}$
95% Confidence Interval: $\bar{x} \pm 1.96 \frac{\sigma}{\sqrt{n}}$
The Sample Size and the Width of the Confidence Interval
When sampling from the same population, using a fixed confidence level, the larger the sample size, n, the narrower the confidence interval.
[Figure: two sampling distributions of the mean, f(x) against $\bar{x}$, showing the 95% confidence interval for n = 20 and for n = 40]

Finding Probabilities of the Standard Normal Distribution: P(0 < Z < 1.56)
Standard Normal Probabilities (Z - table)
z     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
0.0   0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1   0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2   0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3   0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4   0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5   0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6   0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7   0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8   0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9   0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0   0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
2.1   0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
2.2   0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
2.3   0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
2.4   0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
2.5   0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
2.6   0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
2.7   0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
2.8   0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
2.9   0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
3.0   0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990
[Figure: Standard normal distribution, f(z) against Z, with the area between 0 and 1.56 shaded]
Look in the row labeled 1.5 and the column labeled .06 to find P(0 ≤ z ≤ 1.56) = 0.4406
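The table lookup can be cross-checked numerically. The Z-table lists the area between 0 and z, which is the cumulative probability up to z minus 0.5. The helper name `table_area` is an illustrative assumption.

```python
from statistics import NormalDist

def table_area(z):
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return NormalDist().cdf(z) - 0.5

print(round(table_area(1.56), 4))  # 0.4406
```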
Example 1
A credit union wants to estimate the mean amount of outstanding loans. Past experience reveals that the standard deviation is 250 birr. Determine a 98% confidence interval estimate for the mean of all outstanding loans (population mean) if a random sample of 100 outstanding loans has a sample mean of 1,950 birr.
Solution
$\bar{x}$ = 1950, $\sigma$ = 250, n = 100, C.L. = 98%, z = 2.33
$\frac{\sigma}{\sqrt{n}} = \frac{250}{\sqrt{100}} = \frac{250}{10} = 25$
The interval:
$\bar{x} \pm Z \frac{\sigma}{\sqrt{n}}$ = 1,950 ± 2.33(25) = 1,950 ± 58.25
The interval is then from 1,891.75 to 2,008.25.
Interpretation:
The credit union can say with 98% confidence that the mean amount of outstanding loans is between birr 1,891.75 and 2,008.25.
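The credit union calculation can be reproduced with a small Python function. The name `z_confidence_interval` is an illustrative assumption, and the z value is passed in as read from the table (2.33) rather than computed.

```python
import math

def z_confidence_interval(sample_mean, sigma, n, z):
    """Return (lower, upper) for x_bar +/- z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)  # margin of error
    return sample_mean - margin, sample_mean + margin

low, high = z_confidence_interval(1950, 250, 100, z=2.33)
print(round(low, 2), round(high, 2))  # 1891.75 2008.25
```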
Example 2
The population consists of the Fortune 500 Companies as ranked by Revenues. You are trying to find out the average Revenues for the companies on the list. The population standard deviation is $15,056.37. A random sample of 30 companies obtains a sample mean of $10,672.87. Give a 95% and a 90% confidence interval for the average Revenues.
3.4.2 Confidence Interval or Interval Estimate for $\mu$ When $\sigma$ Is Unknown - The t Distribution
If the population standard deviation, $\sigma$, is not known, replace $\sigma$ with the sample standard deviation, s. The confidence interval estimate of $\mu$ is given by:
$\bar{x} \pm t \frac{s}{\sqrt{n}}$, where $t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$
The value of t can be read from the t distribution table at (n - 1) degrees of freedom and at the desired level of confidence.
Cont’d
• The t distribution is called Student’s t distribution. It resembles the standard normal distribution Z: it has a bell-shaped, symmetrical distribution curve.
• The t distribution curve is flatter and has fatter tails than does the standard normal.
• The mean of a t distribution curve is zero.
• For df > 2, the variance of t is df/(df-2). This is greater than 1, but approaches 1 as the number of degrees of freedom increases.
• The t distribution approaches the standard normal as the number of degrees of freedom increases.
[Figure: the standard normal curve compared with t curves for df = 20 and df = 10]
Cont’d
• The t distributions approach the standard normal distribution as n increases.
• As a result, we can use the standard normal distribution (z value table) when $\sigma$ is not known and n > 30 in constructing an approximate interval estimate for $\mu$.
• When n < 30 and $\sigma$ is not known, the t distribution table is used.

Cont’d
A $(1 - \alpha)100\%$ confidence interval for $\mu$ when $\sigma$ is not known:
$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$
where $t_{\alpha/2}$ is the value of the t distribution with n-1 degrees of freedom that cuts off a tail area of $\alpha/2$ to its right.
The t Distribution table
df    t0.100  t0.050  t0.025  t0.010  t0.005
---   ------  ------  ------  ------  ------
1     3.078   6.314   12.706  31.821  63.657
2     1.886   2.920   4.303   6.965   9.925
3     1.638   2.353   3.182   4.541   5.841
4     1.533   2.132   2.776   3.747   4.604
5     1.476   2.015   2.571   3.365   4.032
6     1.440   1.943   2.447   3.143   3.707
7     1.415   1.895   2.365   2.998   3.499
8     1.397   1.860   2.306   2.896   3.355
9     1.383   1.833   2.262   2.821   3.250
10    1.372   1.812   2.228   2.764   3.169
11    1.363   1.796   2.201   2.718   3.106
12    1.356   1.782   2.179   2.681   3.055
13    1.350   1.771   2.160   2.650   3.012
14    1.345   1.761   2.145   2.624   2.977
15    1.341   1.753   2.131   2.602   2.947
16    1.337   1.746   2.120   2.583   2.921
17    1.333   1.740   2.110   2.567   2.898
18    1.330   1.734   2.101   2.552   2.878
19    1.328   1.729   2.093   2.539   2.861
20    1.325   1.725   2.086   2.528   2.845
21    1.323   1.721   2.080   2.518   2.831
22    1.321   1.717   2.074   2.508   2.819
23    1.319   1.714   2.069   2.500   2.807
24    1.318   1.711   2.064   2.492   2.797
25    1.316   1.708   2.060   2.485   2.787
26    1.315   1.706   2.056   2.479   2.779
27    1.314   1.703   2.052   2.473   2.771
28    1.313   1.701   2.048   2.467   2.763
29    1.311   1.699   2.045   2.462   2.756
30    1.310   1.697   2.042   2.457   2.750
40    1.303   1.684   2.021   2.423   2.704
60    1.296   1.671   2.000   2.390   2.660
120   1.289   1.658   1.980   2.358   2.617
∞     1.282   1.645   1.960   2.326   2.576
[Figure: t distribution with df = 10, f(t) against t; a tail area of 0.10 lies beyond each of -1.372 and 1.372, and a tail area of 0.025 lies beyond each of -2.228 and 2.228]
Whenever $\sigma$ is not known (and the population is assumed normal), the correct distribution to use is the t distribution with n-1 degrees of freedom. Note, however, that for large degrees of freedom, the t distribution is approximated well by the Z distribution.
Example 1
A stock market analyst wants to estimate the average return on a certain stock. A random sample of 15 days yields an average (annualized) return of $\bar{x}$ = 10.37% and a standard deviation of s = 3.5%. Assuming a normal population of returns, give a 95% confidence interval for the average return on this stock.
The critical value of t for df = (n - 1) = (15 - 1) = 14 and a right-tail area of 0.025 is read from the row df = 14 of the t distribution table:
$t_{0.025} = 2.145$
The corresponding confidence interval or interval estimate is:
$\bar{x} \pm t_{0.025} \frac{s}{\sqrt{n}} = 10.37 \pm 2.145 \frac{3.5}{\sqrt{15}} = 10.37 \pm 1.94 = [8.43, 12.31]$
Example 2
A random sample of 100 customer accounts at a large firm is selected for the purpose of estimating the mean number of transactions per year for each customer. The sample mean is 43 and the sample standard deviation is 12. Determine a 90% confidence interval estimate for $\mu$.
Solution: In this problem, as $\sigma$ is not known, the appropriate distribution is the t-distribution. However, since n > 30 we can approximate it by the standard normal distribution.
n = 100, $\bar{x}$ = 43, S = 12, C.L. = 0.90, and z = 1.64
$S_{\bar{x}} = S / \sqrt{n} = 12 / \sqrt{100} = 1.2$
The interval estimate is $\bar{x} \pm z S_{\bar{x}}$ = 43 ± 1.64 (1.2) = 43 ± 1.97
Thus, the interval is from 41.03 to 44.97.
We are 90% confident that the mean number of transactions per year per customer falls between 41.03 and 44.97.
Cont’d
Example 3
A quality control inspector of a company selects frequent random samples of size n = 6 from the output of an automatic machine to check on the average diameter $\mu$ of parts being made. Diameters are normally distributed. The sample has a mean diameter of 2.0016 inches and a standard deviation of 0.0012 inches. Construct the 99% confidence interval for $\mu$.
Solution: In this problem $\sigma$ is not known and n < 30. Therefore, we use the t-distribution.
$\bar{x}$ = 2.0016, C.L. = 0.99, $\alpha$ = 1 - 0.99 = 0.01, df = n - 1 = 6 - 1 = 5,
$S_{\bar{x}} = 0.0012 / \sqrt{6} = 0.0004898$, $t_{\alpha/2, v} = t_{0.005, 5} = 4.032$
The interval is given by $\bar{x} \pm t_{\alpha/2, v} S_{\bar{x}}$ = 2.0016 ± (4.032 × 0.0004898) = 2.0016 ± 0.0019748. This means that we are 99% confident that $\mu$ would fall between 1.9996252 and 2.0035748.
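The t-based intervals of Example 1 (stock returns) and Example 3 (part diameters) can be checked with the sketch below. The function name `t_confidence_interval` is an illustrative assumption. The Python standard library has no t quantile function, so the critical values are passed in as read from the t distribution table.

```python
import math

def t_confidence_interval(sample_mean, s, n, t_value):
    """Return (lower, upper) for x_bar +/- t * s / sqrt(n)."""
    margin = t_value * s / math.sqrt(n)  # t times the standard error
    return sample_mean - margin, sample_mean + margin

# Example 1: n = 15, df = 14, t_0.025 = 2.145 (from the table)
low1, high1 = t_confidence_interval(10.37, 3.5, 15, t_value=2.145)
print(round(low1, 2), round(high1, 2))  # 8.43 12.31

# Example 3: n = 6, df = 5, t_0.005 = 4.032 (from the table)
low3, high3 = t_confidence_interval(2.0016, 0.0012, 6, t_value=4.032)
print(round(low3, 4), round(high3, 4))  # 1.9996 2.0036
```

The code keeps full precision for $s / \sqrt{n}$, so digits beyond the fourth decimal place can differ slightly from the hand calculation, which rounds the standard error first.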
Determination of Sample Size
• Collecting valid information through sampling requires
careful planning, including determination of an appropriate
sample size.
• How large should the sample size be? The answer depends
on the following three factors.
1. How precise (narrow) do we want a confidence interval estimate
to be?
2. How confident do we want to be that the interval estimate is
correct?
3. What is the standard deviation of the population in question?
• Generally the higher the desired precision or level of
confidence, the larger will be the sample size.
• And also, the larger the population variability is, the larger
will be the sample size.
Sample Size for Interval Estimation of $\mu$
• Consider
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Solving this for n, we get the following:
$n = \frac{z^2 \sigma^2}{(\bar{x} - \mu)^2}$
This is the formula for computing sample size for interval estimation of $\mu$.
• There are three quantities that determine the value of n:
– The value of z, reflecting the confidence level
– The absolute value of $(\bar{x} - \mu)$, which represents the maximum error in estimation
– The variance (or standard deviation) of the population in question. When $\sigma$ is not known, $S_x$ from a pilot sample is used in its place.
Cont’d
Example:
For the purpose of illustration, assume the desired confidence level is 95%. If $\sigma$ = 15 and we want an estimate of $\mu$ with a maximum error in estimation of 5, the required sample size would be computed as follows.
Solution: $n = \frac{z^2 \sigma^2}{(\bar{x} - \mu)^2}$
C.L. = 0.95, $\alpha$ = 0.05, z = 1.96
$\sigma$ = 15
$|\bar{x} - \mu|$ = 5
$n = \frac{(1.96)^2 (15)^2}{5^2}$ = 34.5744, or 35
In sample size determination, whatever the value of the decimal part, we always round upwards.
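The sample size formula and the round-up rule can be written as a short Python function. The name `required_sample_size` is an illustrative assumption; `math.ceil` performs the upward rounding.

```python
import math

def required_sample_size(z, sigma, max_error):
    """n = z^2 * sigma^2 / E^2, rounded up to the next whole unit."""
    n = (z ** 2) * (sigma ** 2) / (max_error ** 2)
    return math.ceil(n)  # always round upwards, whatever the decimal part

print(required_sample_size(1.96, 15, 5))  # 35
```

Calling the function with a smaller maximum error or a larger standard deviation shows how quickly the required sample size grows.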
