WEEK 11
Inferences about two population parameters
Olmos Isakov & Gulomjon Kosimjonov
WIUT 2023
Lecture outline
• Hypothesis testing for two means (μ1 − μ2)
• Hypothesis testing for two proportions (π1 − π2)
• Hypothesis testing for two variances (σ1²/σ2²)
How to decide which one is better?
Comparing two population means
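[Figure: Population 1 (mean µ1, standard deviation σ1) and Population 2 (mean µ2, standard deviation σ2). A simple random sample of size n1 is selected from Population 1 and a simple random sample of size n2 from Population 2. X̄1 and X̄2 are computed, and then X̄1 − X̄2. The astronomical number of possible X̄1 − X̄2 values forms the sampling distribution, which is centered at µ1 − µ2.]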
Conditions Required for Valid Large-Sample Inferences about (μ1 – μ2)
1. The two samples are randomly selected in an independent manner from the two
target populations.
2. The sample sizes, n1 and n2, are both large (i.e., n1 ≥ 30 and n2 ≥ 30).
Due to the Central Limit Theorem, this condition guarantees that the sampling
distribution of (x̄1 − x̄2) will be approximately normal regardless of the shapes of
the underlying probability distributions of the populations. Also, s1² and s2² will
provide good approximations to σ1² and σ2² when the samples are both large.
Note: If the sample sizes are smaller than 30, the population variances (or standard
deviations) have to be known in order to use the z-test. Otherwise a t-test is used
instead of the z-test; it will be covered next semester.
Hypotheses about two means or proportions

Research question    Two-tailed test    Left-tailed test    Right-tailed test
Hypothesis           Any difference     Pop 1 < Pop 2       Pop 1 > Pop 2
H0                   μ1 − μ2 = 0        μ1 − μ2 = 0         μ1 − μ2 = 0
                     π1 − π2 = 0        π1 − π2 = 0         π1 − π2 = 0
Ha                   μ1 − μ2 ≠ 0        μ1 − μ2 < 0         μ1 − μ2 > 0
                     π1 − π2 ≠ 0        π1 − π2 < 0         π1 − π2 > 0
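The three columns of the table differ only in which tail of the standard normal distribution counts as evidence against H0. A minimal sketch of that mapping is given below; the function name `z_test_p_value` and the labels "two-sided", "less" and "greater" are illustrative choices, not part of the lecture.

```python
from statistics import NormalDist


def z_test_p_value(z, alternative):
    """Return the p-value of a z statistic for H0: difference = 0.

    alternative is one of:
      "two-sided" -> Ha: difference != 0 (two-tailed test)
      "less"      -> Ha: difference < 0  (left-tailed test)
      "greater"   -> Ha: difference > 0  (right-tailed test)
    """
    cdf = NormalDist().cdf  # standard normal cumulative distribution function
    if alternative == "two-sided":
        return 2 * (1 - cdf(abs(z)))
    if alternative == "less":
        return cdf(z)
    if alternative == "greater":
        return 1 - cdf(z)
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
```

In every case H0 is rejected when the returned p-value is at most α.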
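Test statistic

For μ1 − μ2 (assuming unequal variances):
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

For π1 − π2 (assuming large sample sizes):
$z \cong \frac{p_1 - p_2}{\sqrt{P(1 - P)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

here, P is the pooled proportion estimator:
$P = \frac{x_1 + x_2}{n_1 + n_2}$, with $p_1 = \frac{x_1}{n_1}$ and $p_2 = \frac{x_2}{n_2}$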
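The two test statistics can be written as short functions. This is a sketch under the large-sample assumptions stated earlier; the function and argument names are illustrative, and the numbers in the usage lines are made up purely to show the call.

```python
from math import sqrt


def z_two_means(xbar1, xbar2, s1_sq, s2_sq, n1, n2):
    """z statistic for H0: mu1 - mu2 = 0, using the sample variances s1_sq and s2_sq."""
    standard_error = sqrt(s1_sq / n1 + s2_sq / n2)
    return (xbar1 - xbar2) / standard_error


def z_two_proportions(x1, n1, x2, n2):
    """z statistic for H0: pi1 - pi2 = 0, using the pooled proportion P."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # P = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Hypothetical inputs, for illustration only.
z_means = z_two_means(52.0, 50.0, 16.0, 9.0, 40, 36)
z_props = z_two_proportions(30, 100, 30, 100)
print(round(z_means, 2), z_props)
```

The sign of z follows the order of subtraction, so the populations should be labelled consistently with the alternative hypothesis.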
Example
A company conducts an advertising campaign to
increase consumer awareness of its product.
Two surveys were conducted and the results of the two
random samples were as follows:
a) Test at the 5% significance level whether product
awareness has improved after the
campaign.
b) Compute the p-value of the test.
Solution
a. Hypotheses (A: after, B: before):
H0: πA − πB = 0 vs HA: πA − πB > 0
Sample proportions:
$p_B = \frac{68}{150} = 0.453$, $p_A = \frac{65}{140} = 0.464$
Common (pooled) proportion:
$P = \frac{68 + 65}{150 + 140} = 0.459$
Test statistic:
$z \cong \frac{0.464 - 0.453}{\sqrt{0.459(1 - 0.459)\left(\frac{1}{150} + \frac{1}{140}\right)}} = 0.19 < z_{0.05} = 1.645$
We fail to reject H0: there is no evidence of a significant improvement in consumer awareness.
b. p-value = P(z > 0.19) = 1 − 0.5753 = 0.4247 > α = 0.05, so we again fail to reject H0.
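The arithmetic of this example can be checked with a few lines of Python. The sketch below uses only the counts that appear in the solution (68 of 150 before, 65 of 140 after); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts taken from the solution: B = before the campaign, A = after.
x_before, n_before = 68, 150
x_after, n_after = 65, 140
alpha = 0.05

p_before = x_before / n_before
p_after = x_after / n_after
pooled = (x_before + x_after) / (n_before + n_after)

standard_error = sqrt(pooled * (1 - pooled) * (1 / n_before + 1 / n_after))
z = (p_after - p_before) / standard_error

z_critical = NormalDist().inv_cdf(1 - alpha)  # right-tailed critical value
p_value = 1 - NormalDist().cdf(z)             # right-tailed p-value
reject_h0 = z > z_critical

print(f"z = {z:.2f}, critical value = {z_critical:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Because the code does not round z before looking up the probability, its p-value is about 0.426, slightly different from the table-based 0.4247 obtained with z rounded to 0.19. The conclusion is the same.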
Comparing two population variances
This test is used to compare the variability of two populations (e.g. stock volatility of two
companies, measurement variability of two thermometers, etc.)
Test requirements:
1. The two sampled populations are normally distributed.
2. The samples are randomly and independently selected from their respective
populations.
Step 1. State your hypotheses:
$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1$ vs $H_a: \frac{\sigma_1^2}{\sigma_2^2} \neq 1$
Step 2. Compute the test statistic:
$F_{stat} = \frac{s_1^2}{s_2^2}$ (the ratio of two sample variances).
Note: Insert the larger sample variance into the numerator when calculating Fstat.
Step 3. Formulate the decision rule: (We refer to F distribution table)
Reject H0 if $F_{stat} \geq F_{\alpha/2,\ df_1,\ df_2}$ or p-value ≤ α
here, df1 = n1 - 1 (degrees of freedom for numerator)
df2 = n2 - 1 (degrees of freedom for denominator)
Step 4. Provide decision and interpretations.
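Steps 2 and 3 can be sketched in code as follows. The helper names and the two small samples are hypothetical and serve only to show the mechanics; the critical value is assumed to be read from an F table and passed in by the user.

```python
from statistics import variance


def f_statistic(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator).

    The larger sample variance is placed in the numerator, as the note
    in Step 2 requires, so F is always at least 1.
    """
    var_a, var_b = variance(sample_a), variance(sample_b)  # sample variances (n - 1 divisor)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


def reject_equal_variances(f_stat, f_critical):
    """Decision rule of Step 3: reject H0 when F_stat >= F_critical."""
    return f_stat >= f_critical


# Hypothetical data, for illustration only.
sample_1 = [11, 12, 13, 14]
sample_2 = [10, 12, 14, 16, 18]
f_stat, df_num, df_den = f_statistic(sample_1, sample_2)
print(f_stat, df_num, df_den)
```

The returned degrees of freedom follow whichever sample ended up in the numerator, so they can be used directly to look up the critical value at α/2.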
Example
Comfort Limos offers limousine services from Government Center in downtown
to the Airport. The president of the company is considering two routes. One is
via U.S. 25 and the other via I-75. He wants to study the time (in minutes) it takes to drive
to the airport using each route and then compare the results.
Using the 0.10 significance level, is there a difference in the variation in the driving times
for the two routes?
Solution
Let's denote U.S. Route 25 as Population 1 and Interstate 75 as Population 2.
Step 1. Hypotheses:
$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1$ vs $H_a: \frac{\sigma_1^2}{\sigma_2^2} \neq 1$
Step 2. Find the test statistic (T.S.)
df1 = n1 – 1 = 7 – 1 = 6
df2 = n2 – 1 = 8 – 1 = 7
s1² = 80.9 and s2² = 19.1
$F_{stat} = \frac{80.9}{19.1} = 4.23$
Step 3. We reject H0 if Fstat ≥ Fcritical.
With α = 0.10 (so α/2 = 0.05), df1 = 6 and df2 = 7, the F table gives Fcritical = 3.87.
Step 4. Make a decision.
Since Fstat = 4.23 > Fcritical = 3.87, we reject H0.
There is sufficient evidence that the variation in the driving times for the two
routes is different.
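The same decision can be reproduced from the summary statistics in the solution. In this sketch the critical value 3.87 is taken from the F table as in Step 3, not computed; the variable names are illustrative.

```python
# Summary statistics from the solution.
n1, s1_sq = 7, 80.9   # Population 1: U.S. Route 25
n2, s2_sq = 8, 19.1   # Population 2: Interstate 75
f_critical = 3.87     # table value for alpha/2 = 0.05 with df1 = 6, df2 = 7

# Place the larger sample variance in the numerator.
if s1_sq >= s2_sq:
    f_stat, df_num, df_den = s1_sq / s2_sq, n1 - 1, n2 - 1
else:
    f_stat, df_num, df_den = s2_sq / s1_sq, n2 - 1, n1 - 1

reject_h0 = f_stat >= f_critical
print(f"F = {f_stat:.2f} with df = ({df_num}, {df_den})")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

With the rounded variances 80.9 and 19.1 the ratio is about 4.24; the 4.23 in the solution presumably comes from unrounded sample variances. Either value exceeds 3.87, so the conclusion does not change.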
About F distribution
• The F distribution is determined by two parameters: the degrees of freedom in the
numerator (df1) and the degrees of freedom in the denominator (df2).
• The F distribution is continuous.
• The F-statistic cannot be negative.
• The F distribution is positively skewed.
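These properties can be illustrated with a small simulation. The sketch below repeatedly draws two independent samples from the same normal population and records the ratio of their sample variances, which follows an F distribution when the populations are normal with equal variances. The function name, the number of repetitions and the seed are arbitrary choices.

```python
import random
from statistics import mean, median, variance


def simulate_variance_ratios(df1, df2, reps=5000, seed=1):
    """Simulate s1^2 / s2^2 for samples of size df1 + 1 and df2 + 1 from N(0, 1)."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        sample_1 = [rng.gauss(0, 1) for _ in range(df1 + 1)]
        sample_2 = [rng.gauss(0, 1) for _ in range(df2 + 1)]
        ratios.append(variance(sample_1) / variance(sample_2))
    return ratios


ratios = simulate_variance_ratios(6, 7)
print("smallest ratio:", min(ratios))
print("mean:", mean(ratios), "median:", median(ratios))
```

Every simulated ratio is non-negative because it is a quotient of variances, and the mean lies above the median, which is what a long right tail (positive skew) produces.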
Reference
1. Lind et al (ISBN 978-1-260-18750-2), Chapter 11, 12.
2. McClave & Sincich (ISBN 978-0-321-75593-3), Chapter 9.
3. Newbold et al (ISBN 978-0-273-76706-0), Chapter 10.