Nonparametric Tests
04/05/2025 Hagazi Gebre 1
Purpose of the Sign Test
It is a simple and versatile test that requires few
assumptions.
It is based on the binomial distribution. The test
involves simply counting the number of positive or
negative signs in a sequence of signs.
A common use is to compare two samples of paired
data, as an alternative to the paired t-test, using
only the sign of each difference.
Sign Test
A common application of the sign test involves using a
sample of n potential customers to identify a
preference for one of two brands of a product.
The objective is to determine whether there is a
difference in preference between the two items being
compared.
To record the preference data, we use a plus sign if
the individual prefers one brand and a minus sign if
the individual prefers the other brand.
Because the data are recorded as plus and minus
signs, this test is called the sign test.
Example:
Sign Test:
As part of a market research study, a sample of 36
consumers was asked to try two brands of shoes
and indicate a preference. Do the data shown below
indicate a significant difference in the consumer
preferences for the two brands?
18 preferred brand A (+ sign recorded)
12 preferred brand B (- sign recorded)
6 had no preference
The analysis is based on a sample size of 18 + 12 = 30,
because the 6 consumers with no preference are excluded.
Example
Hypotheses
H0: No preference for one brand over the
other exists
Ha: A preference for one brand over the
other exists
Sampling Distribution
Sampling distribution of the number of “+” values
if there is no brand preference:
μ = 0.5(30) = 15
σ = 2.74
Rejection Rule
Using 0.05 level of significance,
Reject H0 if z < -1.96 or z > 1.96
Test Statistic
z = (18 - 15)/2.74 = 3/2.74 = 1.095
Conclusion
Do not reject H0.
There is insufficient evidence in the sample
to conclude that a difference in preference
exists for the two brands of shoes.
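The arithmetic in this example can be checked with a minimal Python sketch. It assumes the large-sample normal approximation used above, with mean 0.5n and standard deviation sqrt(0.25n). The function name sign_test_z and the two-sided p-value are illustrative additions, not part of the original slides.

```python
import math

def sign_test_z(n_plus, n_minus):
    """Large-sample sign test under H0: P(+) = 0.5 (hypothetical helper)."""
    n = n_plus + n_minus                 # "no preference" responses are excluded
    mean = 0.5 * n                       # expected number of plus signs
    sd = math.sqrt(0.25 * n)             # standard deviation of the plus count
    z = (n_plus - mean) / sd             # standardized test statistic
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # normal-approximation p-value
    return mean, sd, z, p_two_sided

mean, sd, z, p = sign_test_z(18, 12)
reject = abs(z) > 1.96                   # rejection rule at the 0.05 level
print(f"mean={mean}, sd={sd:.2f}, z={z:.3f}, reject H0: {reject}")
```

With 18 plus signs and 12 minus signs, z falls inside the interval from -1.96 to 1.96, which matches the decision not to reject H0.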
Wilcoxon Signed-Rank Test
This test is the nonparametric alternative to the
parametric matched-sample test.
The methodology of the parametric matched-
sample analysis requires:
◦ interval data, and
◦ the assumption that the population of differences
between the pairs of observations is normally
distributed.
If the assumption of normally distributed
differences is not appropriate, the Wilcoxon signed-
rank test can be used.
Wilcoxon Signed-Rank Test
It is one example of a nonparametric test.
The Wilcoxon signed-rank test is used
when:
◦ Comparing two populations
◦ Data are continuous, but non-normal
◦ Paired samples
Example: Express Deliveries
Wilcoxon Signed-Rank Test
A firm has decided to select one of two express
delivery services to provide next-day deliveries
to the district offices.
To test the delivery times of the two services,
the firm sends two reports to a sample of 10
district offices, with one report carried by one
service and the other report carried by the
second service.
Do the data (delivery times in hours) on the
next slide indicate a difference in the two
services?
Example: Express Deliveries
District Office   Service A (hrs.)   Service B (hrs.)
Seattle           32                 25
Los Angeles       30                 24
Boston            19                 15
Cleveland         16                 15
New York          15                 13
Houston           18                 15
Atlanta           14                 15
St. Louis         10                  8
Milwaukee          7                  9
Denver            16                 11
Wilcoxon Signed-Rank Test
Preliminary Steps of the Test
• Compute the differences between the paired observations.
• Discard any differences of zero.
• Rank the absolute value of the differences from lowest to highest.
Tied differences are assigned the average ranking of their
positions.
• Give the ranks the sign of the original difference in the data.
• Sum the signed ranks.
. . . next we will determine whether the sum is significantly
different from zero.
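The preliminary steps can be sketched in Python using the Express Deliveries data. This is an illustrative sketch only: the helper name signed_ranks and the variable names are assumptions, and differences are taken as Service A minus Service B.

```python
def signed_ranks(a, b):
    """Return the signed ranks of the nonzero paired differences a - b."""
    diffs = [x - y for x, y in zip(a, b) if x != y]      # discard zero differences
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of tied absolute differences
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1                            # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    # give each rank the sign of its original difference
    return [r if d > 0 else -r for r, d in zip(ranks, diffs)]

service_a = [32, 30, 19, 16, 15, 18, 14, 10, 7, 16]   # delivery times in hours
service_b = [25, 24, 15, 15, 13, 15, 15, 8, 9, 11]

sr = signed_ranks(service_a, service_b)
T = sum(sr)                                            # sum of the signed ranks
print(sr, T)
```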
Example: Express Deliveries
District Office   Differ.   |Diff.| Rank   Signed Rank
Seattle            7        10             +10
Los Angeles        6         9             +9
Boston             4         7             +7
Cleveland          1         1.5           +1.5
New York           2         4             +4
Houston            3         6             +6
Atlanta           -1         1.5           -1.5
St. Louis          2         4             +4
Milwaukee         -2         4             -4
Denver             5         8             +8
Sum of signed ranks (T)                    +44
Example: Express Deliveries
Hypotheses
H0: The delivery times of the two services
are the same; neither offers faster service
than the other.
Ha: Delivery times differ between the two
services; recommend the one with the
smaller times.
Sampling Distribution
Sampling distribution of T if populations are identical:
μ_T = 0
σ_T = 19.62
Example: Express Deliveries
Rejection Rule
Using 0.05 level of significance, Reject H0 if
z < -1.96 or z > 1.96
Test Statistic
z = (T - μ_T)/σ_T = (44 - 0)/19.62 = 2.24
Conclusion
Reject H0. There is sufficient evidence in the
sample to conclude that a difference exists in
the delivery times provided by the two
services. Recommend using Service B.
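The value 19.62 is consistent with the standard large-sample formula for the standard deviation of the signed-rank sum, sqrt(n(n + 1)(2n + 1)/6) with n = 10 pairs. That formula is not stated on the slides, so it is an assumption in the short Python check below, and the function name is hypothetical.

```python
import math

def wilcoxon_signed_rank_z(t, n):
    """z statistic for the signed-rank sum t from n nonzero differences."""
    mu_t = 0.0                                           # mean of T under H0
    sigma_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 6)   # assumed formula for sd of T
    return sigma_t, (t - mu_t) / sigma_t

sigma_t, z = wilcoxon_signed_rank_z(44, 10)
reject = abs(z) > 1.96                                   # 0.05 level, two-tailed
print(f"sigma_T={sigma_t:.2f}, z={z:.2f}, reject H0: {reject}")
```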
Mann-Whitney-Wilcoxon Test
This test is another nonparametric method for
determining whether there is a difference
between two populations.
This test, unlike the Wilcoxon signed-rank test, is
not based on a matched sample.
This test does not require interval data or the
assumption that both populations are normally
distributed.
The only requirement is that the measurement
scale for the data is at least ordinal.
Mann-Whitney-Wilcoxon Test
Instead of testing for the difference between
the means of two populations, this method
tests to determine whether the two
populations are identical.
The hypotheses are:
H0: The two populations are identical
Ha: The two populations are not identical
Example: Westin Freezers
Mann-Whitney-Wilcoxon Test (Large-Sample
Case)
Manufacturer labels indicate the annual
energy cost associated with operating home
appliances such as freezers.
The energy costs for a sample of 10 Westin
freezers and a sample of 10 Brand-X Freezers
are shown on the next slide.
Do the data indicate, using α = 0.05, that a
difference exists in the annual energy costs
associated with the two brands of freezers?
Example: Westin Freezers
Westin Freezers Brand-X Freezers
$55.10 $56.10
54.50 54.70
53.20 54.40
53.00 55.40
55.50 54.10
54.90 56.00
55.80 55.50
54.00 55.00
54.20 54.30
55.20 57.00
Example: Westin Freezers
Mann-Whitney-Wilcoxon Test (Large-Sample Case)
◦ Hypotheses
H0: Annual energy costs for Westin freezers and
Brand-X freezers are the same.
Ha: Annual energy costs differ for the two brands of
freezers.
Mann-Whitney-Wilcoxon Test:
First, rank the combined data from the lowest to
the highest values, with tied values being
assigned the average of the tied rankings.
Then, compute T, the sum of the ranks for the
first sample.
Then, compare the observed value of T to the
sampling distribution of T for identical populations.
The value of the standardized test statistic z will
provide the basis for deciding whether to reject H0.
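The ranking step can be sketched in Python with the Westin Freezers data. This is a minimal illustration: the helper average_ranks and the variable names are assumptions, and Westin is treated as the first sample.

```python
def average_ranks(values):
    """Rank values from lowest to highest; ties get the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1                  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

westin = [55.10, 54.50, 53.20, 53.00, 55.50, 54.90, 55.80, 54.00, 54.20, 55.20]
brand_x = [56.10, 54.70, 54.40, 55.40, 54.10, 56.00, 55.50, 55.00, 54.30, 57.00]

ranks = average_ranks(westin + brand_x)        # rank the combined data
t_westin = sum(ranks[:len(westin)])            # T: rank sum for the first sample
t_brand_x = sum(ranks[len(westin):])
print(t_westin, t_brand_x)
```

The two rank sums always add up to n(n + 1)/2 for the combined sample of size n, which gives a quick check on the ranking.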
Mann-Whitney-Wilcoxon Test
Sampling Distribution of T for Identical Populations
◦ Mean
μ_T = 1/2 n1(n1 + n2 + 1)
◦ Standard Deviation
σ_T = sqrt(1/12 n1 n2 (n1 + n2 + 1))
◦ Distribution Form
Approximately normal, provided
n1 ≥ 10 and n2 ≥ 10
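As a quick check, the mean and standard deviation of T can be computed in Python. The sketch takes the mean as 1/2 n1(n1 + n2 + 1), the form that agrees with the worked value 105 = 1/2(10)(21) in the freezer example, and the standard deviation as sqrt(1/12 n1 n2 (n1 + n2 + 1)). The function name is hypothetical.

```python
import math

def mww_sampling_distribution(n1, n2):
    """Mean and standard deviation of the rank sum T under identical populations."""
    mu_t = 0.5 * n1 * (n1 + n2 + 1)
    sigma_t = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return mu_t, sigma_t

mu_t, sigma_t = mww_sampling_distribution(10, 10)   # two samples of 10 freezers
z = (86.5 - mu_t) / sigma_t                          # T = 86.5 for the first sample
print(f"mu_T={mu_t}, sigma_T={sigma_t:.2f}, z={z:.2f}")
```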
04/05/2025 Hagazi Gebre 21
Example: Westin Freezers
Westin Freezers   Rank    Brand-X Freezers   Rank
$55.10            12      $56.10             19
 54.50             8       54.70              9
 53.20             2       54.40              7
 53.00             1       55.40             14
 55.50            15.5     54.10              4
 54.90            10       56.00             18
 55.80            17       55.50             15.5
 54.00             3       55.00             11
 54.20             5       54.30              6
 55.20            13       57.00             20
Sum of Ranks      86.5    Sum of Ranks       123.5
04/05/2025 Hagazi Gebre 22
Example: Westin Freezers
Mann-Whitney-Wilcoxon Test
◦ Sampling Distribution
Sampling distribution of T if populations are identical:
μ_T = 1/2(10)(21) = 105
σ_T = 13.23
04/05/2025 Hagazi Gebre 23
Example: Westin Freezers
Rejection Rule
Using 0.05 level of significance,
Reject H0 if z < -1.96 or z > 1.96
Test Statistic
z = (T - μ_T)/σ_T = (86.5 - 105)/13.23 = -1.40
Conclusion
Do not reject H0.
There is insufficient evidence in the sample
data to conclude that there is a difference
in the annual energy cost associated with
the two brands of freezers.
04/05/2025 Hagazi Gebre 24
Kruskal-Wallis Test
The Mann-Whitney-Wilcoxon test can be used to test
whether two populations are identical.
The MWW test has been extended by Kruskal and Wallis
for cases of three or more populations.
The Kruskal-Wallis test can be used with ordinal data
as well as with interval or ratio data.
Also, the Kruskal-Wallis test does not require the
assumption of normally distributed populations.
The hypotheses are:
H0: All populations are identical
Ha: Not all populations are identical
04/05/2025 Hagazi Gebre 25
Kruskal-Wallis test…
o The Kruskal-Wallis test was developed jointly by Kruskal
and Wallis (1952) and is named after them.
o It is the nonparametric alternative to the one-way
analysis of variance (one-way ANOVA) and is used
when the assumptions of ANOVA are not met.
04/05/2025 Hagazi Gebre 26
Kruskal-Wallis test…
o In ANOVA, we assume that the scores in each group are
normally distributed and that the variance of the scores is
approximately equal across groups.
o However, the Kruskal-Wallis test makes neither of
these assumptions.
04/05/2025 Hagazi Gebre 27
Kruskal-Wallis test…
o It is used to compare the distributions of scores on a
quantitative variable obtained from two or more groups.
o Thus, it is applied in the same data situation as an ANOVA
for independent samples, except that it is used when the
data are markedly non-normal, when the measurement scale
of the dependent variable is ordinal (not interval or ratio),
or when the sample is too small.
04/05/2025 Hagazi Gebre 28
Kruskal-Wallis test…
o It is used when the populations from which the
samples are drawn are not normally distributed with
equal variances, or when the data for analysis consist
only of ranks
o The Kruskal-Wallis test is a rank-based approach for
three or more unpaired samples