Statistical Test Methods For Hypothesis Testing
ISSN 2229-5518
Abstract - An outlier is an observation that deviates from, or lies far away from, the rest of the data. There are two kinds of outlier methods: tests of
discordance and labeling methods. In this paper, we consider a medical diagnosis data set, find outliers with discordance tests,
and compare the performance of the outlier detection methods. Most outlier detection methods treat an extreme value as an outlier.
Some outlier detection methods do not require statistical tables. The suggested outlier detection methods are compared in terms of
detection sensitivity and the difficulty of analyzing detection performance.
Index Terms — Discordance test, Dixon, Generalized ESD, Grubbs, Hampel, Outlier Detection
• K. Manoj, Research Scholar, Department of Statistics, Manonmaniam Sundaranar University, Tamil Nadu, India, E-mail: manojstatms@gmail.com
• K. Senthamarai Kannan, Professor, Department of Statistics, Manonmaniam Sundaranar University, Tamil Nadu, India, E-mail: senkannan2002@gmail.com
International Journal of Scientific & Engineering Research, Volume 4, Issue 9, September-2013
IJSER © 2013
http://www.ijser.org

1 INTRODUCTION
... marketing, network intrusion detection, weather prediction, pharmaceutical research and exploration in science databases require the detection of outliers.

Barnett and Lewis (1978) defined an outlier as follows: in a sample of moderate size taken from a certain population, it appears that one or two values are surprisingly far away from the main group. D.M. Hawkins (1980) defines an outlier as an observation which deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism. As an example, consider the data set from Laurie Davies (1993):
9.1, 79.5, 26.8, 81.5, 19.1, 15.2, 22.6, 28.8, 24.1, 23.6,
18.6, 17.3, 25.8, 78.8, 23.1, 11.9, 20.1, 20.3, 14.1, 26.5

Fig - 1. Scatter plot for outlier detection (horizontal axis: Index)

Fig. 1 shows that the three points 81.5, 79.5, and 78.8 lie far away from the rest of the data set. The three values are ...

Inherent variability:
This is the expression of the way in which observations intrinsically vary over the population; such variation is a natural feature of the population and is uncontrollable. Thus, for example, measurements of heights of men will reflect the amount of variability indigenous to that population.

Measurement error:
Often we must take measurements on members of a population under study. Inadequacies in the measuring instrument superimpose a further degree of variability on the inherent factor. The rounding of obtained values, or mistakes in recording, compound the measurement error: they are part of it. Some control of this type of variability is possible.

Execution error:
...

Treatment of Outliers
Various outlier methods are tested and compared in this paper. Recently, most people have been affected by blood pressure problems. They have to go to the hospital to check their health condition. The treatment cannot cure the condition in a single day. Each time after the drugs are taken, the blood pressure is checked by a physician. Sometimes the blood pressure measurement is false. This may be due to negligence of the physician or to an error of the measuring instrument. Such a reading is not a valid measure of the treatment. In this situation, using an outlier detection method
is very useful for finding the right treatment.

2 RELATED WORK
Previous studies have applied outlier methods with different methodologies and results. Armin Bohrer (2008) determined critical values for Dixon's outlier test by Monte Carlo simulation for the one-sided and two-sided cases. Barbato G. et al. (2011) discussed several statistical methods currently in use for outlier identification and compared their performance theoretically for typical statistical distributions of experimental data, taking values derived from the distribution of extreme order statistics as reference terms.

Grubbs (1969) gives procedures for determining statistically whether the highest observation, the highest and lowest observations, the two highest observations, the lowest observations, or more of the observations in the sample are statistical outliers. Khrominski (2010) applied various outlier detection methods to medical diagnoses and investigated the usefulness of selected methods in the context of detection speed and performance analysis, as well as the difficulty of automating the performance analysis with the test methods for outlier detection.

Thomas et al. (1988) found that the outlier test procedure influenced the interlaboratory standard deviations (SDs), but not the averages. It was shown that even small differences in the number of outliers detected can change the SD severely. Comparing the outlier test procedures of Hampel, Grubbs and Graf-Henning, it was found that the Hampel test detected the most outliers. Tietjen (1973) proposed a procedure of studentizing or standardizing the residuals, dividing them by their estimated standard deviations, for testing for outliers in simple linear regression.

Paul and Fung (1991) are concerned with procedures for detecting multiple y outliers in linear regression. The generalized extreme studentized residual (GESR) procedure, which controls the type I error rate, is developed; an approximate formula for the percentiles is given for large samples, and more accurate percentiles are tabulated for n ≤ 25. The performance of the procedure is compared with others by Monte Carlo techniques and found to be superior. However, the procedure fails to detect y outliers that occur at high-leverage cases. They therefore suggest a two-phase procedure. In phase 1, a set of suspect observations is identified by GESR and one of the diagnostics, applied sequentially. In phase 2, backward testing is conducted using the GESR procedure to see which of the suspect cases are outliers. They analyzed several examples in their paper.

Quesenberry (1961) discussed the rejection and location of outlying observations, noting that there might be several ways of approaching the problem, depending to a large extent on the object in view. One might, for instance, be primarily interested in pruning the observations in order to secure a more accurate analysis of what was left, for example to obtain the most reliable estimate of a mean. Or one might be particularly interested in identifying the genuinely exceptional observations, in order to gain a new insight into the phenomena under study. In the first case the criterion of what was best might be the effect on the standard error of estimation, in the second case the risk of wrongly deciding whether an observation was exceptional or not. The procedures discussed in that paper start from the basis of risks of misclassification rather than of estimation errors.

McMillan (1971) evaluated the performance of three procedures for the treatment of outliers in normal samples. The first procedure is the repeated application of the usual maximum residual test. The largest value is an outlier if the largest studentized residual exceeds a predetermined value. If one outlier is detected, the test is repeated on the remaining observations, and the process continues until no further outliers are detected. In the second procedure, the two largest observations are declared outliers if the sum of the two largest studentized residuals exceeds a predetermined value. In the third procedure, the two largest values are considered outliers if the ratio of the corrected sum of squares omitting these values to the total corrected sum of squares is less than a critical ratio. The performance of the procedures is evaluated for samples in which two of the values have means different from the common mean of the remainder of the sample.

Tietjen and Moore (1972) described the problems of repeated application and "masking". To overcome these problems they suggested two new statistics: L_k, which is based on the k largest (observed) values, and E_k, which is based on the k largest (in absolute value) residuals. Galpin and Hawkins (1981) presented accurate bounds for the fractiles of the maximum normed residual (which is often used to test for a single outlier) for two-way and three-way layouts. They showed that the second Bonferroni bound is an excellent approximation of the critical value, being much more accurate than the first Bonferroni upper bound. The third Bonferroni (upper) bound is expensive to compute and agrees with the second bound to at least four decimal places for all factor combinations considered.

The approach of Laurie Davies and Ursula Gather (1993) to identifying outliers is to assume that the outliers have a different distribution from the remaining observations. They define outliers in terms of their position relative to the model for the good observations. The outlier identification problem is then the problem of identifying those observations that lie in a so-called outlier region. A more detailed analysis shows that methods based on robust ...
... find the outliers in the data set. In these methods, the final test description states the conditions under which the decision is made whether a checked value is an outlier or not.

There are two kinds of outlier methods, formal and informal, usually called 'tests of discordance' and 'labeling methods' respectively. A detection procedure requires a statistical test, termed here a test of discordance. Such tests are usually based on assuming some well-behaved distribution, and they test whether the extreme value in question is an outlier with respect to that distribution.

3.1 Grubbs Test
Grubbs' test (1969) is used to detect a single outlier in a univariate data set that follows an approximately normal distribution. Grubbs' test is defined for the following two hypotheses:
H0: There is no outlier in the data set
H1: There is at least one outlier in the data set
The general formula for Grubbs' test statistic is:
$$G = \frac{\max_i |Y_i - \bar{Y}|}{s}$$
where $Y_i$ is an element of the data set, and $\bar{Y}$ and $s$ denote the sample mean and the sample standard deviation. The test statistic is the largest absolute deviation from the sample mean in units of the sample standard deviation. The calculated value of G is compared with the critical value for Grubbs' test. When the calculated value is higher than the critical value at the chosen significance level, the corresponding observation can be accepted as an outlier. The significance level ($\alpha$) describes the maximum error level that a person searching for outliers can accept.

... parameter R.
The test has various test statistics. To test whether the largest sample element is an outlier, the sample is arranged in ascending order $X_1 \le X_2 \le \dots \le X_n$, so that the largest sample element is $X_n$. Dixon proposed the following test statistics:
$$R_{10} = \frac{x_n - x_{n-1}}{x_n - x_1}, \quad \text{for } 3 \le n \le 7$$
$$R_{11} = \frac{x_n - x_{n-1}}{x_n - x_2}, \quad \text{for } 8 \le n \le 10$$
$$R_{21} = \frac{x_n - x_{n-2}}{x_n - x_2}, \quad \text{for } 11 \le n \le 13$$
$$R_{22} = \frac{x_n - x_{n-2}}{x_n - x_3}, \quad \text{for } 14 \le n \le 30$$
To test whether the smallest sample element is an outlier, the sample is ordered in descending order, so that the smallest sample element is labeled $X_n$. The choice of test statistic depends on Dixon's criteria.

The value $X_n$ is marked as an outlier when the corresponding statistic R exceeds a critical value, which depends on the selected significance level $\alpha$.

The calculated value of R is compared with the critical value of Dixon's test at the chosen significance level. When the calculated value of R is larger than the critical value, the tested value can be accepted as an outlier.

3.4 Hampel Method
Hampel's test does not require statistical tables. Theoretically, this method is resistant, which means that it is not sensitive to outliers; it also places no restrictions on the size of the data set.
Hampel's test proceeds through the following steps:
i. Compute the median (Me) of the whole data set. The median is the numeric value separating the higher half of a data set from the lower half.
ii. Compute the deviation $r_i$ from the median for every element of the data set:
$$r_i = x_i - Me$$
where $x_i$ is the $i$-th element of the data set, $i = 1, \dots, n$, $n$ is the number of elements in the data set, and $Me$ is the median.
iii. Calculate the median of the absolute deviations, $Me_{|r_i|}$.
iv. Check the condition $|r_i| \ge 4.5\, Me_{|r_i|}$.
If the condition is satisfied, the value from the data set can be accepted as an outlier.

3.5 Generalized ESD Test for Outliers
The generalized extreme Studentized deviate (ESD) test of Rosner (1983) is used to detect one or more outliers in a univariate data set that follows an approximately normal distribution. Only an upper bound on the suspected number of outliers needs to be specified.
Given the upper bound, r, the generalized ESD test essentially performs r separate tests: a test for a single outlier, a test for two outliers, and so on up to r outliers. The generalized ESD test is defined for the hypotheses:
H0: There is no outlier in the data set
Ha: There are up to r outliers in the data set
Test statistic: compute
$$R_i = \frac{\max_i |x_i - \bar{x}|}{s}$$
where $\bar{x}$ and $s$ denote the sample mean and the sample standard deviation.
Significance level: $\alpha$
Critical values:
$$\lambda_i = \frac{(n-i)\, t_{p,\,n-i-1}}{\sqrt{\left(n-i-1+t_{p,\,n-i-1}^{2}\right)(n-i+1)}}$$
with
$$p = 1 - \frac{\alpha}{2(n-i+1)}$$
The number of outliers is determined by finding the largest $i$ such that $R_i > \lambda_i$. Simulation studies by Rosner (1983) indicate that this critical value approximation is very accurate for $n \ge 25$. The test is used when testing for outliers among data coming from a normal distribution where the number of outliers may be higher than expected.

4 Results and Discussion

Fig - 2. Normal probability plot for outlier detection (panel title: Normal Probability Plot of Blood Pr; vertical axis: Sample Quantiles)

In this experiment, we use blood pressure reduction readings taken after administration of the drug. The data were collected from the Tirunelveli Government health center. For test purposes we take only 30 samples from the data set.
The normal probability plot in fig. 1 represents the data with outlier values that deviate from the rest of the data. The plot shows the outlier points lying far away from the other samples. Fig. 2 shows the data after the outlier values have been removed using the outlier detection methods; the remaining data follow an approximately normal distribution.

[Figure: Normal Probability Plot of Blood Pr]
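The blood pressure readings are not reproduced in the paper, so the single-outlier statistics of Section 3 are illustrated here on the 20-value example from the Introduction instead. The following is a minimal Python sketch, not the R scripts used in the experiments; the function names `grubbs_statistic` and `dixon_ratio` are illustrative assumptions. It computes only the test statistics, because the tabulated critical values are not given in the text.

```python
import statistics

# Example values quoted in the Introduction (Laurie Davies, 1993).
data = [9.1, 79.5, 26.8, 81.5, 19.1, 15.2, 22.6, 28.8, 24.1, 23.6,
        18.6, 17.3, 25.8, 78.8, 23.1, 11.9, 20.1, 20.3, 14.1, 26.5]


def grubbs_statistic(values):
    """Return (G, suspect): the largest absolute deviation from the mean,
    in units of the sample standard deviation, and the value attaining it."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    suspect = max(values, key=lambda v: abs(v - mean))
    return abs(suspect - mean) / s, suspect


def dixon_ratio(values):
    """Return (name, R) for testing the largest value, choosing the ratio
    by sample size as listed in the text (3 <= n <= 30)."""
    x = sorted(values)  # x[0] is x_1, x[-1] is x_n
    n = len(x)
    if not 3 <= n <= 30:
        raise ValueError("the ratios are listed only for 3 <= n <= 30")
    if n <= 7:
        return "R10", (x[-1] - x[-2]) / (x[-1] - x[0])
    if n <= 10:
        return "R11", (x[-1] - x[-2]) / (x[-1] - x[1])
    if n <= 13:
        return "R21", (x[-1] - x[-3]) / (x[-1] - x[1])
    return "R22", (x[-1] - x[-3]) / (x[-1] - x[2])


g, suspect = grubbs_statistic(data)
name, r = dixon_ratio(data)
print(f"Grubbs G = {g:.3f} for suspect value {suspect}")
print(f"Dixon {name} = {r:.3f}")
```

Both statistics still have to be compared with tabulated critical values at the chosen significance level. On this sample the Dixon ratio comes out small (about 0.04), because the second and third largest values lie close to the largest one.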
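The two procedures that can flag several values at once, the Hampel rule of Section 3.4 and the generalized ESD statistics of Section 3.5, can be sketched in the same spirit. This is a minimal Python illustration on the 20-value example from the Introduction, not on the blood pressure data. All function names are assumptions made for the example. The Student t quantile $t_{p,\,n-i-1}$ is left as a user-supplied input (`t_value`), to be read from a t table or a statistics package, because the Python standard library does not provide it.

```python
import math
import statistics

# Example values quoted in the Introduction (Laurie Davies, 1993).
data = [9.1, 79.5, 26.8, 81.5, 19.1, 15.2, 22.6, 28.8, 24.1, 23.6,
        18.6, 17.3, 25.8, 78.8, 23.1, 11.9, 20.1, 20.3, 14.1, 26.5]


def hampel_outliers(values, factor=4.5):
    """Flag values whose absolute deviation from the median is at least
    `factor` times the median of the absolute deviations."""
    me = statistics.median(values)
    abs_dev = [abs(v - me) for v in values]
    me_dev = statistics.median(abs_dev)
    return [v for v, d in zip(values, abs_dev) if d >= factor * me_dev]


def esd_statistics(values, r):
    """Return ([R_1, ..., R_r], removed values): after each step the most
    extreme value is removed and the statistic is recomputed."""
    remaining = list(values)
    stats, removed = [], []
    for _ in range(r):
        mean = statistics.mean(remaining)
        s = statistics.stdev(remaining)
        extreme = max(remaining, key=lambda v: abs(v - mean))
        stats.append(abs(extreme - mean) / s)
        removed.append(extreme)
        remaining.remove(extreme)
    return stats, removed


def esd_tail_probability(n, i, alpha):
    """p = 1 - alpha / (2 (n - i + 1)), the level of the t quantile."""
    return 1 - alpha / (2 * (n - i + 1))


def esd_critical_value(n, i, t_value):
    """lambda_i, given t_value = t_{p, n-i-1} supplied by the caller."""
    return (n - i) * t_value / math.sqrt((n - i - 1 + t_value ** 2) * (n - i + 1))


flagged = hampel_outliers(data)
stats, removed = esd_statistics(data, r=3)
print("Hampel flags:", sorted(flagged))
print("ESD removal order:", removed)
print("ESD statistics:", [round(v, 3) for v in stats])
```

On this sample the Hampel rule flags the three large values 78.8, 79.5 and 81.5 in one pass. The ESD sketch removes the same three values in turn, and each $R_i$ would then be compared with the corresponding $\lambda_i$ to decide how many of them are declared outliers.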
Dixon Test        Critical value    1   1   1   3
Hampel                              2   0   0   2
                  0.5%
Quartile Method                     2   0   0   2
Generalized ESD                     2   0   0   2

The other three outlier methods detect the outliers in a single experiment. The major outliers are found easily and quickly in the experiments. In these experiments, the Hampel and Quartile methods need no critical value, whereas the other tests require critical values to detect the outliers.

The tested methods were run as R scripts in the R software. The R code for the Dixon, generalized ESD and Grubbs tests follows the examples of Lukasz Komsta (2006).

5 Conclusions
Table 1 lists the outlier values detected by the five outlier detection methods. The Grubbs and Dixon tests had low sensitivity for outlier detection in the experiment (each test detected a single outlier and found only the minimum or maximum value). The other three methods can identify the maximum number of outliers in a single experiment. The Hampel, Quartile and generalized ESD methods are easy to apply, and their average detection levels are equal in finding the maximum number of outliers. The results reveal that these three methods (Hampel, Quartile and generalized ESD) perform much better than the Grubbs and Dixon tests.

Acknowledgement
The first author acknowledges the UGC for awarding the Scheme of Rajiv Gandhi National Fellowship (RGNF) for providing financial support to carry out this ...

7. ... Outlying Observations in Samples. American Statistical Association and American Society for Quality. Technometrics, Vol. 11, pp. 1-21.
8. Hawkins D. M. (1980). Identification of Outliers, Chapman & Hall, London.
9. Jacqueline S. Galpin and Douglas M. Hawkins (1981). Rejection of a Single Outlier in Two- or Three-Way Layouts, Technometrics, Vol. 23, No. 1, pp. 65-70.
10. Laurie Davies and Ursula Gather (1993). The Identification of Multiple Outliers. Journal of the American Statistical Association, Vol. 88, No. 423, pp. 782-792.
11. Lukasz Komsta (2006). Processing data for outliers. R News, Vol. 6/2.
12. McMillan R. G. (1971). Tests for One or Two Outliers in Normal Samples with Unknown Variance, Technometrics, Vol. 13, No. 1, pp. 87-100.
13. Paul S. R. and Karen Y. Fung (1991). A Generalized Extreme Studentized Residual Multiple-Outlier-Detection Procedure in Linear Regression, Technometrics, Vol. 33, No. 3, pp. 339-348.
14. Quesenberry C. P. and David H. A. (1961). Some Tests for Outliers, Biometrika, Vol. 48, No. 3/4, pp. 379-390.
15. Rorabacher D. B. (1991). Statistical Treatment for Rejection of Deviant Values: Critical Values of Dixon Q Parameter and Related Subrange Ratios at the 95 percent Confidence Level. Anal. Chem. 63, 2, 139-146.
16. Rosner Bernard (1975). On the Detection of Many Outliers, Technometrics, Vol. 17, No. 2 (May, 1975), pp. 221-227.
17. Rosner Bernard (1983). Percentage Points for a Generalized ESD Many-Outlier Procedure, ...