UNIT 4 INTRODUCTION TO MEASURES
OF VARIABILITY*
Structure
4.0 Objectives
4.1 Introduction
4.2 Concept of Variability in Data
4.2.1 Functions of Variability
4.2.2 Absolute Dispersion and Relative Dispersion
4.3 Different Measures of Variability (Types of Measures of Dispersion or
Variability)
4.3.1 The Range
4.3.1.1 Merits and Limitations of the Range
4.3.1.2 Uses of the Range
4.3.2 The Quartile Deviation
4.3.2.1 Merits and Limitations of Quartile Deviation
4.3.2.2 Uses of Quartile Deviation
4.3.3 The Average Deviation or Mean Deviation
4.3.3.1 Merits and Limitations of the Average Deviation
4.3.3.2 Uses of Average Deviation
4.3.4 The Standard Deviation
4.3.4.1 Merits and Limitations of the Standard Deviation
4.3.4.2 Uses of the Standard Deviation
4.3.5 Variance
4.3.5.1 Merits and Demerits of Variance
4.3.5.2 Coefficient of Variation
4.4 Let Us Sum Up
4.5 References
4.6 Key Words
4.7 Answers to Check Your Progress
4.8 Unit End Questions
4.0 OBJECTIVES
After reading this unit, you will be able to:
explain the concept of variability in data;
describe the main properties, limitations and uses of the range, quartile
deviation, average deviation and standard deviation; and
explain variance and the coefficient of variation.
4.1 INTRODUCTION
Look at the two sets of data given below:
Data A: 8, 2, 6, 4, 8, 2, 10, 5, 5, 10 (N= 10, Total = 60, Mean = 6)
Data B: 7, 7, 7, 6, 7, 5, 5, 6, 5, 5 (N = 10, Total = 60, Mean = 6)
* Dr. Usha Kulshreshtha, Faculty, Psychology, University of Rajasthan, Jaipur.
A single glance at data A and B given above tells us that data B is more
homogeneous when compared to data A, which seems to display more
variability. However, to further understand the variability in the data, various
measures of variability need to be computed.
In the previous unit, we discussed the measures of central tendency, viz, mean,
median and mode. These measures give us an average of a set of observations
or data. However, the average cannot be a true representation of data because
of variations in the distribution. As can be seen in the above example, the mean
is the same for data A and data B, but the data vary in terms of their deviation from
the mean. Thus, it is very important to consider the variations in the data or set
of observations. In this unit, we will be explaining the concept of variability
(also known as dispersion) in data. Dispersion actually refers to the variations
that exist within and amongst the scores obtained by a group. In an average, there
is a convergence of scores towards a mid-point in a normal distribution. In
dispersion, we try to see how each score in the group varies from the mean or
the average score. The larger the dispersion, the less homogeneous the group
concerned is; the smaller the dispersion, the more homogeneous the group is.
Dispersion is an important statistic which helps us to know how
far the sample population varies from the universe population. It tells us about
the standard error of the mean.
In the present unit, we will discuss the meaning and significance of variability.
The main properties and limitations of the range, quartile deviation, average
deviation and standard deviation, which are the measures of variability, will also
be discussed. Further, the concepts of variance and coefficient of variation will
also be highlighted.
4.2 CONCEPT OF VARIABILITY IN DATA
Variability in statistics means deviation of scores in a group or series, from
their mean scores. It actually refers to the spread of scores in the group in
relation to the mean. It is also known as dispersion. For instance, in a group of
10 participants who have scored differently on a mathematics test, each
individual varies from the other in terms of the marks that he/she has scored.
These variations can be measured with the help of measures of variability, which
measure the dispersion of the different values around the average value or average
score. Variability or dispersion also means the scatter of the values in a group.
High variability in the distribution means that scores are widely spread and are
not homogeneous. Low variability means that the scores are similar and
homogeneous and are concentrated in the middle.
According to Minium, King and Bear (2001), measures of variability express
quantitatively the extent to which the scores in a distribution scatter around or
cluster together. They describe the spread of an entire set of scores; they do not
specify how far a particular score diverges from the centre of the group. These
measures of variability do not provide information about the shape of a
distribution or the level of performance of a group.
Measures of variability fall under descriptive statistics that describe how
similar a set of scores are to each other. The greater the similarity of the scores
to each other, the lower the measure of variability or dispersion. The less
similar the scores are to each other, the higher the measure of
variability or dispersion. In general, the greater the spread of a distribution,
the larger will be the measure of dispersion. To state it succinctly, the variation
between the data values in a sample is called dispersion. The most commonly
used measures of dispersion are the range and the standard deviation.
In the previous unit, measures of central tendency were discussed. While
measures of central tendencies are indeed very valuable, their usefulness is
rather limited. Although these measures allow us to compare two or more
groups, a measure of central tendency alone is not sufficient for such a
comparison. They do not show how the individual scores are spread
out. Let us take another example, similar to the one that we discussed under the
section on introduction. A math teacher is interested in knowing the performance
of two groups (A and B) of his/her students. He/she gives them a test of 40
points. The marks obtained by the students of groups A and B in the test are as
follows:
Marks of Group A: 5,4,38,38,20,36,17,19,18,5 (N = 10, Total = 200, Mean =
20)
Marks of Group B: 22,18,19,21,20,23,17,20,18,22 (N = 10, Total = 200, Mean
= 20)
The mean score of both the groups is 20; as far as the mean goes, there is no
difference in the performance of the two groups. But there is a difference in the
performance of the two groups in terms of how each individual student varies
in marks from the others. For instance, the test scores of group A are
found to range from 4 to 38 and the test scores of group B range from 17 to 23.
It means that some of the students of group A are doing very well, some are
doing very poorly and the performance of some of the students falls at the
average level. On the other hand, the performance of all the students of the
second group falls at or near the average (mean) of 20. It
is evident from this that the measures of central tendency provide us an
incomplete picture of a set of data. They give an insufficient basis for the comparison
of two or more sets of scores. Thus, in addition to a measure of central
tendency, we need an index of how the scores are scattered around the center
of the distribution. In other words, we need a measure of dispersion or
variability. A measure of central tendency is a summary of scores, and a
measure of dispersion is a summary of the spread of scores. Information about
variability is often as important as that about the central tendency.
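For illustration, the short Python sketch below (not part of the original unit) computes the mean, range and standard deviation for the two hypothetical groups above, using functions from Python's standard statistics module.

import statistics

group_a = [5, 4, 38, 38, 20, 36, 17, 19, 18, 5]
group_b = [22, 18, 19, 21, 20, 23, 17, 20, 18, 22]

for name, marks in [("Group A", group_a), ("Group B", group_b)]:
    mean = statistics.mean(marks)            # measure of central tendency
    spread = max(marks) - min(marks)         # range: highest score minus lowest score
    sd = statistics.pstdev(marks)            # population standard deviation
    print(name, "mean =", mean, "range =", spread, "SD =", round(sd, 2))

# Both groups have a mean of 20, but Group A shows a far larger range and SD;
# this is exactly the information that the mean alone cannot convey.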
The term variability or dispersion is also known as the average of the second
degree, because here we consider the arithmetic mean of the deviations from
the mean of the values of the individual items. To describe a distribution
adequately, therefore, we usually must provide a measure of central tendency
and a measure of variability. Measures of variability are important in statistical
inference. With the help of measures of dispersion, we can know about
fluctuation in random sampling. How much fluctuation will occur in random
sampling? This question is fundamental to every problem in statistical
inference, and it is essentially a question about variability.
The measures of variability are important for the following purposes:
Measures of variability are used to test the extent to which an average
represents the characteristics of the data. If the variation is small, then it
indicates high uniformity of values in the distribution and the average
represents the characteristics of the data. On the other hand, if the variation is
large, then it indicates a lower degree of uniformity and an unreliable average.
Measures of variability help in identifying the nature and cause of
variation. Such information can be useful to control the variation.
Measures of variability help in the comparison of the spread in two or
more sets of data with respect to their uniformity or consistency.
Measures of variability facilitate the use of other statistical techniques
such as correlation, regression analysis, and so on.
4.2.1 Functions of Variability
The major functions of dispersion or variability are as follows:
It is used for calculating other statistics such as analysis of variance,
degree of correlation, regression etc.
It is also used for comparing the variability in the data obtained as in the
case of Socio-Economic Status, income, education etc.
To find out if the average or the mean/median/mode worked out is
reliable. If the variation is small then we could state that the average
calculated is reliable, but if variation is too large, then the average may be
erroneous.
Dispersion gives us an idea if the variability is adversely affecting the
data and thus helps in controlling the variability.
4.2.2 Absolute Dispersion and Relative Dispersion
Measures of dispersion give an estimate and express quantitatively the
deviation of individual scores in a given sample from the mean and median.
Thus, the numerical measures of variability describe how the scores spread or
scatter around a central value.
In measuring dispersion, it is imperative to know the amount of variation
(absolute measure) and the degree of variation (relative measure). In the former
case, we consider the range, mean deviation, standard deviation etc. In the
latter case, we consider the coefficient of range, the coefficient of mean
deviation, the coefficient of variation etc. Thus, there are two broad classes of
the measures of dispersion or variability. They are absolute measure of
dispersion and relative measure of dispersion.
Absolute dispersion usually refers to the standard deviation, a measure of
variation from the mean. The units of standard deviation are the same as for the
data. In other words, absolute measure is expressed in terms of the original
units of a distribution. Therefore, absolute dispersion is not suitable for
comparing the variability of two distributions since the two variables are
expressed and measured in two different units. For instance, the variability in
body height (cm) and body weight (kg) cannot be compared because the
absolute measure (standard deviation) is expressed in cm and kg. The absolute
measure is also not appropriate for two sets of scores expressed in the same
units with wide divergence in means (central value). Nevertheless, absolute
measures are widely used, except in the exceptional cases like above. The
absolute measures include range, mean deviation, standard deviation, and
variance.
Relative dispersion, sometimes called the coefficient of variation, is the result
of dividing the standard deviation by the mean, and it may be presented as a
quotient or as a percentage. Thus, relative measures are computed from the
absolute measures of dispersion and their corresponding central values. A low
value of relative dispersion usually implies that the standard deviation is small
in comparison to the magnitude of the mean. To give an example, if the standard
deviation for a mean of 30 marks is 6.0, then the coefficient of variation will be
6.0 / 30 = 0.2 (i.e., 20%).
If the mean is 60 marks and the standard deviation remains the same as 6.0, the
coefficient of variation will be
6.0 / 60 = 0.1 (10%).
However, with measurements on either side of zero and a mean being close to
zero the relative dispersion could be greater than 1. At the same time, we must
remember that the two distributions in quite a few cases can have the same
variability. Sometimes the distributions may be skewed and not normal with
mean, mode and median at different points in the continuum. These
distributions are called skewed distributions (skewness will be discussed in
detail in Unit 8). It is also possible to have two distributions that have equal
variability but unequal means or different shapes. Thus, the relative measure is
derived from a ratio of an absolute measure like standard deviation and mean
(measure of central value) and is expressed in percentage of the mean. So, the
relative measure is suitable for comparing the variabilities of two sets of scores
given in different units. They are also preferred in comparing two sets of scores
given in the same unit, when the mean widely diverges. The relative measures
include the coefficient of variation, the coefficient of quartile deviation, and the
coefficient of mean deviation.
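As a hedged illustration of this point, the Python sketch below compares a made-up set of body heights (cm) with a made-up set of body weights (kg); the data values are assumptions, not taken from the unit. The standard deviations are in different units and cannot be compared directly, whereas the coefficients of variation can.

import statistics

heights_cm = [150, 155, 160, 165, 170, 175, 180]   # hypothetical heights in cm
weights_kg = [50, 54, 58, 62, 66, 70, 74]          # hypothetical weights in kg

def coefficient_of_variation(values):
    # Relative dispersion: SD expressed as a percentage of the mean.
    return 100 * statistics.pstdev(values) / statistics.mean(values)

print("SD of heights (cm):", round(statistics.pstdev(heights_cm), 2))
print("SD of weights (kg):", round(statistics.pstdev(weights_kg), 2))
print("CV of heights (%):", round(coefficient_of_variation(heights_cm), 2))
print("CV of weights (%):", round(coefficient_of_variation(weights_kg), 2))

# The SDs are absolute measures (cm and kg); the CVs are unit-free percentages
# and can therefore be compared across the two series.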
Check Your Progress I
1) State any one function of variability.
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
2) List the two broad classes of the measures of dispersion.
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
4.3 TYPES OF MEASURES OF DISPERSION OR
VARIABILITY
The measures of variability most commonly used in psychological statistics are
as follows:
1) Range
2) Quartile Deviation
3) Average Deviation or Mean Deviation
4) Standard Deviation
5) Variance
Range and quartile deviation measure dispersion by computing the spread
within which the values fall, whereas average deviation and standard deviation
compute the extent to which the values differ from the average. We will
introduce each and discuss their properties in detail.
4.3.1 The Range (R)
Range can be defined as the difference between the highest and lowest score in
the distribution. This is calculated by subtracting the lowest score from the
highest score in the distribution. The equation is as follows:
Range = Highest Score – Lowest Score (R = H – L)
The range is a rough measure of dispersion because it tells about the spread of
the extreme scores and not the spread of any of the scores in between. For
instance, the range for the distribution 4, 10, 12, 20, 25, 50 will be 50 – 4 = 46.
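A minimal Python sketch of this computation, using the same distribution, is given below.

scores = [4, 10, 12, 20, 25, 50]
data_range = max(scores) - min(scores)   # R = H - L
print(data_range)                        # prints 46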
4.3.1.1 Merits and Limitations of the Range
Some of the merits of range as a measure of variability are explained in this
section.
1) It is easiest to compute when compared with other measures of variability
and its meaning is direct.
2) The range is ideal for preliminary work or in other circumstances where
precision is not an important requirement (Minium et al., 2001).
3) It is quite useful in cases where the purpose is only to find out the extent
of extreme variation, such as temperature, rainfall etc.
4) Range is effectively used in the application of tests of significance with
small samples.
The following are the limitations of the range as a measure of variability:
1) The calculation of range is based only on two extreme values in the data
set and does not consider other values of the data set. Sometimes, the
extreme values of the two different data sets may be the same or similar, but
the two data sets may differ in dispersion.
2) Its value is sensitive to change in sampling. The range varies more with
sampling fluctuation. That is, different samples of the same size from the
same population may have different ranges.
3) Its value is influenced by sample size. In many types of distribution,
including the normal distribution, the range is dependent on sample size, and its
sampling variance increases rapidly with an increase in sample size.
4) Range cannot be used for open-ended class intervals since the highest and
the lowest scores of the distribution are not available and thus the range
cannot be computed.
5) Further mathematical calculations are not possible for range.
6) Range indicates two extreme scores, thus the magnitude or frequency of
intermediate scores is missing.
7) It does not indicate the form of distribution, like skewness, kurtosis, or
modal distribution of scores.
8) A single extreme score may also increase the range disproportionately.
4.3.1.2 Uses of the Range
Range is applied in diverse areas discussed as follows:
Range is used in areas where there are small fluctuations, such as stock
market, rate of exchange, etc.
Range may be used in day-to-day activities like, daily sales in a grocery
store, monthly wages in a factory, etc.
Range is used in weather forecasts, like variation in temperature in a day.
When the researcher is only interested in the extreme scores or total
spread of the scores, range is the most useful measure of variability.
Range can also be used when the data are too scant or too scattered to
justify the use of a more precise measure of variability.
4.3.2 The Quartile Deviation (QD)
Since a large number of values in the data lie in the middle of the frequency
distribution and the range depends on the extremes (outliers) of a distribution, we
need another measure of variability. The quartile deviation is a measure that
depends on the relatively stable central portion of a distribution. According to
Garrett (1966), the quartile deviation is half the scale distance between the 75th and
25th percentiles in a frequency distribution. The entire data is divided into four
equal parts and each part contains 25% of the values. According to Guilford
(1963), the semi-interquartile range is one half the range of the middle 50
per cent of the cases.
On the basis of the above definitions, it can be said that the quartile deviation is half
the distance between Q1 and Q3.
Inter Quartile Range (IQR): The range computed for the middle 50% of the
distribution is the interquartile range. The upper quartile (Q3) and the lower
quartile (Q1) are used to compute the IQR, which is Q3 – Q1. The IQR is not affected by
extreme values.
Semi-Interquartile Range (SIQR) or Quartile Deviation (QD): Half of the
IQR is called the semi-interquartile range. The SIQR is also called the quartile
deviation or QD. Thus, QD is computed as:
QD = (Q3 – Q1)/2
Thus, the quartile deviation is obtained by dividing the IQR by 2. Quartile deviation is
an absolute measure of dispersion and is expressed in the same unit as the
scores.
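The short Python sketch below computes the IQR and QD for a small set of hypothetical scores. Note that textbooks and software locate quartiles by slightly different conventions; the standard library function statistics.quantiles() used here follows its default ('exclusive') method, so its quartile values may differ slightly from a hand calculation.

import statistics

scores = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]   # hypothetical scores

q1, q2, q3 = statistics.quantiles(scores, n=4)  # three cut points: Q1, Q2 (median), Q3
iqr = q3 - q1                                   # inter-quartile range
qd = iqr / 2                                    # quartile deviation (semi-interquartile range)
print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr, "QD =", qd)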
Quartile deviation is closely related to the median because median is
responsive to the number of scores lying below it rather than to their exact
positions, and Q1 and Q3 are defined in the same manner. The median and
quartile deviation have common properties. Both median and quartile deviation
are not affected by extreme values. In a symmetrical distribution, the two
quartiles Q1 and Q3 are at equal distances from the median, that is, Median – Q1 = Q3 – Median.
Thus, like median, quartile deviation covers exactly 50 per cent of observed
values in the data. In a normal distribution, the quartile deviation is called the
probable error or PE. If the distribution has open-ended classes, then the quartile deviation is
the only measure of variability that is reasonable to compute.
In an asymmetric or skewed distribution, Q1 and Q3 are not equidistant from Q2
or median. In such a distribution, the median of the IQR moves towards the
skewed tail. The degree and direction of skewness can be assessed from
quartile deviation and the relative distance between Q1, Q2 and Q3.
Quartile deviation is also related to the kurtosis of a distribution. The smaller the quartile deviation,
the greater is the concentration of scores in the middle of the distribution, thus
making the distribution one with a high peak and a narrow body. The scores that are
widely dispersed indicate a large quartile deviation and thus, long IQR. This
distribution has a low peak and broad body.
4.3.2.1 Merits and Limitations of Quartile Deviation
From the explanation in the above section, it becomes clear that quartile
deviation is easy to understand and compute.
1) Quartile deviation is a better measure of dispersion than range because it
takes into account 50 per cent of the data, unlike the range which is based
on two values of the data, that is highest value and the lowest value.
2) Secondly, quartile deviation is not affected by extreme scores since it
does not consider the 25 per cent of data at the beginning and the 25 per cent
at the end of the data.
3) Lastly, quartile deviation is the only measure of dispersion which can be
computed from the frequency distribution with open-end class.
Despite the major merits of quartile deviation, there are limitations to it as well.
1) The value of quartile deviation is based on the middle 50 per cent of values;
it is not based on all the observations. Thus, it is not regarded as a stable
measure of variability.
2) The value of quartile deviation is affected by sampling fluctuation.
3) The value of quartile deviation is not affected by the distribution of the
individual values within the intervals of the middle 50 per cent of observed
values.
4.3.2.2 Uses of Quartile Deviation
1) When the distribution contains a few very extreme scores.
2) When the median is the measure of central tendency.
3) When our primary interest is to determine the concentration around the
median.
4.3.3 The Average Deviation (AD) or Mean Deviation (MD)
The two measures of variation, range and quartile deviation, which we
discussed in the earlier subsections, do not show how values of the data are
scattered about a central value. The range and QD attempt to capture the spread of
values and not how far the values are from their average. To measure the
variation as the degree to which values within a data set deviate from their mean,
we use the average deviation.
Before discussing average deviation, first we should know about the meaning
of deviation. A deviation score expresses the location of a score by indicating
how many score points it lies above or below the mean of the distribution. A
deviation score may be defined as (X – Mean); that is, when we subtract the
mean from each of the raw scores, the resulting deviation score states the
position of the score relative to the mean.
According to Garrett (1971, as cited in Mangal 2002, page 70) “The average
deviation is the mean of the deviations of all of the separate scores in a series
taken from their mean”. According to Guilford (1963), average deviation can be
described as an average or mean of all the deviations when the algebraic signs
are not taken into account.
Average is a central value and thus, some deviations will be positive (+) and
some may be negative (-). Mean deviation ignores the signs of the deviations,
and it considers all the deviations to be positive. This is so because the
algebraic sum of all the deviations from the mean equals zero. MD or AD is
the arithmetic mean of the differences of the values from the average. The average
is either the arithmetic mean or the median. It is a measure of variability that
takes into account the variations of all the scores in the data. It is an absolute
measure of dispersion and is expressed in the same unit as the raw scores.
The calculation of average deviation is easy; therefore, it is a popular measure.
When we calculate average deviation, equal weight is given to each observed
value and thus it indicates how far each observation lies from the mean. AD or
MD can be obtained from any of the measures of central tendency, that is
mean, median, or mode. Since the mode is ill defined, AD or MD is computed
about the mean or the median. AD or MD calculated about the median will be less
than the AD or MD about the mean or mode. For a symmetrical distribution,
MD about the mean and MD about the median cover 57.5 per cent of the observations
of the data. Thus, a small value of MD will indicate less variability. AD is thus
somewhat larger (57.5 per cent of the cases) than QD (50 per cent of the cases).
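A minimal Python sketch of the average deviation about the mean, using hypothetical scores, is given below.

import statistics

scores = [10, 12, 15, 18, 20]                       # hypothetical scores
mean = statistics.mean(scores)
abs_deviations = [abs(x - mean) for x in scores]    # signs of the deviations are ignored
average_deviation = sum(abs_deviations) / len(scores)
print("Mean =", mean, "AD about the mean =", average_deviation)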
4.3.3.1 Merits and Limitations of the Average Deviation
The main merits of AD are as follows:
1) AD or MD is easy to understand and compute.
2) It is based on all observations, unlike R or QD.
3) It is an accurate measure of variability since it averages the absolute
deviations.
4) It is less affected by extreme observations.
5) It is based on the average; thus, it is a better measure for comparing the
formation of different distributions.
The main limitations of average deviation are as follows:
1) While calculating average deviation, we ignore the plus and minus signs and
consider all deviations as positive. Because of this mathematical property, it is
not used in inferential statistics.
2) AD cannot be computed for open-end classes.
3) It tends to increase with the size of the sample.
4.3.3.2 Uses of Average Deviation
Despite the limitations, AD or MD is used by economists and business
statisticians. It is also used in computing the distribution of personal wealth in a
community or a nation. According to the National Bureau of Economic Research,
MD is the most practical measure of dispersion to be used for this purpose
(Mohanty and Misra, 2016, p. 133). AD or MD is also preferred in the following
situations:
1) When it is desired to weight all deviations from the mean according to
their size.
2) When the standard deviation is unduly influenced by the presence of
extreme scores.
3) When the distribution of scores is not close to normal.
4.3.4 The Standard Deviation (SD)
The term standard deviation was first used in writing by Karl Pearson in 1894.
The standard deviation of a population is denoted by ‘σ’ (Greek letter sigma) and
that for a sample is ‘s’. A useful property of SD is that unlike variance it is
expressed in the same unit as the data. It is the most widely used measure of
variability. The standard deviation indicates the average distance of all the
scores around the mean. It is the positive square root of the mean of squared
deviations of all the scores from the mean. It is the positive square root of
variance. It is also called the ‘root mean square deviation’. Mangal (2002, page
71) defined standard deviation “as the square root of the average of the
squares of the deviations of each score from the mean”. SD is an absolute
measure of dispersion and it is the most stable and reliable measure of
variability.
Standard deviation shows how much variation there is from the mean. SD is
calculated from the mean only. If the standard deviation is low, it means that the
data are close to the mean. A high standard deviation indicates that the data are
spread out over a large range of values. Standard deviation may serve as a
measure of uncertainty. If you want to test a theory, or in other words, want to
decide whether measurements agree with a theoretical prediction, the standard
deviation provides the information. If the mean of the measurements lies far
from the prediction, relative to the standard deviation, then the theory being tested
probably needs to be revised. A mean with a smaller standard deviation is more
reliable than a mean with a large standard deviation. A smaller SD shows the homogeneity of
the data. The value of standard deviation is based on every observation in a set
of data. It is the only measure of dispersion capable of algebraic treatment
therefore, SD is used in further statistical analysis.
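For illustration, the Python sketch below (hypothetical data, not from the unit) computes the standard deviation directly from its definition and checks the result against the standard library.

import math
import statistics

scores = [4, 6, 8, 10, 12]                              # hypothetical scores
mean = statistics.mean(scores)
squared_deviations = [(x - mean) ** 2 for x in scores]
sd = math.sqrt(sum(squared_deviations) / len(scores))   # square root of the mean squared deviation
print("SD computed from the definition:", sd)
print("SD from the library (pstdev):", statistics.pstdev(scores))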
4.3.4.1 Merits and Limitations of the Standard Deviation
The main merits of using standard deviation are as follows:
1) It is widely used because it is the best measure of variation by virtue of its
mathematical characteristics.
2) It is based on all the observations of the data.
3) It gives an accurate estimate of population parameter when compared
with other measures of variation.
4) SD is least affected by sampling fluctuations.
5) It is also possible to calculate a combined SD, which is not possible with
other measures.
6) Further statistics can be applied on the basis of SD like, correlation,
regression, tests of significance, etc.
7) Coefficient of variation is based on mean and SD. It is the most
appropriate method to compare variability of two or more distributions.
The limitations of SD are as follows:
1) While calculating standard deviation, more weight is given to extreme
values and less to those near the mean. When we calculate SD, we take
deviations from the mean (X – M) and square these obtained deviations.
Therefore, large deviations, when squared, are proportionally greater than
small deviations. For example, the deviations 2 and 10 are in the ratio of
1:5, but their squares 4 and 100 are in the ratio 1:25.
2) It is difficult to compute as compared to other measures of dispersion.
4.3.4.2 Uses of Standard Deviation
The uses of standard deviation are as follows:
1) SD is used when one requires a more reliable and accurate measure of
variability but it is recommended when the distribution is normal or near
to normal.
2) It is used when further statistics like, correlation, regression, tests of
significance, etc. have to be computed.
4.3.5 Variance
The term variance was used to describe the square of the standard deviation by
R.A. Fisher in 1913. The concept of variance is of great importance in
advanced work where it is possible to split the total variation into several parts, each
attributable to one of the factors causing variations in their original series.
Variance is a measure of the dispersion of a set of data points around their
mean value. It is the mathematical expectation of the average squared deviations
from the mean. The variance (s²) or mean square (MS) is the arithmetic mean
of the squared deviations of individual scores from their mean. In other words,
it is the mean of the squared deviations of scores. Variance is expressed as V =
SD².
The variance and the closely related standard deviation are measures that
indicate how the scores are spread out in a distribution. In other words, they are
measures of variability. The variance is computed as the average squared
deviation of each number from its mean.
Calculating the variance is an important part of many statistical applications
and analysis. It is a good absolute measure of variability and is useful in
computation of Analysis of Variance (ANOVA) to find out the significance of
differences between sample means.
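A brief Python sketch (hypothetical data) showing that the variance is simply the square of the standard deviation is given below.

import statistics

scores = [4, 6, 8, 10, 12]                 # hypothetical scores
sd = statistics.pstdev(scores)             # population standard deviation
variance = statistics.pvariance(scores)    # population variance
print("SD =", sd)
print("SD squared =", round(sd ** 2, 10))  # equals the variance, up to rounding
print("Variance =", variance)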
4.3.5.1 Merits and Demerits of Variance
The main merits of variance are listed as follows:
1) It is rigidly defined and based on all observations.
2) It is amenable to further algebraic treatment.
3) It is not affected by sampling fluctuations.
4) It is less erratic.
The main demerits of variance are listed as follows:
1) It is difficult to understand and calculate.
2) It gives greater weight to extreme values.
4.3.5.2 Co-efficient of Variation (CV)
The relative measure corresponding to SD is the coefficient of variation. It is a
relative measure of dispersion developed by Karl Pearson. When we want to
compare the variations (dispersion) of two different series, relative measures of
standard deviation must be calculated. This is known as co-efficient of
variation or the co-efficient of SD. It is defined as the SD expressed as a
percentage of the mean. The coefficient of variation represents the ratio of the
standard deviation to the mean, and it is a useful statistic for comparing the
degree of variation from one data series to another, even if the means are
drastically different from each other. Thus, it is more suitable than SD or
variance. It is given as a percentage and is used to compare the consistency or
variability of two or more data series.
The formula for computing coefficient of variation is as follows:
CV = 100 × σ / M
Where,
CV = Coefficient of variation
σ = Standard deviation
M = Mean
To understand the computation with the help of an example,
if the standard deviation of marks obtained by 10 students in a class
test in English is 10 and the mean is 79, then,
CV = 100 × 10 / 79
= 1000 / 79
= 12.66 (approximately)
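The worked example above can be reproduced with a few lines of Python, as sketched below.

sd = 10.0                      # standard deviation of the marks
mean = 79.0                    # mean of the marks
cv = 100 * sd / mean           # coefficient of variation as a percentage
print(round(cv, 2))            # prints 12.66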
Check Your Progress II
1) What is range?
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
2) List the merits of quartile deviation.
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
3) What is variance?
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
......................................................................................................................
4.4 LET US SUM UP
To summarise, the measures of central tendency are not sufficient to describe
data. Thus, to describe a distribution adequately, we must provide a measure of
variability or dispersion. The measures of variability are summary figures that
express quantitatively the extent to which scores in a distribution scatter
around or cluster together. The measures of variability are range, quartile
deviation, average deviation, standard deviation and variance. Range is easy to
calculate and useful for preliminary work. But this is based on extreme items
only, and does not consider intermediate scores. Thus, it is not useful as a
descriptive measure. Quartile deviation is related to the median in its
properties. It takes into consideration the number of scores lying above or
below the outer quartile points but not their magnitude. This is useful with
open-ended distributions. The average deviation takes into account the exact
position of each score in the distribution. The mean deviation gives a more
precise measure of the spread of scores but is mathematically inadequate. The
average deviation is less affected by sampling fluctuation. The standard
deviation is the most stable measure of variability. Standard deviation shows
how much the score departs from the mean. It is expressed in original scores
unit. Thus, it is the most widely used measure of variability in descriptive
statistics. The variance (s²) or mean square (MS) is the arithmetic mean of the
squared deviations of individual scores from their mean. In other words, it is the
mean of the squared deviations of scores. The relative measure corresponding to
SD is the coefficient of variation. It is a useful measure of relative variation.
4.5 REFERENCES
Garrett, H.E. (1981). Statistics in Psychology and Education (10th edition). Bombay: Vakils, Feffer and Simons Ltd.
McBride, D.M. (2018). The Process of Statistical Analysis in Psychology. USA: Sage.
Minium, E.W., King, B.M. & Bear, G. (2001). Statistical Reasoning in Psychology and Education (3rd edition). Singapore: John Wiley & Sons, Inc.
Mohanty, B. & Misra, S. (2016). Statistics for Behavioural and Social Sciences. New Delhi: Sage.
4.6 KEY WORDS
Average Deviation or Mean Deviation: A measure of dispersion that gives
the average difference (ignoring plus and minus sign) between each item and
the mean.
Dispersion: The spread or variability in a set of data.
Deviation: The difference between a raw score and the mean.
Quartile Deviation: A measure of dispersion that can be obtained by dividing
the difference between Q3 and Q1 by two.
Range: Difference between the largest and smallest value in a data set.
Standard deviation: The square root of the variance in a series.
Variance: Variance is a measure of the dispersion of a set of data points
around their mean value. It is a mathematical expectation of the average
squared deviations from the mean.
4.7 ANSWERS TO CHECK YOUR PROGRESS
Check Your Progress I
1) State any one function of variability
Variability is used for calculating other statistics such as analysis of
variance, degree of correlation, regression etc.
2) List the two broad classes of the measures of dispersion.
Absolute dispersion
Relative dispersion.
Check Your Progress II
1) What is range?
Range can be defined as the difference between the highest and lowest
score in the distribution.
2) List the merits of quartile deviation.
The merits of quartile deviation are as follows:
Quartile deviation is a better measure of dispersion than range
because it takes into account 50 percent of the data, unlike the range
which is based on two values of the data, that is highest value and the
lowest value.
Secondly, quartile deviation is not affected by extreme scores since it
does not consider 25 percent data from the beginning and 25 percent
from the end of the data.
Lastly, quartile deviation is the only measure of dispersion which can
be computed from the frequency distribution with open-end class.
3) What is variance?
Variance is a measure of the dispersion of a set of data points around their
mean value. It is a mathematical expectation of the average squared deviations
from the mean. The variance (s²) or mean square (MS) is the arithmetic mean
of the squared deviations of individual scores from their means. In other words,
it is the mean of the squared deviation of scores.
4.8 UNIT END QUESTIONS
1) Explain the concept and significance of variability.
2) Discuss the merits and limitations of range and quartile deviation.
3) List the merits and limitations of standard deviation.
4) Elucidate average deviation or mean deviation.
5) Explain the coefficient of variation with an example.