UNIT 4             INTRODUCTION TO MEASURES
OF VARIABILITY*
Structure
4.0    Objectives
4.1    Introduction
4.2    Concept of Variability in Data
       4.2.1 Functions of Variability
       4.2.2 Absolute Dispersion and Relative Dispersion
4.3    Different Measures of Variability (Types of Measures of Dispersion or
       Variability)
       4.3.1 The Range
             4.3.1.1 Merits and Limitations of the Range
             4.3.1.2 Uses of the Range
       4.3.2 The Quartile Deviation
             4.3.2.1 Merits and Limitations of Quartile Deviation
             4.3.2.2 Uses of Quartile Deviation
       4.3.3 The Average Deviation or Mean Deviation
             4.3.3.1 Merits and Limitations of the Average Deviation
             4.3.3.2 Uses of Average Deviation
       4.3.4 The Standard Deviation
             4.3.4.1 Merits and Limitations of the Standard Deviation
             4.3.4.2 Uses of the Standard Deviation
       4.3.5 Variance
             4.3.5.1 Merits and Demerits of Variance
             4.3.5.2 Coefficient of Variance
4.4    Let Us Sum Up
4.5    References
4.6    Key Words
4.7    Answers to Check Your Progress
4.8    Unit End Questions
4.0 OBJECTIVES
After reading this unit, you will be able to:
     explain the concept of variability in data;
     describe the main properties, limitations and uses of the range, quartile
      deviation, average deviation and standard deviation; and
     explain variance and coefficient of variance.
4.1 INTRODUCTION
Look at the two data sets given below:
Data A: 8, 2, 6, 4, 8, 2, 10, 5, 5, 10 (N= 10, Total = 60, Mean = 6)
Data B: 7, 7, 7, 6, 7, 5, 5, 6, 5, 5 (N = 10, Total = 60, Mean = 6)
* Dr. Usha Kulshreshtha, Faculty, Psychology, University of Rajasthan, Jaipur.
Measures of        A single glance at data A and B given above tells us that data B is more
Central Tendency   homogeneous than data A, which seems to display more
and Variability
                   variability. To understand the variability in the data further, however,
                   various measures of variability need to be computed.
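The comparison can be checked numerically. The short Python sketch below (the variable names are illustrative and not part of the original unit) computes the mean of data A and data B and, as a first rough indication of spread, the distance between the lowest and the highest value in each set.

```python
from statistics import mean

# The two data sets given in the introduction.
data_a = [8, 2, 6, 4, 8, 2, 10, 5, 5, 10]
data_b = [7, 7, 7, 6, 7, 5, 5, 6, 5, 5]

# Both sets have the same mean ...
mean_a = mean(data_a)
mean_b = mean(data_b)

# ... but the distance between the lowest and highest value differs.
spread_a = max(data_a) - min(data_a)
spread_b = max(data_b) - min(data_b)

print("Data A: mean =", mean_a, "| lowest to highest =", spread_a)
print("Data B: mean =", mean_b, "| lowest to highest =", spread_b)
```

Both sets have a mean of 6, yet data A spans 8 points while data B spans only 2.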
                   In the previous unit, we discussed the measures of central tendency, viz., mean,
                   median and mode. These measures give us an average of a set of observations
                   or data. However, the average alone cannot be a true representation of the data
                   because of variations in the distribution. As can be seen in the above example,
                   the mean is the same for data A and data B, but the data vary in terms of their
                   deviation from the mean. Thus, it is very important to consider the variations
                   in the data or set of observations. In this unit, we will explain the concept of
                   variability (also known as dispersion) in data. Dispersion refers to the
                   variations that exist within and amongst the scores obtained by a group. An
                   average reflects the convergence of scores towards a mid-point, as in a normal
                   distribution. In dispersion, we try to see how each score in the group varies
                   from the mean or the average score. The larger the dispersion, the less
                   homogeneous the group concerned; the smaller the dispersion, the more
                   homogeneous the group. Dispersion is an important statistic that helps us to
                   know how far the sample varies from the population (universe). It also tells us
                   about the standard error of the mean.
                   In the present unit, we will discuss the meaning and significance of variability.
                   The main properties and limitations of the range, quartile deviation, average
                   deviation and standard deviation that are the measures of variability will also
                   be discussed. Further, the concept of variance and coefficient of variance will
                   also be highlighted.
                   4.2 CONCEPT OF VARIABILITY IN DATA
                   Variability in statistics means deviation of scores in a group or series, from
                   their mean scores. It actually refers to the spread of scores in the group in
                   relation to the mean. It is also known as dispersion. For instance, in a group of
                   10 participants who have scored differently on a mathematics test, each
                   individual varies from the other in terms of the marks that he/she has scored.
                   These variations can be measured with the help of measures of variability, which
                   measure the dispersion of the different values around the average value or
                   average score. Variability or dispersion also means the scatter of the values in a group.
                   High variability in the distribution means that scores are widely spread and are
                   not homogeneous. Low variability means that the scores are similar and
                   homogeneous and are concentrated in the middle.
                   According to Minium, King and Bear (2001), measures of variability express
                   quantitatively the extent to which the scores in a distribution scatter about or
                   cluster together. They describe the spread of an entire set of scores; they do not
                   specify how far a particular score diverges from the centre of the group. These
                   measures of variability do not provide information about the shape of a
                   distribution or the level of performance of a group.
Measures of variability fall under descriptive statistics and describe how
similar the scores in a set are to each other. The greater the similarity of the
scores to each other, the lower the measure of variability or dispersion. The
less similar the scores are to each other, the higher the measure of variability
or dispersion. In general, the greater the spread of a distribution, the larger
the measure of dispersion. To state it succinctly, the variation between the
data values in a sample is called dispersion. The most commonly used measures
of dispersion are the range and the standard deviation.
In the previous unit, measures of central tendency were discussed. While
measures of central tendency are indeed very valuable, their usefulness is
rather limited. Although we can use them to compare two or more groups, a
measure of central tendency alone is not sufficient for such a comparison,
because it does not show how the individual scores are spread out. Let us take
another example, similar to the one that we discussed in the introduction. A
mathematics teacher is interested in knowing the performance of two groups
(A and B) of his/her students. He/she gives them a test of 40 points. The marks
obtained by the students of groups A and B in the test are as follows:
Marks of Group A: 5, 4, 38, 38, 20, 36, 17, 19, 18, 5 (N = 10, Total = 200, Mean = 20)
Marks of Group B: 22, 18, 19, 21, 20, 23, 17, 20, 18, 22 (N = 10, Total = 200, Mean = 20)
The mean score of both groups is 20, so as far as the mean goes there is no
difference in the performance of the two groups. But there is a difference in the
performance of the two groups in terms of how each individual student varies
in marks from the others. For instance, the test scores of group A range from
4 to 38, whereas the test scores of group B range from 17 to 23. It means that
some of the students of group A are doing very well, some are doing very
poorly, and the performance of some of the students falls at the average level.
On the other hand, the performance of all the students of the second group
falls at or near the average (mean), that is, 20. It is evident from this that
measures of central tendency provide us with an incomplete picture of a set of
data. They give an insufficient basis for the comparison of two or more sets of
scores. Thus, in addition to a measure of central tendency, we need an index of
how the scores are scattered around the centre of the distribution. In other
words, we need a measure of dispersion or variability. A measure of central
tendency is a summary of the scores, and a measure of dispersion is a summary
of the spread of the scores. Information about variability is often as important
as that about the central tendency.
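The teacher's comparison can be sketched in Python as follows. The helper name summarise and the band of 3 marks around the mean are illustrative assumptions, chosen only to show how many students in each group score close to the average.

```python
from statistics import mean

group_a = [5, 4, 38, 38, 20, 36, 17, 19, 18, 5]
group_b = [22, 18, 19, 21, 20, 23, 17, 20, 18, 22]


def summarise(marks, band=3):
    """Return the mean, lowest mark, highest mark and the number of marks
    lying within `band` points of the mean (the band is an assumption)."""
    m = mean(marks)
    near_mean = sum(1 for x in marks if abs(x - m) <= band)
    return {"mean": m, "lowest": min(marks), "highest": max(marks),
            "near_mean": near_mean}


summary_a = summarise(group_a)
summary_b = summarise(group_b)
print("Group A:", summary_a)
print("Group B:", summary_b)
```

Both groups have a mean of 20, but only 4 of the 10 students in group A score within 3 marks of it, against all 10 students in group B.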
Variability or dispersion is also known as an average of the second order,
because here we consider the arithmetic mean of the deviations of the values of
the individual items from their mean. To describe a distribution adequately,
therefore, we usually must provide both a measure of central tendency and a
measure of variability. Measures of variability are important in statistical
inference. With the help of measures of dispersion, we can know about
fluctuation in random sampling. How much fluctuation will occur in random
sampling? This question is fundamental to every problem in statistical
inference, and it is a question about variability.
The measures of variability are important for the following purposes:
    Measures of variability are used to test the extent to which an average
     represents the characteristics of a data. If the variation is small then it
     indicates high uniformity of values in the distribution and the average
     represents the characteristics of the data. On the other hand, if the variation is
     large then it indicates a lower degree of uniformity and an unreliable average.
                       Measures of variability help in identifying the nature and cause of
                        variation. Such information can be useful to control the variation.
                       Measures of variability help in the comparison of the spread in two or
                        more sets of data with respect to their uniformity or consistency.
                       Measures of variability facilitate the use of other statistical techniques
                        such as correlation, regression analysis, and so on.
                   4.2.1 Functions of Variability
                   The major functions of dispersion or variability are as follows:
                       It is used for calculating other statistics such as analysis of variance,
                        degree of correlation, regression etc.
                       It is also used for comparing the variability in the data obtained as in the
                        case of Socio-Economic Status, income, education etc.
                       It is used to find out whether the average (mean, median or mode)
                        worked out is reliable. If the variation is small, then we could state
                        that the average calculated is reliable, but if the variation is too
                        large, then the average may not be representative of the data.
                       Dispersion gives us an idea if the variability is adversely affecting the
                        data and thus helps in controlling the variability.
                   4.2.2 Absolute Dispersion and Relative Dispersion
                   Measures of dispersion give an estimate of, and express quantitatively, the
                   deviation of individual scores in a given sample from a central value such as
                   the mean or the median. Thus, they are numerical measures of the spread or
                   scatter of scores around a central value.
                   In measuring dispersion, it is imperative to know the amount of variation
                   (absolute measure) and the degree of variation (relative measure). In the former
                   case, we consider the range, mean deviation, standard deviation etc. In the
                   latter case, we consider the coefficient of range, the coefficient of mean
                   deviation, the coefficient of variation etc. Thus, there are two broad classes of
                   measures of dispersion or variability: absolute measures of dispersion and
                   relative measures of dispersion.
                   Absolute dispersion usually refers to the standard deviation, a measure of
                   variation from the mean. The units of standard deviation are the same as those
                   of the data. In other words, an absolute measure is expressed in terms of the
                   original units of a distribution. Therefore, absolute dispersion is not suitable
                   for comparing the variability of two distributions when the two variables are
                   expressed and measured in different units. For instance, the variability in
                   body height (cm) and body weight (kg) cannot be compared directly because the
                   absolute measure (standard deviation) is expressed in cm for one and in kg for
                   the other. The absolute measure is also not appropriate for comparing two sets
                   of scores expressed in the same units when their means (central values)
                   diverge widely. Nevertheless, absolute measures are widely used, except in
                   cases like those above. The absolute measures include range, mean deviation,
                   standard deviation, and variance.
Relative dispersion, sometimes called the coefficient of variation, is the result
of dividing the standard deviation by the mean, and it may be presented as a
quotient or as a percentage. Thus, relative measures are computed from the
absolute measures of dispersion and their corresponding central values. A low
value of relative dispersion usually implies that the standard deviation is small
in comparison to the magnitude of the mean. To give an example, if the standard
deviation for a mean of 30 marks is 6.0, then the coefficient of variation will be
                                                         6.0 / 30 = 0.2 (20%).
If the mean is 60 marks and the standard deviation remains the same at 6.0, the
coefficient of variation will be
                                                         6.0 / 60 = 0.1 (10%).
However, when measurements lie on either side of zero and the mean is close to
zero, the relative dispersion can be greater than 1. At the same time, we must
remember that two distributions can, in quite a few cases, have the same
variability. Some distributions are not normal: their mean, mode and median
lie at different points on the continuum. Such distributions are called skewed
distributions (skewness will be discussed in detail in Unit 8). It is also
possible to have two distributions that have equal variability but unequal
means or different shapes. Thus, the relative measure is derived as a ratio of
an absolute measure, like the standard deviation, to the mean (a measure of
central value) and is expressed as a percentage of the mean. So, the relative
measure is suitable for comparing the variabilities of two sets of scores given
in different units. It is also preferred for comparing two sets of scores given
in the same unit when their means diverge widely. The relative measures
include the coefficient of variation, the coefficient of quartile deviation, and
the coefficient of mean deviation.
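The worked example for the coefficient of variation can be reproduced with a few lines of Python. This is only a minimal sketch: the function name coefficient_of_variation is illustrative, and the check for a zero mean is an added assumption, since the ratio is undefined in that case.

```python
def coefficient_of_variation(sd, mean):
    """Relative dispersion: the standard deviation divided by the mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a mean of zero")
    return sd / mean


# Same standard deviation (6.0 marks), two different means.
cv_mean_30 = coefficient_of_variation(6.0, 30)
cv_mean_60 = coefficient_of_variation(6.0, 60)

print(f"Mean 30, SD 6.0: CV = {cv_mean_30:.2f} ({cv_mean_30:.0%})")
print(f"Mean 60, SD 6.0: CV = {cv_mean_60:.2f} ({cv_mean_60:.0%})")
```

Because the result is a pure ratio without units, the same function can be applied to two variables measured in different units, such as height in cm and weight in kg.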
Check Your Progress I
1)   State any one function of variability.
2)   List the two broad classes of the measures of dispersion.
4.3 TYPES OF MEASURES OF DISPERSION OR
    VARIABILITY
The measures of variability most commonly used in psychological statistics are
as follows:
                   1)   Range
                   2)   Quartile Deviation
                   3)   Average Deviation or Mean Deviation
                   4)   Standard Deviation
                   5)   Variance
                   Range and quartile deviation measure dispersion by computing the spread
                   within which the values fall, whereas average deviation and standard deviation
                   compute the extent to which the values differ from the average. We will
                   introduce each measure and discuss its properties in detail.
                   4.3.1 The Range (R)
                   Range can be defined as the difference between the highest and lowest score in
                   the distribution. This is calculated by subtracting the lowest score from the
                   highest score in the distribution. The equation is as follows:
                                         Range = Highest Score – Lowest Score (R = H – L)
                   The range is a rough measure of dispersion because it tells about the spread of
                   the extreme scores and not the spread of any of the scores in between. For
                   instance, the range for the distribution 4,10,12,20, 25, 50 will be 50 - 4 = 46.
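The formula R = H – L translates directly into code. In the sketch below, the function name score_range and the second distribution are illustrative assumptions; the second distribution is made up only to show that two sets with very different middle scores can share the same range.

```python
def score_range(scores):
    """Range = highest score minus lowest score (R = H - L)."""
    if not scores:
        raise ValueError("at least one score is required")
    return max(scores) - min(scores)


# The distribution used in the text.
r_text = score_range([4, 10, 12, 20, 25, 50])

# A hypothetical distribution with the same extremes but different
# scores in between: the range cannot tell the two apart.
r_other = score_range([4, 27, 27, 27, 27, 50])

print("Range of the distribution in the text:", r_text)
print("Range of the hypothetical distribution:", r_other)
```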
                   4.3.1.1 Merits and Limitations of the Range
                   Some of the merits of range as a measure of variability are explained in this
                   section.
                   1)   It is easiest to compute when compared with other measures of variability
                        and its meaning is direct.
                   2)   The range is ideal for preliminary work or in other circumstances where
                        precision is not an important requirement (Minium et al., 2001).
                   3)   It is quite useful in case where the purpose is only to find out the extent
                        of extreme variation, such as temperature, rainfall etc.
                   4)   Range is effectively used in the application of tests of significance with
                        small samples.
                   The following are the limitations of the range as a measure of variability:
                   1)   The calculation of range is based only on the two extreme values in the
                        data set and does not consider the other values. Sometimes, the
                        extreme values of two different data sets may be the same or similar,
                        but the two data sets may differ in dispersion.
                   2)   Its value is sensitive to changes in sampling. The range varies more with
                        sampling fluctuation; that is, different samples of the same size from
                        the same population may have different ranges.
                   3)   Its value is influenced by sample size. In many types of distribution,
                        including the normal distribution, the range is dependent on sample size.
                        The sampling variance increases rapidly with increase in sample size.
4)        Range cannot be used for open-ended class intervals since the highest and
          the lowest scores of the distribution are not available and thus the range
          cannot be computed.
5)        Further mathematical calculations are not possible for range.
6)        Range indicates two extreme scores, thus the magnitude or frequency of
          intermediate scores is missing.
7)        It does not indicate the form of distribution, like skewness, kurtosis, or
          modal distribution of scores.
8)        A single extreme score may also increase the range disproportionately.
4.3.1.2 Uses of the Range
Range is applied in diverse areas, as discussed below:
         Range is used in areas where there are small fluctuations, such as stock
          market, rate of exchange, etc.
         Range may be used in day-to-day activities like, daily sales in a grocery
          store, monthly wages in a factory, etc.
         Range is used in weather forecasts, like variation in temperature in a day.
         When the researcher is only interested in the extreme scores or total
          spread of the scores, range is the most useful measure of variability.
         Range can also be used when the data are too scant or too scattered to
          justify the use of a more appropriate measure of variability.
4.3.2 The Quartile Deviation (QD)
Since a large number of values in the data lie in the middle of the frequency
distribution and the range depends on the extremes (outliers) of a distribution,
we need another measure of variability. The quartile deviation is a measure that
depends on the relatively stable central portion of a distribution. According to
Garrett (1966), the quartile deviation is half the scale distance between the
75th and 25th percentiles in a frequency distribution. The entire data is divided
into four equal parts and each part contains 25% of the values. According to
Guilford (1963), the semi-interquartile range is one half of the range of the
middle 50 per cent of the cases.
On the basis of above definitions, it can be said that quartile deviation is half
the distance between Q1 and Q3.
Inter Quartile Range (IQR): The range computed for the middle 50% of the
distribution is the interquartile range. The upper quartile (Q3) and the lower
quartile (Q1) are used to compute the IQR, that is, IQR = Q3 – Q1. The IQR is
not affected by extreme values.
Semi-Interquartile Range (SIQR) or Quartile Deviation (QD): Half of the
IQR is called the semi-interquartile range. The SIQR is also called the quartile
deviation or QD. Thus, QD is computed as:
                              QD = (Q3 – Q1) / 2
                   Thus, quartile deviation is obtained by dividing IQR by 2. Quartile deviation is
                   an absolute measure of dispersion and is expressed in the same unit as the
                   scores.
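As a rough illustration, the IQR and QD can be computed in Python as shown below. The function name quartile_deviation is illustrative. The quartiles are taken from the standard library function statistics.quantiles with its default 'exclusive' method, which is an assumption: this unit does not prescribe a method for locating Q1 and Q3, and other conventions give slightly different values for small data sets.

```python
from statistics import quantiles


def quartile_deviation(scores):
    """Return (Q1, Q3, IQR, QD) for a list of scores.

    Assumption: quartiles come from statistics.quantiles with its
    default 'exclusive' method; other conventions may differ slightly.
    """
    q1, _q2, q3 = quantiles(scores, n=4)   # the three cut points Q1, Q2, Q3
    iqr = q3 - q1                          # interquartile range
    qd = iqr / 2                           # semi-interquartile range
    return q1, q3, iqr, qd


scores = [4, 10, 12, 20, 25, 50]
q1, q3, iqr, qd = quartile_deviation(scores)
print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr, "QD =", qd)
```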
                   Quartile deviation is closely related to the median because the median is
                   responsive to the number of scores lying below it rather than to their exact
                   positions, and Q1 and Q3 are defined in the same manner. The median and
                   quartile deviation have common properties. Neither the median nor the quartile
                   deviation is affected by extreme values. In a symmetrical distribution, the two
                   quartiles Q1 and Q3 are at equal distance from the median, that is,
                   Median – Q1 = Q3 – Median. Thus, in such a distribution, the interval
                   median ± QD covers exactly 50 per cent of the observed values in the data. In a
                   normal distribution, quartile deviation is called the Probable Error or PE. If
                   the distribution has open-ended classes, then quartile deviation is the only
                   measure of variability that is reasonable to compute.
                   In an asymmetric or skewed distribution, Q1 and Q3 are not equidistant from Q2
                   or median. In such a distribution, the median of the IQR moves towards the
                   skewed tail. The degree and direction of skewness can be assessed from
                   quartile deviation and the relative distance between Q1, Q2 and Q3.
                   Quartile deviation also gives an indication of the peakedness (kurtosis) of a
                   distribution. The smaller the quartile deviation, the greater the concentration
                   of scores in the middle of the distribution, giving the distribution a high peak
                   and a narrow body. Scores that are widely dispersed give a large quartile
                   deviation and thus a wide IQR; such a distribution has a low peak and a broad body.
                   4.3.2.1   Merits and Limitations of Quartile Deviation
                   From the explanation in the above section, it becomes clear that quartile
                   deviation is easy to understand and compute.
                   1)    Quartile deviation is a better measure of dispersion than range because it
                         takes into account 50 per cent of the data, unlike the range which is based
                         on two values of the data, that is highest value and the lowest value.
                   2)    Secondly, quartile deviation is not affected by extreme scores since it
                         does not consider 25 per cent data from the beginning and 25 per cent
                         from the end of the data.
                   3)    Lastly, quartile deviation is the only measure of dispersion which can be
                         computed from the frequency distribution with open-end class.
                   Despite the major merits of quartile deviation, there are limitations to it as well.
                   1)   The value of quartile deviation is based on the middle 50 per cent of the
                        values; it is not based on all the observations. Thus, it is not regarded
                        as a stable measure of variability.
                   2)   The value of quartile deviation is affected by sampling fluctuation.
                   3)   The value of quartile deviation is not affected by the distribution of the
                        individual values within the intervals of middle 50 percent observed
                        values.
4.3.2.2 Uses of Quartile Deviation
Quartile deviation is used in the following situations:
1)   When the distribution contains a few very extreme scores.
2)   When the median is the measure of central tendency.
3)   When our primary interest is to determine the concentration around the
     median.
4.3.3 The Average Deviation (AD) or Mean Deviation (MD)
The two measures of variation, range and quartile deviation, which we
discussed in the earlier subsections, do not show how values of the data are
scattered about a central value. R and QD attempt to compute spread of values
and not compute how far the values are from their average. To measure the
variation, as a degree to which values within a data deviate from their mean,
we use average deviation.
Before discussing average deviation, we should first know the meaning of
deviation. A deviation score expresses the location of a score by indicating
how many score points it lies above or below the mean of the distribution. A
deviation score may be defined as (X – Mean); that is, when we subtract the
mean from each of the raw scores, the resulting deviation scores state the
position of the scores relative to the mean.
According to Garrett (1971, as cited in Mangal 2002, page 70), “The average
deviation is the mean of the deviation of all of the separate scores in a series
taken from their mean”. According to Guilford (1963), average deviation can be
described as an average or mean of all the deviations when the algebraic signs
are not taken into account.
The average is a central value and thus some deviations will be positive (+) and
some will be negative (-). Mean deviation ignores the signs of the deviations
and treats all the deviations as positive. This is necessary because the
algebraic sum of all the deviations from the mean equals zero. MD or AD is the
arithmetic mean of the absolute differences of the values from the average. The
average is either the arithmetic mean or the median. It is a measure of
variability that takes into account the variations of all the scores in the data.
It is an absolute measure of dispersion and is expressed in the same unit as the
raw scores.
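The reasoning above can be sketched in Python using data A from the introduction. The function name mean_deviation and its about parameter are illustrative assumptions. The sketch first shows that the signed deviations from the mean cancel out, and then averages the absolute deviations about the mean and about the median.

```python
from statistics import mean, median


def mean_deviation(scores, about="mean"):
    """Mean of the absolute deviations about the mean or the median."""
    if about == "mean":
        centre = mean(scores)
    elif about == "median":
        centre = median(scores)
    else:
        raise ValueError("about must be 'mean' or 'median'")
    return sum(abs(x - centre) for x in scores) / len(scores)


scores = [8, 2, 6, 4, 8, 2, 10, 5, 5, 10]   # data A from the introduction

# Signed deviations (X - Mean) cancel out, so their sum is zero.
deviations = [x - mean(scores) for x in scores]
signed_sum = sum(deviations)

md_about_mean = mean_deviation(scores, "mean")
md_about_median = mean_deviation(scores, "median")

print("Sum of signed deviations:", signed_sum)
print("MD about the mean:", md_about_mean)
print("MD about the median:", md_about_median)
```

For this data set the two values happen to coincide; in general, the MD about the median is never larger than the MD about the mean.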
The calculation of average deviation is easy; therefore, it is a popular measure.
When we calculate average deviation, equal weight is given to each observed
value, and thus it indicates how far, on average, the observations lie from the
mean. In principle, AD or MD can be obtained from any of the measures of
central tendency, that is, mean, median, or mode. Since the mode is ill defined,
however, AD or MD is computed about the mean or median. AD or MD calculated
about the median will not exceed the AD or MD about the mean or mode. For a
symmetrical distribution, MD about the mean and MD about the median coincide,
and in a normal distribution the interval mean ± MD covers about 57.5 per cent
of the observations of the data. Thus, a small value of MD indicates less
variability. AD is thus somewhat larger (covering 57.5 per cent of the cases)
than QD (covering 50 per cent of the cases).
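The 57.5 per cent and 50 per cent figures can be checked for a normal distribution. The sketch below relies on a standard result that is not stated in this unit, namely that the mean deviation of a normal distribution equals σ√(2/π), or roughly 0.80σ. It uses NormalDist from Python's standard library with mean 0 and standard deviation 1; the variable names are illustrative.

```python
from math import pi, sqrt
from statistics import NormalDist

std_normal = NormalDist()            # mean 0, standard deviation 1

# Mean deviation and quartile deviation expressed in SD units.
md_in_sd_units = sqrt(2 / pi)                # standard result for a normal curve
qd_in_sd_units = std_normal.inv_cdf(0.75)    # Q3 of the standard normal curve

# Proportion of cases lying within mean +/- MD and within mean +/- QD.
coverage_md = std_normal.cdf(md_in_sd_units) - std_normal.cdf(-md_in_sd_units)
coverage_qd = std_normal.cdf(qd_in_sd_units) - std_normal.cdf(-qd_in_sd_units)

print(f"MD = {md_in_sd_units:.4f} SD, covering {coverage_md:.1%} of cases")
print(f"QD = {qd_in_sd_units:.4f} SD, covering {coverage_qd:.1%} of cases")
```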
4.3.3.1 Merits and Limitations of the Average Deviation
The main merits of AD are as follows:
1)   AD or MD is easy to understand and compute.
                   2)   It is based on all observations, unlike R or QD.
                   3)   It is an accurate measure of variability since it averages the absolute
                        deviations.
                   4)   It is less affected by extreme observations than the standard deviation.
                   5)   Since it is based on an average, it is a better measure for comparing the
                        formation of different distributions.
                   The main limitations of average deviation are as follows:
                   1)   While calculating average deviation we ignore the plus and minus signs and
                        treat all deviations as positive. Because of this mathematical property, it
                        is not used in inferential statistics.
                   2)   AD cannot be computed for open-end classes.
                   3)   It tends to increase with the size of the sample.
                   4.3.3.2 Uses of Average Deviation
                   Despite the limitations, AD or MD is used by economists and business
                   statisticians. It is also used in computing the distribution of personal wealth in a
                   community or a nation. According to the National Bureau of Economic Research,
                   MD is the most practical measure of dispersion to be used for this purpose
                   (Mohanty and Misra, 2016, pg. 133).
                   AD or MD is also used in the following situations:
                   1)   When it is desired to weight all deviations from the mean according to
                        their size.
                   2)   When the standard deviation is unduly influenced by the presence of
                        extreme scores.
                   3)   When the distribution of the scores is not close to normal.
                   4.3.4 The Standard Deviation (SD)
                   The term standard deviation was first used in writing by Karl Pearson in
                   1894. The standard deviation of a population is denoted by ‘σ’ (Greek
                   letter sigma) and that of a sample by ‘s’. A useful property of SD is
                   that, unlike variance, it is expressed in the same unit as the data. It is
                   the most widely used measure of variability. The standard deviation
                   indicates, roughly, the average distance of the scores from the mean. It
                   is the positive square root of the mean of the squared deviations of all
                   the scores from the mean, that is, the positive square root of the
                   variance. It is also called the ‘root mean square deviation’. Mangal
                   (2002, page 71) defined standard deviation as “the square root of the
                   average of the squares of the deviations of each score from the mean”. SD
                   is an absolute measure of dispersion and is the most stable and reliable
                   measure of variability.
                   Standard deviation shows how much variation there is from the mean. SD is
                   calculated from the mean only. A low standard deviation means that the
                   data lie close to the mean; a high standard deviation indicates that the
                   data are spread out over a large range of values. Standard deviation may
                   also serve as a measure of uncertainty. If you want to test a theory, in
                   other words, to decide whether measurements agree with a theoretical
                   prediction, the standard deviation of the measurements provides the
                   yardstick. If the mean of the measurements lies many standard deviations
                   away from the predicted value, then the theory being tested probably
                   needs to be revised. A mean with a smaller standard deviation is more
                   reliable than a mean with a large standard deviation. A smaller SD shows
                   the homogeneity of the data. The value of the standard deviation is based
                   on every observation in a set of data. Unlike the range, QD and AD, it is
                   capable of further algebraic treatment; therefore, SD is used in further
                   statistical analysis.
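The definition above can be sketched step by step in Python. This is a minimal illustration, assuming the population form of the SD (division by N) that matches the definition used in this unit, and using Data A and Data B from Section 4.1. The function name standard_deviation is chosen here for illustration only.

```python
from math import sqrt

def standard_deviation(scores):
    """Population SD: positive square root of the mean squared deviation."""
    n = len(scores)
    m = sum(scores) / n                             # mean (M)
    squared_devs = [(x - m) ** 2 for x in scores]   # (X - M)^2 for each score
    variance = sum(squared_devs) / n                # mean of squared deviations
    return sqrt(variance)

# Data A and Data B as given in Section 4.1 (both have mean 6)
data_a = [8, 2, 6, 4, 8, 2, 10, 5, 5, 10]
data_b = [7, 7, 7, 6, 7, 5, 5, 6, 5, 5]

print(f"SD of Data A = {standard_deviation(data_a):.2f}")  # 2.79
print(f"SD of Data B = {standard_deviation(data_b):.2f}")  # 0.89
```

Both data sets have the same mean, yet their SDs differ: the larger value for Data A reflects scores that lie farther from the mean.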
4.3.4.1 Merits and Limitations of the Standard Deviation
The main merits of using standard deviation are as follows:
1)   It is widely used because it is the best measure of variation by virtue of its
     mathematical characteristics.
2)   It is based on all the observations of the data.
3)   It gives a more accurate estimate of the population parameter than
     other measures of variation.
4)   SD is least affected by sampling fluctuations.
5)   It is also possible to calculate a combined SD, which is not possible
     with other measures.
6)   Further statistics can be computed on the basis of SD, such as
     correlation, regression, tests of significance, etc.
7)   The coefficient of variation is based on the mean and SD. It is the most
     appropriate method to compare the variability of two or more distributions.
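The combined SD mentioned in merit 5 can be illustrated with a short Python sketch. It assumes the standard formula for two groups, in which each group's variance is increased by the squared distance of its mean from the combined mean. The function name combined_sd and the scores are hypothetical and used only for illustration.

```python
from math import sqrt
from statistics import mean, pstdev

def combined_sd(n1, m1, s1, n2, m2, s2):
    """Combined SD of two groups from their sizes, means and SDs."""
    mc = (n1 * m1 + n2 * m2) / (n1 + n2)   # combined mean
    d1, d2 = m1 - mc, m2 - mc              # group mean minus combined mean
    pooled = (n1 * (s1**2 + d1**2) + n2 * (s2**2 + d2**2)) / (n1 + n2)
    return sqrt(pooled)

# Hypothetical scores for two groups (illustrative only)
group_1 = [4, 6, 8, 10, 12]
group_2 = [10, 12, 14, 16, 18, 20]

sd_c = combined_sd(len(group_1), mean(group_1), pstdev(group_1),
                   len(group_2), mean(group_2), pstdev(group_2))

# Check against the SD computed directly from the pooled scores
sd_direct = pstdev(group_1 + group_2)
print(f"Combined SD = {sd_c:.4f}, direct SD = {sd_direct:.4f}")
```

The two printed values agree, which shows that the combined SD can be obtained from the group sizes, means and SDs alone, without returning to the raw scores.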
The limitations of SD are as follows:
1)   While calculating standard deviation, more weight is given to extreme
     values and less to those near the mean. When we calculate SD, we take
     the deviations from the mean (X − M) and square them. Therefore, large
     deviations, when squared, become proportionally larger than small
     deviations. For example, the deviations 2 and 10 are in the ratio 1:5,
     but their squares 4 and 100 are in the ratio 1:25.
2)   It is more difficult to compute than other measures of dispersion.
4.3.4.2 Uses of Standard Deviation
The uses of standard deviation are as follows:
1)   SD is used when one requires a more reliable and accurate measure of
     variability. It is recommended when the distribution is normal or near
     normal.
2)   It is used when further statistics such as correlation, regression, tests
     of significance, etc. have to be computed.
4.3.5 Variance
The term variance was used to describe the square of the standard deviation by
R.A. Fisher in 1918. The concept of variance is of great importance in
advanced work where it is possible to split the total variation into several
parts, each attributable to one of the factors causing variation in the
original series. Variance is a measure of the dispersion of a set of data
points around their mean value. It is the mathematical expectation of the
squared deviations from the mean. The variance (s²) or mean square (MS) is the
arithmetic mean of the squared deviations of individual scores from their
mean. In other words, it is the mean of the squared deviations of scores.
Variance is expressed as V = SD².
                   The variance and the closely related standard deviation are measures that
                   indicate how the scores are spread out in a distribution. In other words, they are
                   measures of variability. The variance is computed as the average squared
                   deviation of each number from its mean.
                   Calculating the variance is an important part of many statistical applications
                   and analysis. It is a good absolute measure of variability and is useful in
                   computation of Analysis of Variance (ANOVA) to find out the significance of
                   differences between sample means.
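The idea of splitting the total variation into parts, which underlies ANOVA, can be sketched in Python as follows. The three groups of scores are hypothetical and serve only as an illustration; the sketch shows that the total sum of squared deviations equals the between-groups part plus the within-groups part.

```python
# Hypothetical scores for three groups (illustrative only)
groups = {
    "Group 1": [2, 4, 6],
    "Group 2": [5, 7, 9],
    "Group 3": [8, 10, 12],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = sum(all_scores) / len(all_scores)

# Total variation: squared deviations of every score from the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# Between-groups part: group means around the grand mean, weighted by size
ss_between = sum(
    len(s) * (sum(s) / len(s) - grand_mean) ** 2 for s in groups.values()
)

# Within-groups part: scores around their own group mean
ss_within = sum(
    (x - sum(s) / len(s)) ** 2 for s in groups.values() for x in s
)

print(ss_total, ss_between, ss_within)   # 78.0 54.0 24.0
print("Variance of all scores:", ss_total / len(all_scores))
```

Dividing the total sum of squares by the number of scores gives the variance of the pooled scores, in line with the definition of variance as the mean of the squared deviations.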
                   4.3.5.1 Merits and Demerits of Variance
                   The main merits of variance are listed as follows:
                   1)   It is rigidly defined and based on all observations.
                   2)   It is amenable to further algebraic treatment.
                   3)   It is relatively little affected by sampling fluctuations.
                   4)   It is less erratic.
                   The main demerits of variance are listed as follows:
                   1)   It is difficult to understand and calculate, since it is expressed
                        in squared units rather than in the units of the data.
                   2)   It gives greater weight to extreme values.
                   4.3.5.2 Coefficient of Variation (CV)
                   The relative measure corresponding to SD is the coefficient of variation.
                   It is a relative measure of dispersion developed by Karl Pearson. When we
                   want to compare the variation (dispersion) of two different series, a
                   relative measure of standard deviation must be calculated. This is known
                   as the coefficient of variation or the coefficient of SD. It is defined
                   as the SD expressed as a percentage of the mean. The coefficient of
                   variation thus represents the ratio of the standard deviation to the
                   mean, and it is a useful statistic for comparing the degree of variation
                   from one data series to another, even if the means are drastically
                   different from each other. For such comparisons, it is more suitable than
                   SD or variance. It is given as a percentage and is used to compare the
                   consistency or variability of two or more data series.
                   The formula for computing coefficient of variation is as follows:
                   V = 100 × σ/ M
                   Where,
                   V = Coefficient of variation
                   σ = Standard deviation
                   M = Mean
To understand the computation with the help of an example, if the standard
deviation of marks obtained by 10 students in a class test in English is 10
and the mean is 79, then,
     V = 100 × 10 / 79
       = 1000 / 79
       = 12.66 (per cent)
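The same computation can be sketched in Python. The function name coefficient_of_variation is an assumption made for illustration, and the second series (SD = 8, mean = 40) is hypothetical, added only to show how two series are compared.

```python
def coefficient_of_variation(sd, mean):
    """SD expressed as a percentage of the mean: V = 100 * SD / M."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * sd / mean

# English test from the worked example: SD = 10, Mean = 79
cv_english = coefficient_of_variation(10, 79)

# Hypothetical second series for comparison (illustrative only)
cv_other = coefficient_of_variation(8, 40)

print(f"CV (English) = {cv_english:.2f}")   # 12.66
print(f"CV (other)   = {cv_other:.2f}")     # 20.00
```

Although the hypothetical second series has the smaller SD, its coefficient of variation is larger, so it is relatively more variable (less consistent) than the English marks.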
Check Your Progress II
1)   What is range?
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
2)   List the merits of quartile deviation.
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
3)   What is variance?
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
     ......................................................................................................................
                   4.4 LET US SUM UP
                   To summarise, the measures of central tendency are not sufficient to
                   describe data. Thus, to describe a distribution adequately, we must
                   provide a measure of variability or dispersion. The measures of
                   variability are summary figures that express quantitatively the extent to
                   which scores in a distribution scatter or cluster together. The measures
                   of variability are range, quartile deviation, average deviation, standard
                   deviation and variance. Range is easy to calculate and useful for
                   preliminary work. But it is based on the extreme items only and does not
                   consider intermediate scores. Thus, it is not useful as a descriptive
                   measure. Quartile deviation is related to the median in its properties.
                   It takes into consideration the number of scores lying above or below the
                   outer quartile points but not their magnitude. It is useful with
                   open-ended distributions. The average deviation takes into account the
                   exact position of each score in the distribution. The mean deviation
                   gives a more precise measure of the spread of scores but is
                   mathematically inadequate. The average deviation is less affected by
                   sampling fluctuations. The standard deviation is the most stable measure
                   of variability. Standard deviation shows how much the scores depart from
                   the mean. It is expressed in the original score units. Thus, it is the
                   most widely used measure of variability in descriptive statistics. The
                   variance (s²) or mean square (MS) is the arithmetic mean of the squared
                   deviations of individual scores from their mean. In other words, it is
                   the mean of the squared deviations of scores. The relative measure
                   corresponding to SD is the coefficient of variation. It is a useful
                   measure of relative variation.
                   4.5 REFERENCES
                   Garrett, H. E. (1981). Statistics in Psychology and Education (10th
                   edition). Bombay: Vakils Feffer and Simons Ltd.
                   McBride, D. M. (2018). The Process of Statistical Analysis in Psychology.
                   USA: Sage.
                   Minium, E. W., King, B. M., & Bear, G. (2001). Statistical Reasoning in
                   Psychology and Education (3rd edition). Singapore: John Wiley & Sons, Inc.
                   Mohanty, B., & Misra, S. (2016). Statistics for Behavioural and Social
                   Sciences. New Delhi: Sage.
                   4.6 KEY WORDS
                   Average Deviation or Mean Deviation: A measure of dispersion that gives
                   the average difference (ignoring plus and minus sign) between each item and
                   the mean.
                   Dispersion: The spread or variability in a set of data.
                   Deviation: The difference between a raw score and the mean.
                   Quartile Deviation: A measure of dispersion obtained by dividing the
                   difference between Q3 and Q1 by two.
                   Range: The difference between the largest and the smallest value in a set
                   of data.
                   Standard Deviation: The positive square root of the variance of a series.
                   Variance: A measure of the dispersion of a set of data points around
                   their mean value. It is the mathematical expectation of the squared
                   deviations from the mean.
4.7 ANSWERS TO CHECK YOUR PROGRESS
Check Your Progress I
1)   State any one function of variability.
     Variability is used for calculating other statistics such as analysis of
     variance, degree of correlation, regression etc.
2)   List the two broad classes of the measures of dispersion.
     Absolute dispersion
     Relative dispersion.
Check Your Progress II
1)   What is range?
     Range can be defined as the difference between the highest and lowest
     score in the distribution.
2)   List the merits of quartile deviation.
     The merits of quartile deviation are as follows:
     •    Quartile deviation is a better measure of dispersion than range
          because it takes into account 50 percent of the data, unlike the range
          which is based on two values of the data, that is, the highest value and
          the lowest value.
     •    Secondly, quartile deviation is not affected by extreme scores since it
          does not consider 25 percent of the data from the beginning and 25
          percent from the end of the data.
     •    Lastly, quartile deviation is the only measure of dispersion which can
          be computed from a frequency distribution with open-end classes.
3)   What is variance?
     Variance is a measure of the dispersion of a set of data points around
     their mean value. It is the mathematical expectation of the squared
     deviations from the mean. The variance (s²) or mean square (MS) is the
     arithmetic mean of the squared deviations of individual scores from their
     mean. In other words, it is the mean of the squared deviations of scores.
                                                                                               103
Measures of
Central Tendency   4.8 UNIT END QUESTIONS
and Variability
                   1)   Explain the concept and significance of variability.
                   2)   Discuss the merits and limitations of range and quartile deviation.
                   3)   List the merits and limitations of standard deviation.
                   4)   Elucidate average deviation or mean deviation.
                   5)   Explain the coefficient of variation with an example.