4.0 MEASURES OF CENTRAL TENDENCY
For any set of data, a measure of central tendency is a measure of how the data tends to a
central value. It is a typical value such that each individual value in the distribution tends
to cluster around it.
In other words, it is an index used to describe the concentration of values near the middle
of the distribution. Measures of central tendency are very useful parameters because they
describe properties of populations. The word ‘average’, which is commonly used, refers to
the ‘centre’ of a data set. It is a single value intended to represent the distribution as a
whole. Three types of average are common: the mean, the median and the mode.
4.1 THE MEAN
The mean is the most commonly used and the most important of the three averages.
There are various types of mean; here we consider the arithmetic mean, the geometric
mean and the harmonic mean.
(A) The arithmetic mean
The arithmetic mean of a series of data is obtained by taking the ratio of the total (sum) of
all the data in the series to the number of data points in the series. The arithmetic mean, or
simply the mean, is the representative value that each element would receive if the total
were shared equally among them.
(a) The mean for ungrouped data
(i) For a set of n items $x_1, x_2, \dots, x_n$, the mean $\bar{x}$ (read: "x bar") is
\[ \bar{x} = \frac{\sum x}{n} \]
where $\sum$ (read: "sigma"), an uppercase Greek letter, denotes summation over the values of
x, and n is the number of values under consideration.
Example
Find the mean of the numbers 3, 4, 6, 7.
Solution
$x_1 = 3$, $x_2 = 4$, $x_3 = 6$, $x_4 = 7$, $n = 4$
\[ \bar{x} = \frac{\sum x}{n} = \frac{3 + 4 + 6 + 7}{4} = \frac{20}{4} = 5 \]
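The calculation above can be sketched in a few lines of Python. This is only an illustration; the function name `arithmetic_mean` is an assumption made for this sketch and does not come from the notes.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Data from the example above: 3, 4, 6, 7
print(arithmetic_mean([3, 4, 6, 7]))  # 5.0
```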
4.1.1 The Coding Method
The coding method, sometimes called the assumed mean method, is a simplified way of
calculating the arithmetic mean. The computational procedure is as follows.
(i) Assume a value within the data set as the mean; this is the assumed mean $\bar{x}_a$.
(ii) Obtain the deviation of each observation from the assumed mean, $d = x - \bar{x}_a$.
(iii) Calculate the mean of the deviations from the assumed mean, $\bar{x}_d = \frac{\sum d}{n}$.
(iv) Calculate the original mean, defined as $\bar{x} = \bar{x}_a + \bar{x}_d$.
Example
Calculate the mean of the following numbers 3, 4, 6, 7 using the assumed mean method
Solution
Let the assumed mean $\bar{x}_a = 3$.
x      d = x - \bar{x}_a
3      0
4      1
6      3
7      4
\[ \bar{x}_d = \frac{\sum d}{n} = \frac{0 + 1 + 3 + 4}{4} = 2.0 \]
But $\bar{x} = \bar{x}_a + \bar{x}_d = 3.0 + 2.0 = 5.0$
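The four steps of the assumed mean method can be sketched as follows. The function name `assumed_mean_method` and its parameter names are assumptions made for illustration. The sketch also shows that the choice of assumed mean does not change the final answer.

```python
def assumed_mean_method(values, assumed_mean):
    """Mean of ungrouped data via deviations from an assumed mean."""
    # Step (ii): deviation of each observation from the assumed mean
    deviations = [x - assumed_mean for x in values]
    # Step (iii): mean of the deviations
    mean_deviation = sum(deviations) / len(deviations)
    # Step (iv): add the mean deviation back to the assumed mean
    return assumed_mean + mean_deviation


data = [3, 4, 6, 7]
print(assumed_mean_method(data, assumed_mean=3))  # 5.0
print(assumed_mean_method(data, assumed_mean=6))  # 5.0
```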
(b) The mean for grouped data
If $x_1, x_2, \dots, x_k$ are data points (or class midpoints) and $f_1, f_2, \dots, f_k$ are the
corresponding frequencies, then
\[ \bar{x} = \frac{f_1 x_1 + f_2 x_2 + \dots + f_k x_k}{f_1 + f_2 + \dots + f_k} = \frac{\sum f x}{\sum f} \]
Example
The table below shows the monthly wage of twenty employees of ABC Ventures Ltd.
Monthly wage (N’000) (x)    No of employees (f)    fx
5                           4                      20
10                          7                      70
15                          3                      45
20                          5                      100
25                          1                      25
Total                       20                     260
Solution
\[ \bar{x} = \frac{\sum f x}{\sum f} = \frac{260}{20} = 13 \]
i.e. N13,000 is the average monthly wage of employees of ABC Ventures Ltd.
Example
The distribution below shows the life of some high powered electric bulbs measured in
hundreds of hours
Class Interval    No of tubes (f)    Class mark (x)    fx
1 – 5             5                  3                 15
6 – 10            15                 8                 120
11 – 15           18                 13                234
16 – 20           20                 18                360
21 – 25           25                 23                575
26 – 30           9                  28                252
31 – 35           5                  33                165
36 – 40           3                  38                114
Total             100                -                 1835
Solution
\[ \bar{x} = \frac{\sum f x}{\sum f} = \frac{1835}{100} = 18.35 \]
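Both grouped-data examples above follow the same pattern: multiply each value (or class mark) by its frequency, add up, and divide by the total frequency. A minimal sketch is given below. The helper names `grouped_mean` and `class_marks` are assumptions made for illustration, and each class mark is taken as the midpoint of the class limits, as in the table.

```python
def class_marks(intervals):
    """Midpoint of each (lower limit, upper limit) class interval."""
    return [(lower + upper) / 2 for lower, upper in intervals]


def grouped_mean(values, frequencies):
    """Return sum(f * x) / sum(f) for values x with frequencies f."""
    if len(values) != len(frequencies):
        raise ValueError("values and frequencies must have the same length")
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    return total_fx / sum(frequencies)


# Monthly wage example (wages in N'000)
wages = [5, 10, 15, 20, 25]
employees = [4, 7, 3, 5, 1]
print(grouped_mean(wages, employees))  # 13.0

# Bulb life example (hundreds of hours)
intervals = [(1, 5), (6, 10), (11, 15), (16, 20),
             (21, 25), (26, 30), (31, 35), (36, 40)]
tubes = [5, 15, 18, 20, 25, 9, 5, 3]
print(grouped_mean(class_marks(intervals), tubes))  # 18.35
```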
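The short-cut method may be used in computing the arithmetic mean. For a simple
frequency distribution,
\[ \bar{x} = \bar{x}_a + \bar{x}_d, \quad \text{where } \bar{x}_d = \frac{\sum f d}{\sum f} \text{ and } d = x - \bar{x}_a \]
For a grouped frequency distribution with a constant factor (i.e. equal class interval C),
\[ \bar{x} = \bar{x}_a + \bar{x}_d, \quad \text{where } \bar{x}_d = \frac{\sum f d_1}{\sum f} \times C \text{ and } d_1 = \frac{x - \bar{x}_a}{C} \]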
Example
Calculate the mean wage of workers shown in the table below using the assumed mean
method
Wage (x)    No of Employees (f)    d = x - \bar{x}_a    fd
5           4                      -10                  -40
10          7                      -5                   -35
15          3                      0                    0
20          5                      5                    25
25          1                      10                   10
Total       20                     -                    -40
Solution
Take $\bar{x}_a = 15$.
\[ \bar{x}_d = \frac{\sum f d}{\sum f} = \frac{-40}{20} = -2 \]
But $\bar{x} = \bar{x}_a + \bar{x}_d = 15 - 2 = 13$
Example
Calculate the mean of the distribution below using the assumed mean method.
Class Interval    No of Tubes (f)    Class Mark (x)    d_1 = (x - \bar{x}_a)/C    f d_1
1 – 5             5                  3                 -4                         -20
6 – 10            15                 8                 -3                         -45
11 – 15           18                 13                -2                         -36
16 – 20           20                 18                -1                         -20
21 – 25           25                 23                0                          0
26 – 30           9                  28                1                          9
31 – 35           5                  33                2                          10
36 – 40           3                  38                3                          9
Total             100                -                 -                          -93
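Take $\bar{x}_a = 23$, $C = 5$.
\[ \bar{x}_d = \frac{\sum f d_1}{\sum f} \times C = \frac{-93}{100} \times 5 = -4.65 \]
\[ \bar{x} = \bar{x}_a + \bar{x}_d = 23 - 4.65 = 18.35 \]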
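The coded version of the assumed mean method for a grouped distribution with equal class width can be sketched as below. The function name `coded_mean` and its parameters are assumptions made for illustration. The sketch reproduces the value 18.35 and shows that a different assumed mean leads to the same result.

```python
def coded_mean(class_marks, frequencies, assumed_mean, class_width):
    """Grouped mean using coded deviations d1 = (x - assumed_mean) / class_width."""
    coded = [(x - assumed_mean) / class_width for x in class_marks]
    total_fd1 = sum(f * d1 for f, d1 in zip(frequencies, coded))
    # Mean deviation, scaled back to the original units by the class width
    mean_deviation = total_fd1 / sum(frequencies) * class_width
    return assumed_mean + mean_deviation


marks = [3, 8, 13, 18, 23, 28, 33, 38]
tubes = [5, 15, 18, 20, 25, 9, 5, 3]

print(f"{coded_mean(marks, tubes, assumed_mean=23, class_width=5):.2f}")  # 18.35
print(f"{coded_mean(marks, tubes, assumed_mean=18, class_width=5):.2f}")  # 18.35
```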
Advantages of the arithmetic Mean
(i) It is simple to understand and compute
(ii) It is fully representative since it considers all items observed.
(iii) It can be measured with mathematical exactness. This makes it applicable in
advanced statistical analysis.
Disadvantages of the arithmetic Mean
(i) Extreme values affect its result.
(ii) It may not be a physically possible value corresponding to the variable.
(iii) Computational complications may arise for unbounded classes.
(iv) No graphical method can be used to estimate its value.
(v) It is meaningless for qualitatively classified (categorical) data.
4.2 THE GEOMETRIC MEAN (G.M)
The G.M is an analytical method of finding the average rate of growth or decline in the
values of an item over a particular period of time. The geometric mean of a set of positive
numbers $x_1, x_2, \dots, x_n$ is the nth root of the product of the numbers.
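Thus
\[ G.M = \sqrt[n]{x_1 \times x_2 \times \dots \times x_n} \]
If $f_i$ is the frequency of $x_i$ ($i = 1, 2, \dots, k$), then
\[ G.M = \sqrt[n]{x_1^{f_1} \times x_2^{f_2} \times \dots \times x_k^{f_k}}, \quad \text{where } n = \sum f_i \]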
Example: If ages of four pupils are 3, 4, 6 and 7 years, determine the geometric mean of
the ages.
\[ G.M = \sqrt[4]{3 \times 4 \times 6 \times 7} \]
\[ G.M = \sqrt[4]{504} \]
\[ G.M = 4.738137221 \approx 4.7 \]
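A minimal sketch of both forms of the geometric mean follows. The function names are assumptions made for illustration. The frequency version works with logarithms, which is algebraically the same as taking the nth root of the product of powers and avoids very large intermediate products.

```python
import math


def geometric_mean(values):
    """nth root of the product of n positive values."""
    if not values or any(x <= 0 for x in values):
        raise ValueError("the geometric mean needs a non-empty set of positive values")
    return math.prod(values) ** (1 / len(values))


def geometric_mean_with_frequencies(values, frequencies):
    """Geometric mean when value x_i occurs f_i times; n is the total frequency."""
    if any(x <= 0 for x in values):
        raise ValueError("all values must be positive")
    n = sum(frequencies)
    log_sum = sum(f * math.log(x) for x, f in zip(values, frequencies))
    return math.exp(log_sum / n)


# Ages of the four pupils from the example above
print(round(geometric_mean([3, 4, 6, 7]), 4))  # 4.7381
```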
4.3 THE HARMONIC MEAN (H.M)
The H.M of a set of numbers 𝑥1 , 𝑥2 , … , 𝑥𝑛 is the reciprocal of the arithmetic mean of the
reciprocals of the numbers. It is used when dealing with the rates of the type 𝑥 per 𝑑
(such as kilometers per hour, Naira per liter).
The formula is expressed thus:
\[ H.M = \frac{1}{\frac{1}{n} \sum_{i=1}^{n} \frac{1}{x_i}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} \]
If x has frequency f, then
\[ H.M = \frac{n}{\sum \frac{f}{x}}, \quad \text{where } n = \sum f \]
Example: Find the harmonic mean of 3, 4, 6, 7.
Solution:
\[ H.M = \frac{4}{\frac{1}{3} + \frac{1}{4} + \frac{1}{6} + \frac{1}{7}} = \frac{4}{\frac{25}{28}} = 4\frac{12}{25} = 4.48 \]
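The worked example can be checked with exact fractions, as in the sketch below. The function name `harmonic_mean` is an assumption made for illustration, and the two speeds in the second call are hypothetical figures chosen only to show a "per unit" rate.

```python
from fractions import Fraction


def harmonic_mean(values):
    """n divided by the sum of reciprocals, computed with exact fractions."""
    if not values:
        raise ValueError("the harmonic mean of an empty data set is undefined")
    if any(x == 0 for x in values):
        raise ValueError("the harmonic mean is undefined when an observation is 0")
    reciprocal_sum = sum(1 / Fraction(x) for x in values)
    return len(values) / reciprocal_sum


hm = harmonic_mean([3, 4, 6, 7])
print(hm)         # 112/25
print(float(hm))  # 4.48

# Hypothetical rates: the same distance covered at 40 km/h and then at 60 km/h
print(harmonic_mean([40, 60]))  # 48
```

The second call illustrates why the harmonic mean suits rates: the average speed over two equal distances is 48 km/h, not the arithmetic mean of 50 km/h.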
Note:
(i) The calculation takes every value into account.
(ii) Extremely large values affect it less than they affect the arithmetic mean.
(iii) The formula breaks down when 0 is one of the observations, since 1/0 is undefined.
Relation between the Arithmetic, Geometric and Harmonic Means
For any set of positive data, the geometric mean is always less than or equal to the
corresponding arithmetic mean but greater than or equal to the harmonic mean.
That is, $H.M \le G.M \le A.M$
For the values 3, 4, 6, 7 used in the examples above, 4.48 ≤ 4.74 ≤ 5.
The equality signs hold only if all the observations are identical.
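The inequality can be checked numerically with Python's standard `statistics` module, which provides `harmonic_mean`, `geometric_mean` and `fmean`. The wrapper name `three_means` is an assumption made for this sketch. Equality for identical observations is tested with a tolerance because the geometric mean is computed in floating point.

```python
import math
import statistics


def three_means(values):
    """Return (harmonic, geometric, arithmetic) means of positive values."""
    hm = statistics.harmonic_mean(values)
    gm = statistics.geometric_mean(values)
    am = statistics.fmean(values)
    return hm, gm, am


hm, gm, am = three_means([3, 4, 6, 7])
print(f"{hm:.2f} <= {gm:.2f} <= {am:.2f}")  # 4.48 <= 4.74 <= 5.00

# With identical observations the three means coincide
same = three_means([5, 5, 5])
print(all(math.isclose(m, 5) for m in same))  # True
```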
4.4 The Trimmed Mean
The trimmed mean is a family of measures of central tendency. The α% - trimmed mean of N
values $x_1, x_2, \dots, x_N$ is computed by sorting all the N values, discarding the smallest α% and
the largest α% of the values, and computing the mean of the remaining values.
For example, to calculate the 20% - trimmed mean for a set of N = 5 values (32, 10, 8, 9, 11), the
following steps are helpful.
Step 1. Sort the values: 8, 9, 10, 11, 32.
Step 2. Discard the largest 20% of the values, i.e. one value (20% of 5), namely 32; discard the
smallest 20% of the values, i.e. one value, namely 8. This leaves a set of three values: (9, 10, 11).
Step 3. Compute the mean of the three remaining values (9, 10, 11), which is 10.
Thus the 20% - trimmed mean of the 5 values (32, 10, 8, 9, 11) is 10.
The arithmetic mean of the same N = 5 values (32, 10, 8, 9, 11) is 14.
In contrast to the arithmetic mean, the trimmed mean is a robust measure of central tendency. For
example, a small fraction of anomalous measurements with abnormally large deviations from the
centre may change the mean value substantially, whereas the trimmed mean is stable with
respect to the presence of such abnormal extreme values, which get trimmed away.
For example, in the set of 5 values discussed above, replace one value by a large number, say, "32"
by "1000". Then compute the mean of the 5 values, and the 20% - trimmed mean. The replacement
does not affect the trimmed mean (because the extreme value is discarded in step 2), but it changes
the mean significantly, from 14 to 207.6.
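The three steps and the robustness check can be sketched as follows. The function name `trimmed_mean` is an assumption made for illustration. When α% of N is not a whole number, this sketch rounds the number of discarded values down, which is only one of several conventions in use.

```python
def trimmed_mean(values, percent):
    """Mean after discarding the smallest and largest `percent` % of the values."""
    if not 0 <= percent < 50:
        raise ValueError("percent must satisfy 0 <= percent < 50")
    ordered = sorted(values)                        # Step 1: sort
    k = int(len(ordered) * percent / 100)           # values to drop at each end (rounded down)
    kept = ordered[k:len(ordered) - k]              # Step 2: discard the extremes
    return sum(kept) / len(kept)                    # Step 3: mean of what remains


data = [32, 10, 8, 9, 11]
print(sum(data) / len(data))     # 14.0
print(trimmed_mean(data, 20))    # 10.0

# Replace 32 by 1000: the mean jumps, the trimmed mean does not move
outlier = [1000, 10, 8, 9, 11]
print(sum(outlier) / len(outlier))  # 207.6
print(trimmed_mean(outlier, 20))    # 10.0
```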
The trimmed mean, as a family of measures, includes the arithmetic mean and the median as its
extreme cases. The trimmed mean with the minimal degree of trimming (α = 0%) coincides
with the mean; the trimmed mean with the maximal degree of trimming (α = 50%) coincides with
the median.
One popular example of the trimmed mean is judges' scores in gymnastics, where the extreme scores
are discarded. Unless the underlying distribution is symmetric, the truncated mean of a sample is
unlikely to produce an unbiased estimator for either the mean or the median.
Examples
The scoring method used in many sports that are evaluated by a panel of judges is a truncated
mean: discard the lowest and highest scores, then calculate the mean value of the remaining scores.
The interquartile mean is another example, in which the lowest 25% and the highest 25% are discarded
and the mean of the remaining scores is calculated.
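Both examples amount to dropping a fixed number of values from each end before averaging, as sketched below. All function names and the sample figures are hypothetical and chosen for illustration only. The interquartile mean here is restricted to sample sizes divisible by 4, so that exactly a quarter of the values can be removed from each end.

```python
def drop_extremes_mean(values, k):
    """Mean after discarding the k smallest and the k largest values."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must leave at least one value")
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def judged_score(scores):
    """Discard the single lowest and single highest score, then average."""
    return drop_extremes_mean(scores, 1)


def interquartile_mean(values):
    """Discard the lowest 25% and the highest 25%, then average."""
    if len(values) % 4 != 0:
        raise ValueError("this sketch needs a sample size divisible by 4")
    return drop_extremes_mean(values, len(values) // 4)


# Hypothetical scores from five judges
print(judged_score([8.0, 9.0, 9.5, 8.5, 6.0]))            # 8.5

# Hypothetical sample of eight values with one outlier
print(interquartile_mean([2, 4, 5, 5, 6, 7, 8, 40]))       # 5.75
```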
Assignments
1. The distribution below shows the life-hours of some high powered electric bulbs
measured in hundreds of hours. Compute the mean.
Class Interval    No of tubes (f)
1 – 5             5
6 – 10            15
11 – 15           18
16 – 20           25
21 – 25           25
26 – 30           9
31 – 35           15
36 – 40           3
Total             115
2. The numbers of cars crossing a certain bridge in a big city in successive five-minute intervals
were recorded as follows: 20, 15, 16, 30, 20, 20, 12, 9, 18, 15.
Calculate the arithmetic mean and the trimmed mean. Comment on your results.
3. The following data represent the ages (in years) of people living in a housing estate in
Abeokuta.
18, 31, 30, 6, 16, 17, 18, 43, 2, 8, 32, 33, 9, 18, 33, 19, 21, 13, 13, 14, 14, 6, 52, 45, 61, 23, 26, 15,
14, 15, 14, 27, 36, 19, 37, 11, 12, 11, 20, 12, 39, 20, 40, 69, 63, 29, 64, 27, 15, 28.
(i) Present the above data in a frequency table showing the following columns, in that order:
class interval, class boundary, class mark (mid-point), tally, frequency and cumulative
frequency.
(ii) Calculate the mean of the distribution using the assumed mean method (let $\bar{x}_a = 36.5$).