Probability Unit | PDF | Normal Distribution | Probability Distribution

Probability Unit

This chapter introduces the normal distribution, focusing on its properties and applications in modeling continuous random variables. Key concepts include the characteristics of normal curves, the relationship between mean, median, and mode, and the use of probability density functions (PDFs) to represent data distributions. The chapter also discusses the approximation of binomial distributions using normal distributions with continuity correction.

Uploaded by

eprimoch
Copyright
© All Rights Reserved

Chapter 8
The normal distribution
In this chapter you will learn how to:

■ sketch normal curves to illustrate distributions or probabilities


■ use a normal distribution to model a continuous random variable and use normal
distribution tables
■ solve problems concerning a normally distributed variable
■ recognise conditions under which the normal distribution can be used as an approximation to
the binomial distribution, and use this approximation, with a continuity correction, in solving
problems.
Cambridge International AS & A Level Mathematics: Probability & Statistics 1

PREREQUISITE KNOWLEDGE

Where it comes from: Chapter 7

What you should be able to do: Find and calculate with the expectation and variance of a
binomial distribution.

Check your skills:
1 Given X ~ B(45, 0.52), find E(X) and Var(X).
2 Given that X follows a binomial distribution with E(X) = 11.2 and Var(X) = 7.28, find the
parameters of the distribution of X.

Why are errors quite normal?


If you study any of the sciences, you will be required at some time to measure a quantity
as part of an experiment. That quantity could be a measurement of time, mass, distance,
volume and so on. Whatever it is, any measurement you make of a continuous quantity
such as these will be subject to error. The very nature of continuous quantities means that
they cannot be measured precisely and, no matter how hard we try, inaccuracy is also
likely because our tools lack perfect calibration and we, as human beings, add in a certain
amount of unreliability.

However, small errors are more likely than large errors and our measurements are usually
just as likely to be underestimates as overestimates. When repeated measurements are
taken, errors are likely to cancel each other out, so the average error is close to zero and the
average of the measurements is virtually error-free.

This chapter serves as an introduction to the idea of a continuous random variable and the
method used to display its probability distribution. We will later focus our attention on one
particular type of continuous random variable, namely a normal random variable.

The normal distribution is closely associated with the German mathematician Carl Friedrich
Gauss, who used it in the early 19th century in his research into the measurement errors made
in astronomical observations. Some key properties of the normal distribution are that
values close to the average are most likely; the further values are from the average, the less
likely they are to occur, and the distribution is symmetrical about the average.

8.1 Continuous random variables


A continuous random variable is a quantity that is liable to change and whose infinite
number of possible values are the numerical outcomes of a random phenomenon.
Examples include the amount of sugar in an orange, the time required to run a marathon,
measurements of height and temperature and so on. A continuous random variable is not
defined for specific values. Instead, it is defined over an interval of values.

Consider the mass of an apple, denoted by X grams. Within the range of possible masses,
X can take any value, such as 111.2233…, or 137.8642…, or 145.2897…, or …. The
probability that X takes a particular value is necessarily equal to 0, since the number of
values that it can take is infinite. However, a chosen interval, such as 130 < X < 140,
contains a whole range of possible values, so a probability for each and every interval can
be found.

The probability distribution of a discrete random variable shows its specific values and
their probabilities, as we saw in Chapters 6 and 7.

The probability distribution of a continuous random variable shows its range of values and
the probabilities for intervals within that range.

● When X is a discrete random variable, we can represent P(X = r).
● When X is a continuous random variable, we can represent P(a ≤ X ≤ b).

Before looking at probability distributions for continuous random variables in detail, we
will consider how we can represent the probability distribution of a set of collected or
observed continuous data.

Representation of a probability distribution


A set of continuous data can be illustrated in a histogram, where column areas are
proportional to frequencies. To illustrate the probability distribution of a set of data, we
draw a graph that is based on the shape of a histogram, as we now describe.

REWIND
We saw how to display continuous data in a histogram in Chapter 1, Section 1.3.

If we change the frequency density values on the vertical axis to relative frequency density
values (relative frequency density = relative frequency ÷ class width) then column areas
will represent relative frequencies, which are estimates of probabilities. The vertical axis of
the diagram can now be labelled ‘probability density’.

For equal-width class intervals, the process described above has no effect on the ‘shape’ of
the diagram. The result is that the total area of the columns changes from ‘Σf’ to 1, which
is the sum of the probabilities of all the possible values.

So we can draw a curved graph over the columns of an equal-width interval histogram
(preferably one displaying large amounts of data with many classes) to model the
probability distribution of a set of continuous data.
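
The conversion from frequencies to probability densities can be sketched in a few lines of Python. The class boundaries and frequencies below are illustrative values assumed for this sketch, and the function name is hypothetical; the point is only that each column's area becomes a relative frequency, so the areas sum to 1.

```python
# Illustrative grouped data (assumed for this sketch): seven classes of width 3.
class_edges = [3, 6, 9, 12, 15, 18, 21, 24]
frequencies = [3, 9, 18, 24, 18, 9, 3]


def relative_frequency_densities(edges, freqs):
    """Return (relative frequency / class width) for each class."""
    total = sum(freqs)
    return [
        (f / total) / (upper - lower)
        for lower, upper, f in zip(edges, edges[1:], freqs)
    ]


densities = relative_frequency_densities(class_edges, frequencies)
widths = [upper - lower for lower, upper in zip(class_edges, class_edges[1:])]

# Column area = density x width = relative frequency, so the areas sum to 1.
total_area = sum(d * w for d, w in zip(densities, widths))

for lower, upper, d in zip(class_edges, class_edges[1:], densities):
    print(f"{lower} <= x < {upper}: probability density = {d:.4f}")
print(f"total area = {total_area:.4f}")
```

Because the classes have equal width, dividing by the width rescales every column by the same factor, which is why the shape of the diagram is unchanged.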

In the case of a random variable, such a curved graph represents a function, y = f(x),
and is called a probability density function, abbreviated to PDF or pdf. The area under the
graph of the PDF is also equal to 1.

TIP
The word function should only be used when referring to a random variable. For data, we
should rather use curve and/or graph.

A curved graph is sketched over each of the histograms in the diagram below.

[Figure: two histograms, each with a curved graph y = f(x) sketched over it. Vertical axis:
probability density; horizontal axis: x. Area under each curve = 1.]

If you were asked to describe these two curves, you may be tempted to say that the curve
on the right is ‘a bit odd’ and that the curve on the left is ‘a bit more normal’… and you
would be quite right in doing so, as you will see shortly.

Three commonly occurring types of curved graph are shown in the diagrams below.

[Figure: three curved graphs. Negatively skewed: longer tail to the left. Symmetric: even
tails. Positively skewed: longer tail to the right.]

TIP
The mode is located at the graph’s peak. The median is at the value where the area under
the graph is divided into two equal parts; this value can be found by calculation from the
histogram or estimated from a cumulative frequency graph.

EXPLORE 8.1

Three frequency distributions are shown in the tables below.

Use a histogram to sketch a graph representing the probability distribution for each
of w, x and y.

w   3 ≤ w < 6   6 ≤ w < 9   9 ≤ w < 12   12 ≤ w < 15   15 ≤ w < 18   18 ≤ w < 21   21 ≤ w < 24
f      13          13           13            13            13            13            13

x   3 ≤ x < 6   6 ≤ x < 9   9 ≤ x < 12   12 ≤ x < 15   15 ≤ x < 18   18 ≤ x < 21   21 ≤ x < 24
f       3           9           18            24            18             9             3

y   3 ≤ y < 6   6 ≤ y < 9   9 ≤ y < 12   12 ≤ y < 15   15 ≤ y < 18   18 ≤ y < 21   21 ≤ y < 24
f       8          19           10             4            10            19             8

Discuss and describe the shapes of the three graphs. What feature do they have in
common?

Compare the measures of central tendency (averages) for w, x and y.

The normal curve


The frequency distribution of x in the Explore 8.1 activity produces a special type of
curved graph. It is a symmetric, bell-shaped curve, known as a normal curve.

If a probability distribution is represented by a normal curve, then:

● Mean = median = mode


● The peak of the curve is at the mean ( µ), and this is where we find the curve’s line of
symmetry
● Probability density decreases as we move away from the mean on both sides, so the
further the values are from the mean, the less likely they are to occur
● An increase in the standard deviation (σ ) means that values become more spread out
from the mean. This results in the curve’s width increasing and its height decreasing, so
that the area under the graph is kept at a constant value of 1.
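
The effect of the standard deviation on the height of the curve can be checked numerically. The sketch below assumes the standard formula for the normal PDF (stated in Section 8.2) and uses illustrative values of µ and σ; the function name is hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Probability density of a normal variable with mean mu and sd sigma at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


mu = 0.0  # illustrative mean
peaks = {}
for sigma in (0.5, 1.0, 2.0):  # illustrative standard deviations
    # The peak of the curve is at x = mu.
    peaks[sigma] = normal_pdf(mu, mu, sigma)
    print(f"sigma = {sigma}: peak height = {peaks[sigma]:.4f}")

# Symmetry about the mean: equal distances either side give equal densities.
left = normal_pdf(mu - 1.0, mu, 1.0)
right = normal_pdf(mu + 1.0, mu, 1.0)
```

Doubling σ halves the peak height, which is consistent with the curve becoming wider while the area under it stays equal to 1.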

Graphs that represent probability distributions of related sets of data, such as the heights
of the boys and the heights of the girls at your school, can be represented on the same
diagram, so that comparisons can be made.

The following diagram shows two pairs of normal curves with their means and standard
deviations compared. Note that the areas under the graphs in each pair are equal.

[Figure: two pairs of normal curves, each with vertical axis ‘Probability density’.
Left pair: A and B, with µA = µB and σA > σB. Right pair: X and Y, with µX < µY and σX = σY.]

As we can see, A and B have the same mean, but the shapes of the normal curves are
different because they do not have the same standard deviation. Curve B is obtained from
curve A by stretching it both vertically (from the horizontal axis) and horizontally (from
the line of symmetry).

X and Y have identically-shaped normal curves because they have the same standard
deviation, but their positions or locations are different because they have different means.
Each curve can be obtained from the other by a horizontal translation.

EXPLORE 8.2

You can investigate the effect of altering the mean and/or standard deviation on
the location and shape of a normal curve by visiting the Density Curve of Normal
Distribution resource on the GeoGebra website.

Note that the area under the curve is always equal to 1, whatever the values of µ and σ .

EXERCISE 8A

1 The probability distributions for A and B are represented in the diagram.

[Figure: two normal curves labelled A and B. Vertical axis: probability density.]

Indicate whether each of the following statements is true or false.
a µA > µB
b σA < σB
c A and B have the same range of values.
d σA² = σB²
e At least half of the values in B are greater than µA.
f At most half of the values in A are less than µB.

2 The diagram shows normal curves for the probability distributions of P and Q, which each
contain n values.

[Figure: two normal curves labelled P and Q. Vertical axis: probability density.]

a Write down a statement comparing:
i σP and σQ
ii the median value for P and the median value for Q
iii the interquartile range for P and the interquartile range for Q.
b The datasets P and Q are merged to form a new dataset denoted by W.
i Describe the range of W.
ii Is the probability distribution for W a normal curve? Explain your answer.
iii Copy the diagram above and sketch onto it a curved graph representing the probability
distribution for W. Mark the relative positions of µP, µQ and µW along the horizontal axis.

3 The distributions of the heights of 1000 women and of 1000 men both produce normal curves,
as shown. The mean height of the women is 160 cm and the mean height of the men is 180 cm.

[Figure: two normal curves labelled ‘women’ (centred at 160) and ‘men’ (centred at 180).
Horizontal axis: Height (cm); vertical axis: probability density.]

The heights of these women and men are now combined to form a new set of data. Assuming
that the combined heights also produce a normal curve, copy the graph opposite and sketch
onto it the curve for the combined heights of the 2000 women and men.

4 Probability distributions for the quantity of apple juice in 500 apple juice tins and for
the quantity of peach juice in 500 peach juice tins are both represented by normal curves.

[Figure: a normal curve labelled ‘apple juice’, centred at 340. Horizontal axis: Volume (ml);
vertical axis: probability density.]

The mean quantity of apple juice is 340 ml with variance 4 ml², and the mean quantity of
peach juice is 340 ml with standard deviation 4 ml.
a Copy the diagram and sketch onto it the normal curve for the quantity of peach juice in
the peach juice tins.
b Describe the curves’ differences and similarities.

5 The masses of 444 newborn babies in the USA and 888 newborn babies in the UK both produce
normal curves. For the USA babies, µ = 3.4 kg and σ = 200 g; for the UK babies, µ = 3.3 kg and
σ² = 36 100 g².
a On a single diagram, sketch and label these two normal curves.
b Describe the curves’ differences and similarities.

6 The values in two datasets, whose probability distributions are both normal curves, are
summarised by the following totals:
Σx² = 35 000, Σx = 12 000 and n = 5000.
Σy² = 72 000, Σy = 26 000 and n = 10 000.
a Show that the centre of the curve for y is located to the right of the curve for x.
b On the same diagram, sketch a normal curve for each dataset.

8.2 The normal distribution


In Section 8.1, we saw how a curved graph can be used to represent the probability
distribution of a set of continuous data. A curved graph that represents the probability
distribution of a continuous random variable, as stated previously, is called a probability
density function or PDF.

If we collect data on, say, the masses of a randomly selected sample of 1000 pineapples, we
can produce a curved graph to illustrate the probabilities for the full and limited range of
these masses. If there are no pineapples with masses under 0.2 kg or over 6 kg, then our
graph will indicate that P(mass < 0.2) = 0 and P(mass > 6) = 0.

However, the continuous random variable ‘the possible mass of a pineapple’ is a theoretical
model for the probability distribution. In the model, masses of less than 0.2 kg and
masses of more than 6 kg would be shown to be extremely unlikely, but not impossible.
The continuous random variable would, therefore, indicate that P(mass < 0.2) > 0 and
P(mass > 6) > 0.

[Incidentally, the greatest ever recorded mass of a pineapple is 8.28 kg!]

The probability distribution of a continuous random variable is a mathematical function
that provides a method of determining probabilities for the occurrence of different
outcomes or observations.

If the random variable X is normally distributed with mean µ and variance σ², then its
equation is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\}, \quad \text{for all real values of } x.$$

TIP
exp{ } means the number e = 2.71828…, raised to the power in the bracket, and e^p > 0
for any power p.

The parameters that define a normally distributed random variable are its mean µ and its
variance σ².

To describe the normally distributed random variable X, we write X ~ N(µ, σ²).

KEY POINT 8.1
X ~ N(µ, σ²) describes a normally distributed random variable.
We read this as ‘X has a normal distribution with mean µ and variance σ²’.

The probability that X takes a value between a and b is equal to the area under the curve
between the x-axis and the boundary lines x = a and x = b.

The area under the graph of y = f(x) can be found by integration:

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

TIP
The area under any part of the curve is the same, whether or not the boundary values are
included. P(3 ≤ X ≤ 7), P(3 ≤ X < 7), P(3 < X ≤ 7) and P(3 < X < 7) are
indistinguishable.

Unfortunately, this integral cannot be written down exactly in terms of elementary functions
but, as we will see later, mathematicians have found ways to handle this challenge.
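
One way of handling the integral is numerical approximation. The following is a minimal sketch, not the method used to compile the tables: it applies the trapezium rule to the normal PDF. The values µ = 10 and σ = 2, the number of strips and the function names are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Probability density of a normal variable with mean mu and sd sigma at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def area_under_pdf(a, b, mu, sigma, strips=10_000):
    """Approximate P(a <= X <= b) using the trapezium rule with equal strips."""
    h = (b - a) / strips
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, strips):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h


mu, sigma = 10.0, 2.0  # illustrative parameters

# Area within one standard deviation of the mean.
within_one_sd = area_under_pdf(mu - sigma, mu + sigma, mu, sigma)

# Area over a very wide interval, which should be very close to 1.
almost_all = area_under_pdf(mu - 8 * sigma, mu + 8 * sigma, mu, sigma)

print(f"P(mu - sigma <= X <= mu + sigma) is approximately {within_one_sd:.4f}")
print(f"area over a very wide interval is approximately {almost_all:.6f}")
```

Increasing the number of strips improves the approximation, at the cost of more function evaluations.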

Normal distributions have many interesting properties, some of which are detailed in the
following table.

Properties                                                                         Probabilities
Half of the values are less than the mean.                                         P(X < µ) = P(X ≤ µ) = 0.5
Half of the values are greater than the mean.                                      P(X > µ) = P(X ≥ µ) = 0.5
Approximately 68.26% of the values lie within 1 standard deviation of the mean.    P(µ – σ ≤ X ≤ µ + σ) = 0.6826
Approximately 95.44% of the values lie within 2 standard deviations of the mean.   P(µ – 2σ ≤ X ≤ µ + 2σ) = 0.9544
Approximately 99.72% of the values lie within 3 standard deviations of the mean.   P(µ – 3σ ≤ X ≤ µ + 3σ) = 0.9972

TIP
The probability that the values in a normal distribution lie within a certain number of
standard deviations of the mean is fixed.
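
These probabilities can be reproduced without tables. The sketch below assumes the standard identity Φ(z) = ½[1 + erf(z/√2)], which links the normal distribution function to the error function in Python's math module; the function name is hypothetical. Computed values may differ from the tabulated ones in the fourth decimal place, because the tables are rounded to four figures.

```python
import math


def phi_cdf(z):
    """P(Z <= z) for a standard normal variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Probability of lying within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: phi_cdf(k) - phi_cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} standard deviation(s): {p:.4f}")

# Symmetry gives a one-sided probability from a two-sided one.
one_sided = 0.5 * within[1] + 0.5  # P(X <= mu + sigma)
print(f"P(X <= mu + sigma) = {one_sided:.4f}")
```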

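In the following diagrams, the values 0, ±1 and ±2 represent numbers of standard
deviations from the mean.

[Figure: four normal curves with shaded areas. Area 0.6826 between µ – σ and µ + σ (–1 to 1);
area 0.8413 to the left of µ + σ (1); area 0.9544 between µ – 2σ and µ + 2σ (–2 to 2);
area 0.9772 to the right of µ – 2σ (–2).]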

We can use the curve’s symmetry, along with the table and diagrams above, to find other
probabilities, such as:

We know that P(–1 ≤ X ≤ 1) = 0.6826, so

$$P(X \le 1) = \left(\tfrac{1}{2} \times 0.6826\right) + 0.5 = 0.8413 = P(X \ge -1).$$

We know that P(–2 ≤ X ≤ 2) = 0.9544, so

$$P(X \le 2) = \left(\tfrac{1}{2} \times 0.9544\right) + 0.5 = 0.9772 = P(X \ge -2).$$

EXPLORE 8.3

Calculated estimates of the mean and variance of the continuous random variables
A, B and C are given in the following table.

            A     B     C
Mean        40    72    123
Variance    64    144   121

Random observations from each distribution were made with the following results:

For A: 8060 out of 13 120 observations lie in the interval from 32 to 48.

For B: 8475 out of 12 420 observations lie in the interval from 60 to 84.

For C: 8013 out of 10 974 observations lie in the interval from 112 to 134.

Investigate this information (using the previous table showing properties and
probabilities for normal distributions) and comment on the statement ‘The
distributions of A, B and C are all normal’.

The standard normal variable Z

There are clearly an infinite number of values for the parameters of a normally distributed
random variable. Nevertheless, most problems can be solved by transforming the random
variable into a standard normal variable, which is denoted by Z, and which has mean 0 and
variance 1.

FAST FORWARD
Later in this section, we will see how any normal variable can be transformed to the
standard normal variable by coding.

By substituting µ = 0 and σ² = 1 into the equation for the normal distribution PDF, we
can find the equation of the PDF for Z ~ N(0, 1). This is denoted by φ(z) and its equation is

$$\phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{z^2}{2} \right\}.$$

The graph of y = φ(z) is shown below.

KEY POINT 8.2
The standard normal variable is Z ~ N(0, 1).

TIP
φ and Φ are the lower and upper-case Greek letter phi.

[Figure: graph of y = φ(z) for z from –3 to 3, with 0.2 and 0.4 marked on the vertical axis.]

The mean of Z is 0.

The axis of symmetry is a vertical line through the mean, as with every normal
distribution.

Z has a variance of 1 and, therefore, a standard deviation of 1.

z = ±1, ±2 and ±3 represent values that are 1, 2 and 3 standard deviations above or below
the mean.

Any z < 0 represents a value that is less than the mean.

Any z > 0 represents a value that is greater than the mean.

For z > 3 and for z < −3, φ( z ) ≈ 0.

The area under the graph of y = φ( z ) is equal to 1.
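
Several of these statements about φ(z) can be checked directly from its equation. The sketch below evaluates φ at a few illustrative values of z; the function name is hypothetical.

```python
import math


def standard_normal_pdf(z):
    """phi(z): the probability density of Z ~ N(0, 1) at z."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


# Heights of the curve at the mean and at 1, 2, 3 and 4 standard deviations above it.
heights = {z: standard_normal_pdf(z) for z in (0, 1, 2, 3, 4)}
for z, height in heights.items():
    print(f"phi({z}) = {height:.4f}")

# The curve is symmetric about z = 0.
symmetric = math.isclose(standard_normal_pdf(-1.5), standard_normal_pdf(1.5))
print(f"phi(-1.5) equals phi(1.5): {symmetric}")
```

The peak at z = 0 is just below the 0.4 mark on the vertical axis, and the heights for z ≥ 3 are small, in line with φ(z) ≈ 0 for z > 3 and z < −3.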

A vertical line drawn at any value of Z divides the area under the curve into two parts: one
representing P(Z ≤ z) and the other representing P(Z > z).

The value of P(Z ≤ z) is denoted by Φ(z) and, as mentioned earlier, we do not find such
values by integration. Tables showing the value of Φ(z) for different values of z have been
compiled and appear in the Standard normal distribution function table at the end of the
book. In addition, some modern calculators are able to give the value of Φ(z) and the
inverse function Φ⁻¹(z) directly.

Although only zero and positive values of Z (i.e. z ≥ 0) appear in the tables, the graph’s
symmetry allows us to use the tables for positive and for negative values of z, as you will
see after Worked example 8.2.

Values of the standard normal variable appear as 4-figure numbers from z = 0.000 to
z = 2.999 in the tables. The first and second figures of z appear in the left-hand column;
the third and fourth figures appear in the top row. The numbers in the ‘ADD’ column for
the fourth figures indicate what we should add to the value of Φ(z) in the body of the table.

Φ(z) can be found for any given value of z, and z can be found for any given value of Φ(z)
by using the tables in reverse (as shown in Worked example 8.4). In the critical values table,
values for Φ(z) are denoted by p.

TIP
Critical values refer to probabilities of 75%, 90%, 95%, … and their complements 25%, 10%,
5%, … and so on.

A section of the tables, from which we will find the value of Φ(0.274), is shown below.

First and second figures (z) | Third figure (0 to 9) | Fourth figure (ADD, 1 to 9)

z     0       1       2       3       4       5       6       7       8       9        1   2   3   4   5   6   7   8   9
0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359   4   8   12  16  20  24  28  32  36
0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753   4   8   12  16  20  24  28  32  36
0.2   0.5793  0.5832  0.5871  0.5910  0.5949  0.5987  0.6026  0.6064  0.6103  0.6141   4   8   12  15  19  23  27  31  35
0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517   4   7   11  15  19  22  26  30  34

We locate the first and second figures of z (namely 0.2) in the left-hand column.
We then locate the third figure of z (namely 7) along the top row… this tells us that
Φ(0.27) = 0.6064.

Next we locate the fourth figure of z (namely 4) at the top-right. In line with 0.6064, we see
‘ADD 15’, which means that we must add 15 to the last two figures of 0.6064 to obtain the
value of Φ(0.274).

Φ(0.274) = 0.6064 + 0.0015 = 0.6079

TIP
Φ(0.274) = 0.6079 can be expressed using inverse notation as Φ⁻¹(0.6079) = 0.274.
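
The look-up procedure can be mimicked in code. The sketch below hard-codes only the z = 0.2 row of the section of the tables quoted above; the names `ROW_0_2`, `ADD_0_2` and `lookup_0_2` are hypothetical. It compares the tabulated result with the value given by Python's `statistics.NormalDist`, which is not limited to four figures.

```python
from statistics import NormalDist

# Row z = 0.2 of the quoted table section: entries for third figures 0 to 9.
ROW_0_2 = [0.5793, 0.5832, 0.5871, 0.5910, 0.5949,
           0.5987, 0.6026, 0.6064, 0.6103, 0.6141]
# ADD values (in units of 0.0001) for fourth figures 1 to 9 in the same row.
ADD_0_2 = [4, 8, 12, 15, 19, 23, 27, 31, 35]


def lookup_0_2(third_figure, fourth_figure):
    """Read Phi(0.2xy) from the table row, where x and y are the 3rd and 4th figures."""
    base = ROW_0_2[third_figure]
    add = 0 if fourth_figure == 0 else ADD_0_2[fourth_figure - 1]
    return round(base + add / 10_000, 4)


table_value = lookup_0_2(7, 4)            # Phi(0.274) read from the table
computed_value = NormalDist().cdf(0.274)  # Phi(0.274) computed directly

print(f"table value:    {table_value:.4f}")
print(f"computed value: {computed_value:.5f}")
```

The two values should agree to about three decimal places; any small difference in the fourth place reflects the rounding built into the ADD columns.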

WORKED EXAMPLE 8.1

Given that Z ~ N(0, 1), find P(Z < 1.23) and P(Z ≥ 1.23).

Answer

Φ(1.23) = 0.8907 is the area to the left of z = 1.23.

1 – Φ(1.23) = 1 – 0.8907 = 0.1093 is the area to the right of z = 1.23, as shown in
the graphs.

[Figure: two normal curves. The first shows the area 0.8907 to the left of z = 1.23; the
second shows the area 1 – 0.8907 = 0.1093 to the right of z = 1.23.]

∴ P(Z < 1.23) = 0.8907 and P(Z ≥ 1.23) = 0.1093



WORKED EXAMPLE 8.2

Given that Z ~ N(0, 1), find P(0.4 ≤ Z < 1.7) correct to 3 decimal places.

Answer

The required probability is equal to the difference between the area to the left of z = 1.7
and the area to the left of z = 0.4, as illustrated.

[Figure: normal curve with the area between z = 0.4 and z = 1.7 shaded.]

Φ(1.7) = 0.9554 and Φ(0.4) = 0.6554
We find the values of Φ(1.70) and Φ(0.40) in the main body of the table, which means that
we do not need to use the ADD section.

P(0.4 ≤ Z < 1.7) = P(Z < 1.7) – P(Z < 0.4)
= Φ(1.7) – Φ(0.4)
= 0.9554 – 0.6554
= 0.300

As noted previously, the normal distribution function tables do not show values for z < 0 .
However, we can use the symmetry properties of the normal curve, and the fact that the
area under the curve is equal to 1, to find values of Φ( z ) when z is negative.

Situations in which z > 0 , and in which z < 0 , are illustrated in the two diagrams below.

For a positive value, z = b:
The shaded area in this graph represents the value of Φ(b).
[Figure: normal curve with the area to the left of z = b shaded.]
Φ(b) = P(Z ≤ b) and 1 – Φ(b) = P(Z ≥ b).

For a negative value, z = –a:
The shaded area in this graph represents the value of Φ(a).
[Figure: normal curve with the area to the right of z = –a shaded.]
Φ(a) = P(Z ≥ –a) and 1 – Φ(a) = P(Z ≤ –a).

From the tables, the one piece of information, Φ(0.11) = 0.5438, actually tells us four
probabilities:

P(Z ≤ 0.11) = 0.5438 and P(Z ≥ –0.11) = 0.5438.

P(Z ≥ 0.11) = 1 – 0.5438 = 0.4562 and P(Z ≤ –0.11) = 1 – 0.5438 = 0.4562.
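
The four probabilities obtained from Φ(0.11) can be confirmed numerically. This sketch uses `statistics.NormalDist` from the Python standard library as a stand-in for the tables; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal variable: mean 0, standard deviation 1
z = 0.11

p_le_pos = Z.cdf(z)        # P(Z <= 0.11)
p_ge_neg = 1 - Z.cdf(-z)   # P(Z >= -0.11)
p_ge_pos = 1 - Z.cdf(z)    # P(Z >= 0.11)
p_le_neg = Z.cdf(-z)       # P(Z <= -0.11)

print(f"P(Z <= 0.11)  = {p_le_pos:.4f}")
print(f"P(Z >= -0.11) = {p_ge_neg:.4f}")
print(f"P(Z >= 0.11)  = {p_ge_pos:.4f}")
print(f"P(Z <= -0.11) = {p_le_neg:.4f}")
```

The first pair of values match each other, as do the second pair, and each pair sums with the other to 1, which is the symmetry property used to extend the tables to negative z.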



Information given about probabilities in a normal distribution should always be


transferred to a sketched graph. Useful information, such as whether a particular value of
z is positive or negative, will then be easy to see. This could, of course, also be determined
by considering inequalities.

If, for example, P(Z ≥ z) > 0.5, then P(Z ≤ z) < 0.5 and, therefore, z < 0.

WORKED EXAMPLE 8.3

Given that Z ~ N(0, 1), find P(–1 ≤ Z < 2.115) correct to 3 significant figures.

Answer

The required probability is given by the difference between the area to the left of
z = 2.115 and the area to the left of z = –1.

The first of these areas is greater than 0.5 and the second is less than 0.5.

[Figure: normal curve with the area between z = –1 and z = 2.115 shaded.]

P(Z < 2.115) = Φ(2.115) and P(Z < –1) = 1 – Φ(1).

P(–1 ≤ Z < 2.115) = Φ(2.115) – [1 – Φ(1)]
= Φ(2.115) + Φ(1) – 1
= 0.9828 + 0.8413 – 1
= 0.824

WORKED EXAMPLE 8.4

Given that Z ~ N(0, 1), find the value of a such that P(Z < a) = 0.9072.

Answer

0.9066 = Φ(1.32)
To find a value of z, we use the tables in reverse and search for the Φ(z) value closest to
0.9072, which is 0.9066.

a = 1.32 + 0.004
= 1.324
For our value of 0.9072, we need to add 0.0006 to 0.9066, so ‘ADD 6’ is required; this will
be done if 1.32 is given a 4th figure of 4.

We can check the value obtained for a by reading the tables in the usual way.
Φ(1.324) = Φ(1.32) + ‘ADD 6’
= 0.9066 + 0.0006
= 0.9072

Alternatively, we can show our working using inverse notation.
a = Φ⁻¹(0.9072)
= Φ⁻¹(0.9066 + 0.0006)
= Φ⁻¹(0.9066) + 0.004
= 1.32 + 0.004
= 1.324
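
Using the tables in reverse corresponds to evaluating the inverse function Φ⁻¹. As a check on the value of a found in this worked example, the sketch below calls `inv_cdf` on `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal variable

# Find a such that P(Z < a) = 0.9072, i.e. a = inverse of Phi at 0.9072.
a = Z.inv_cdf(0.9072)

# Reading the result "in the usual way" should recover the original probability.
check = Z.cdf(a)

print(f"a = {a:.3f}")
print(f"Phi(a) = {check:.4f}")
```

The computed value should agree with the table-based answer to about three decimal places.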

WORKED EXAMPLE 8.5

Given that Z ~ N(0, 1), find the value of b such that P(Z ≥ b) = 0.7713.

Answer

P(Z ≥ b) > 0.5 tells us that b is negative, so on our diagram we can replace b by –a,
resulting in the two situations shown.

[Figure: two normal curves. The first shows the area 0.7713 to the right of z = –a; the
second shows the area 0.7713 to the left of z = a.]

P(Z ≤ a) = 0.7713, so a = Φ⁻¹(0.7713)
= Φ⁻¹(0.7704 + 0.0009)
= Φ⁻¹(0.7704) + 0.003
= 0.740 + 0.003
= 0.743
The value closest to 0.7713 in the tables is 0.7704. This requires the addition of 0.0009 to
bring it up to 0.7713, and in the ADD section the number 9 appears in the column headed 3.

∴ b = –a = –0.743

EXERCISE 8B

1 Given that Z ~ N(0, 1), find the following probabilities correct to 3 significant figures.
a P(Z < 0.567) b P(Z ≤ 2.468) c P(Z > –1.53) d P(Z ≥ –0.077)
e P(Z > 0.817) f P(Z ≥ 2.009) g P(Z < –1.75) h P(Z ≤ –0.013)
i P(Z < 1.96) j P(Z > 2.576)

2 The random variable Z is normally distributed with mean 0 and variance 1. Find the following probabilities,
correct to 3 significant figures.
a P(1.5 < Z < 2.5) b P(0.046 < Z < 1.272) c P(1.645 < Z < 2.326)
d P(–2.807 < Z < –1.282) e P(–1.777 < Z < –0.746) f P(–1.008 < Z < –0.337)
g P(–1.2 < Z < 1.2) h P(–1.667 < Z < 2.667) i P(–3/4 ≤ Z < 8/5)
j P(2 ≤ Z < 5)

3 Given that Z ~ N(0, 1), find the value of k, given that:
a P(Z < k) = 0.9087 b P(Z < k) = 0.5442 c P(Z > k) = 0.2743 d P(Z > k) = 0.0298
e P(Z < k) = 0.25 f P(Z < k) = 0.3552 g P(Z > k) = 0.9296 h P(Z > k) = 0.648
i P(–k < Z < k) = 0.9128 j P(–k < Z < k) = 0.6994

4 Find the value of c in each of the following where Z has a normal distribution with µ = 0 and σ² = 1.
a P( c < Z < 1.638) = 0.2673 b P( c < Z < 2.878) = 0.4968
c P(1 < Z < c ) = 0.1408 d P(0.109 < Z < c ) = 0.35
e P( c < Z < 2) = 0.6687 f P( c < Z < 1.85) = 0.9516
g P(–1.221 < Z < c ) = 0.888 h P(–0.674 < Z < c ) = 0.725
i P(–2.63 < Z < c ) = 0.6861 j P(–2.7 < Z < c ) = 0.0252

Standardising a normal distribution

The probability distribution of a normally distributed random variable is represented by a
normal curve. This curve is centred on the mean µ; the area under the curve is equal to 1,
and its height is determined by the standard deviation σ.

REWIND
In Chapter 2, Section 2.2 and in Chapter 3, Section 3.3, we saw how the coding of data by
addition and/or multiplication affects the mean and the standard deviation.

We already have a method for finding probabilities involving the standard normal variable
Z ~ N(0, 1) using the normal distribution function tables. Fortunately, this same set of
tables can be used to find probabilities involving any normal random variable, no matter
what the values of µ and σ². Although we have only learnt about coding data, it turns out
that coding works in exactly the same way for normally distributed random variables: they
behave in the way that we expect and remain normal after coding.

FAST FORWARD
We will learn more about coding random variables in the Probability & Statistics 2
Coursebook, Chapter 3.

If we code X by subtracting µ, then the PDF is translated horizontally by –µ units and is
now centred on 0. The new random variable X – µ has mean 0 and standard deviation σ.

If we now code X – µ by multiplying by 1/σ (i.e. dividing by σ) then the standard
deviation (and variance) will be equal to 1, while the mean remains 0.

The coded random variable (X − µ)/σ is normally distributed with mean 0 and variance 1.

Coding the random variable X in this way is called standardising, because it transforms the
distribution X ~ N(µ, σ²) to Z ~ N(0, 1).

KEY POINT 8.3
When X ~ N(µ, σ²) then Z = (X − µ)/σ has a standard normal distribution.
A standardised value z = (x − µ)/σ tells us how many standard deviations x is from the mean.

REWIND
In the table showing properties and probabilities of normal distributions prior to
Explore 8.3, we saw that probabilities are determined by the number of standard deviations
from the mean. The properties given in that table apply to all normal random variables.

Probabilities involving values of X are equal to probabilities involving the corresponding
values of Z, which can be found from the normal distribution function tables for
Z ~ N(0, 1).

For example, if X ~ N(20, 9), then P(X < 23) = P(Z < (23 − 20)/√9) = P(Z < 1).

WORKED EXAMPLE 8.6

Given that X ~ N(11, 25), find P(X < 18) correct to 3 significant figures.

Answer

z = (18 − 11)/√25 = 1.4
We standardise x = 18 and find that it is 1.4σ above the mean of 11.

P(X < 18) = P(Z < 1.4)
= Φ(1.4)
= 0.919
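
Standardising can be expressed as a one-line function. The sketch below, in which the function name `standardise` is hypothetical, repeats the calculation for X ~ N(11, 25) and confirms that the probability found via Z matches the one obtained by working with X directly in Python's `statistics.NormalDist`. Note that the second parameter of N(µ, σ²) is the variance, whereas `NormalDist` expects the standard deviation.

```python
import math
from statistics import NormalDist


def standardise(x, mu, variance):
    """Return z = (x - mu) / sigma, where sigma is the square root of the variance."""
    return (x - mu) / math.sqrt(variance)


# X ~ N(11, 25): mean 11, variance 25, so the standard deviation is 5.
z = standardise(18, 11, 25)

p_via_z = NormalDist().cdf(z)                   # P(Z < z) for the standard normal variable
p_direct = NormalDist(mu=11, sigma=5).cdf(18)   # P(X < 18) without standardising

print(f"z = {z}")
print(f"P(X < 18) via Z = {p_via_z:.3f}")
print(f"P(X < 18) direct = {p_direct:.3f}")
```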

WORKED EXAMPLE 8.7

Given that X ~ N(20, 7), find P(X < 16.6) correct to 3 significant figures.

Answer

z = (16.6 − 20)/√7 = –1.285
We standardise x = 16.6 and find that it is 1.285σ below the mean of 20.

P(X < 16.6) = P(Z < –1.285)
= 1 – Φ(1.285)
= 0.0994

WORKED EXAMPLE 8.8

Given that X ~ N(5, 5), find P(2 ≤ X < 9) correct to 3 significant figures.

Answer

For x = 2, z = (2 − 5)/√5 = –1.342
For x = 9, z = (9 − 5)/√5 = 1.789

The required area is shown in two parts in the diagram.

[Figure: normal curve with the area between x = 2 and x = 9 shaded in two parts, either side
of the mean x = 5. The corresponding z-values are –1.342, 0 and 1.789.]

TIP
Where possible, always use a 4-figure value for z.

Area to the right of z = 0 is
Φ(1.789) – Φ(0) = Φ(1.789) − 0.5
= 0.4633
Area to the left of z = 0 is
Φ(0) – Φ(–1.342) = 0.5 – [1 – Φ(1.342)]
= 0.4102
Total area = 0.4633 + 0.4102 = 0.8735
∴ P(2 ≤ X < 9) = 0.874

Here, we find the two areas separately then add them to obtain our final answer, which is
where we round to the accuracy specified in the question.

Try solving this problem by the method shown in Worked example 8.3 (using the areas to
the left of both z = 1.789 and z = –1.342) and decide which approach you prefer.
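
The two approaches can be compared numerically. The sketch below uses `statistics.NormalDist` in place of the tables, with illustrative variable names. Because it works with unrounded z-values, the result may differ slightly in the last figure from an answer built up from 4-figure table values.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal variable

mu, variance = 5, 5
sigma = math.sqrt(variance)

z_low = (2 - mu) / sigma    # standardised value of x = 2
z_high = (9 - mu) / sigma   # standardised value of x = 9

# Approach 1: add the two areas either side of z = 0.
area_right = Z.cdf(z_high) - 0.5
area_left = 0.5 - Z.cdf(z_low)
two_part_total = area_left + area_right

# Approach 2: subtract the area to the left of z_low from the area to the left of z_high.
difference = Z.cdf(z_high) - Z.cdf(z_low)

print(f"z-values: {z_low:.3f} and {z_high:.3f}")
print(f"two-part total = {two_part_total:.4f}")
print(f"difference     = {difference:.4f}")
```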

Some useful results from previous worked examples are detailed in the following graphs.

For 0 < a < b:    P(a < Z < b) = Φ(b) – Φ(a)
For –a < 0 < b:   P(–a < Z < b) = Φ(b) + Φ(a) – 1
For –a < 0 < a:   P(–a < Z < a) = 2Φ(a) – 1

[Figure: three normal curves with the shaded areas between a and b, between –a and b, and
between –a and a.]

WORKED EXAMPLE 8.9

Given that Y ~ N(µ, σ²), P(Y < 10) = 0.75 and P(Y ≥ 12) = 0.1, find the values of µ and σ.

Answer

P(Y ≥ 12) < 0.5, so P(Y < 12) > 0.5, which means that 12 > µ.

P(Y < 10) > 0.5, which means also that 10 > µ.

[Diagrams: two normal curves. In the first, the area to the left of y = 10 (z = z_a) is 0.75 and 10 lies to the right of µ (z = 0). In the second, the area to the left of y = 12 (z = z_b) is 0.90 and 12 lies to the right of µ (z = 0).]

These simple sketch graphs allow us to locate the values 10 and 12 relative to µ.

$z_a = \Phi^{-1}(0.75) = 0.674$ and $z_b = \Phi^{-1}(0.90) = 1.282$

Note that both 0.75 and 0.90 are critical values.

$\frac{10 - \mu}{\sigma} = 0.674$ gives $10 - \mu = 0.674\sigma$ …… [1]
$\frac{12 - \mu}{\sigma} = 1.282$ gives $12 - \mu = 1.282\sigma$ …… [2]

We subtract equation [1] from [2] to solve this pair of simultaneous equations.

12 – µ = 1.282σ [2]
10 – µ = 0.674σ [1]
2 = 0.608σ

∴ σ = 3.29 and µ = 7.78.
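
The same pair of simultaneous equations can be solved numerically. The sketch below assumes Python's `statistics.NormalDist.inv_cdf` plays the role of Φ⁻¹; the function name `solve_mean_sd` is hypothetical and introduced only for illustration.

```python
from statistics import NormalDist


def solve_mean_sd(x1, p1, x2, p2):
    """Solve P(Y < x1) = p1 and P(Y < x2) = p2 for (mu, sigma),
    where Y ~ N(mu, sigma^2) and p1 != p2."""
    z1 = NormalDist().inv_cdf(p1)   # z_a = Phi^-1(p1)
    z2 = NormalDist().inv_cdf(p2)   # z_b = Phi^-1(p2)
    # Subtract x1 - mu = z1*sigma from x2 - mu = z2*sigma to eliminate mu.
    sigma = (x2 - x1) / (z2 - z1)
    mu = x1 - z1 * sigma
    return mu, sigma


# Worked example 8.9: P(Y < 10) = 0.75 and P(Y < 12) = 1 - 0.1 = 0.90
mu, sigma = solve_mean_sd(10, 0.75, 12, 0.90)
print(round(mu, 2), round(sigma, 2))  # prints 7.78 3.29
```

Note that P(Y ≥ 12) = 0.1 is first converted to the left-tail probability 0.90, exactly as in the written solution.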

EXERCISE 8C

1 Standardise the appropriate value(s) of the normal variable X represented in each diagram, and find the
required probabilities correct to 3 significant figures.
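a Find P(X ≤ 11), given that X ~ N(8, 25).

[Diagram: normal curve with x = 8 and x = 11 marked; on the z-axis, 0 is marked and the z-value for x = 11 is left blank.]

b Find P(X < 69.1), given that X ~ N(72, 11).

[Diagram: normal curve with x = 69.1 and x = 72 marked; on the z-axis, the z-value for x = 69.1 is left blank and 0 is marked.]

c Find P(3 < X < 7), given that X ~ N(5, 5).

[Diagram: normal curve with x = 3, x = 5 and x = 7 marked; on the z-axis, 0 is marked and the z-values for x = 3 and x = 7 are left blank.]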

From this point in the exercise, you are strongly advised to sketch a diagram to help answer each question.

2 Calculate the required probabilities correct to 3 significant figures.


a Find P(X ≤ 9.7) and P(X > 9.7), given that X ~ N(6.2, 6.25).
b Find P(X ≤ 5) and P(X > 5), given that X ~ N(3, 49).
c Find P(X > 33.4) and P(X ≤ 33.4), given that X ~ N(37, 4).
d Find P(X < 13.5) and P(X ≥ 13.5), given that X ~ N(20, 15).
e Find P(X > 91) and P(X ≤ 91), given that X ~ N(80, 375).
f Find P(1 ≤ X < 21), given that X ~ N(11, 25).
g Find P(2 ≤ X < 5), given that X ~ N(3, 7).
h Find P(6.2 ≥ X ≥ 8.8), given that X ~ N(7, 1.44). [Read carefully.]
i Find P(26 ≤ X < 28), given that X ~ N(25, 6).
j Find P(8 ≤ X < 10), given that X ~ N(12, 2.56).

3 a Find a, given that X ~ N(30, 16) and that P( X < a ) = 0.8944.


b Find b, given that X ~ N(12, 4) and that P(X < b ) = 0.9599.
c Find c, given that X ~ N(23, 9) and that P( X > c ) = 0.9332.
d Find d , given that X ~ N(17, 25) and that P( X > d ) = 0.0951.
e Find e, given that X ~ N(100, 64) and that P( X > e ) = 0.95.

4 a Find f, given that X ~ N(10, 7) and that P(f ≤ X < 13.3) = 0.1922.
b Find g, given that X ~ N(45, 50) and that P(g ≤ X < 55) = 0.5486.
c Find h, given that X ~ N(7, 2) and that P(8 ≤ X < h) = 0.216.
d Find j, given that X ~ N(20, 11) and that P(j ≤ X < 22) = 0.5.

5 X is normally distributed with mean 4 and variance 6. Find the probability that X takes a negative value.

6 Given that X ~ N(µ, (4/9)µ²), where µ > 0, find P(X < 2µ).

7 If T ~ N(10, σ²) and P(T > 14.7) = 0.04, find the value of σ.

8 It is given that V ~ N(µ, 13) and P(V < 15) = 0.75. Find the value of µ.

9 The variable W ~ N(µ, σ²). Given that µ = 4σ and P(W < 83) = 0.95, find the value of µ and of σ.

10 X has a normal distribution in which σ = µ – 30 and P(X ≥ 12) = 0.9. Find the value of µ and of σ.

11 The variable Q ~ N(µ, σ²). Given that P(Q < 1.288) = 0.281 and P(Q < 6.472) = 0.591, find the value of µ
and of σ, and calculate P(4 ≤ Q < 5).

12 For the variable V ~ N(µ, σ²), it is given that P(V < 8.4) = 0.7509 and P(V > 9.2) = 0.1385.
Find the value of µ and of σ, and calculate P(V ≤ 10).

13 Find the value of µ and of σ and calculate P(W > 6.48) for the variable W ~ N(µ, σ²), given that
P(W ≥ 4.75) = 0.6858 and P(W ≤ 2.25) = 0.0489.

14 X has a normal distribution, such that P(X > 147.0) = 0.0136 and P(X ≤ 59.0) = 0.0038.
Use this information to calculate the probability that 80.0 ≤ X < 130.0.

8.3 Modelling with the normal distribution


The German mathematician Carl Friedrich Gauss showed that measurement errors made
in astronomical observations were well modelled by a normal distribution, and the Belgian
statistician and sociologist Adolphe Quételet later applied this to human characteristics
when he saw that distributions of such things as height, weight, girth and strength were
approximately normal.

We are now in a position to apply our knowledge to real-life situations, and to solve more
advanced problems involving the normal distribution.

WORKED EXAMPLE 8.10

The mass of a newborn baby in a certain region is normally distributed with mean
3.35 kg and variance 0.0858 kg². Estimate how many of the 1356 babies born last year
had masses of less than 3.5 kg.

Answer

$\Phi(z) = \Phi\left(\frac{3.5 - 3.35}{\sqrt{0.0858}}\right)$
= Φ(0.512)
= 0.6957

We standardise the mass of 3.5 kg.
0.6957 is a relative frequency equal to 69.57%.

P(mass < 3.5 kg) = 0.6957
69.57% of 1356 = 943.3692
∴ About 943 of the newborn babies had masses of less than 3.5 kg.

TIP
We cannot know the exact number of newborn babies from the model because it only gives estimates. However, we do know that the number of babies must be an integer.
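
The estimate in Worked example 8.10 can be reproduced in a few lines. This is a minimal sketch, assuming Python's `statistics.NormalDist` replaces the tables; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mean_kg = 3.35
variance_kg2 = 0.0858
babies_born = 1356

# NormalDist takes the standard deviation, so convert the variance first.
mass_model = NormalDist(mu=mean_kg, sigma=sqrt(variance_kg2))

p_under = mass_model.cdf(3.5)            # P(mass < 3.5 kg)
expected_count = p_under * babies_born   # model estimate, not an exact count

print(round(p_under, 4), round(expected_count))
```

Rounding the expected count to the nearest integer reflects the point made in the TIP: the model gives an estimate, but a number of babies must be a whole number.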

WORKED EXAMPLE 8.11

A factory produces half-litre tins of oil. The volume of oil in a tin is normally
distributed with mean 506.18 ml and standard deviation 2.96 ml.

a What percentage of the tins contain less than half a litre of oil?

b Find the probability that exactly 1 out of 3 randomly selected tins contains less
than half a litre of oil.
Answer

a Let X represent the amount of oil in a tin, then X ~ N(506.18, 2.96²).

$z = \frac{500 - 506.18}{2.96} = -2.088$

The graph shows the probability distribution for the amount of oil in a tin.

[Diagram: normal curve with x = 500 and the mean x = 506.18 marked; the corresponding z-values are –2.088 and 0.]

P(X < 500) = P(Z < –2.088)
= 1 – Φ(2.088)
= 1 – 0.9816
= 0.0184
∴ 1.84% of the tins contain less than half a litre of oil.

b Let the discrete random variable Y be the number of tins containing less than half a litre of oil, then Y ~ B(3, 0.0184).

$P(Y = 1) = \binom{3}{1} \times 0.0184^{1} \times 0.9816^{2}$
= 0.0532

TIP
A probability obtained from a normal distribution can be used as the parameter p in a binomial distribution.
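
Both parts of Worked example 8.11 can be chained together in code: the normal model supplies p, and the binomial formula uses it. The sketch below assumes Python's `statistics.NormalDist` and `math.comb`; the helper `binomial_pmf` is a hypothetical name introduced for illustration.

```python
from math import comb
from statistics import NormalDist

# Part a: X ~ N(506.18, 2.96^2), volume of oil in ml.
volume_model = NormalDist(mu=506.18, sigma=2.96)
p_underfilled = volume_model.cdf(500)   # P(X < 500)


def binomial_pmf(k, n, p):
    """Return P(Y = k) for Y ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Part b: Y ~ B(3, p_underfilled), exactly 1 underfilled tin out of 3.
p_exactly_one = binomial_pmf(1, 3, p_underfilled)

print(round(p_underfilled, 4), round(p_exactly_one, 4))
```

Because the code carries the unrounded value of p into the binomial step, its final figure may differ very slightly from working that uses p = 0.0184.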

EXERCISE 8D

1 The length of a bolt produced by a machine is normally distributed with mean 18.5 cm and variance 0.7 cm².
Find the probability that a randomly selected bolt is less than 18.85 cm long.

2 The waiting times, in minutes, for patients at a clinic are normally distributed with mean 13 and variance 16.
a Calculate the probability that a randomly selected patient has to wait for more than 16.5 minutes.
b Last month 468 patients attended the clinic. Calculate an estimate of the number who waited for less than
9 minutes.

3 Tomatoes from a certain producer have masses which are normally distributed with mean 90 grams and
standard deviation 17.7 grams. The tomatoes are sorted into three categories by mass, as follows:
Small: under 80 g; Medium: 80 g to 104 g; Large: over 104 g.
a Find, correct to 2 decimal places, the percentage of tomatoes in each of the three categories.
b Find the value of k such that P(k ≤ X < 104) = 0.75, where X is the mass of a tomato in grams.

4 The heights, in metres, of the trees in a forest are normally distributed with mean µ and standard deviation 3.6.
Given that 75% of the trees are less than 10 m high, find the value of µ.

5 The mass of a certain species of fish caught at sea is normally distributed with mean 5.73 kg and variance
2.56 kg². Find the probability that a randomly selected fish caught at sea has a mass that is:
a less than 6.0 kg b more than 3.9 kg c between 7.0 and 8.0 kg

6 The distance that children at a large school can hop in 15 minutes is normally distributed with mean 199 m
and variance 3700 m².
a Calculate an estimate of b, given that only 25% of the children hopped further than b metres.
b Find an estimate of the interquartile range of the distances hopped.

7 The daily percentage change in the value of a company’s shares is expected to be normally distributed with
mean 0 and standard deviation 0.51. On how many of the next 365 working days should the company expect
the value of its shares to fall by more than 1%?

8 The masses, w grams, of a large sample of apples are normally distributed with mean 200 and variance 169.
Given that the masses of 3413 apples are in the range 187 < w < 213, calculate an estimate of the number of
apples in the sample.

9 The ages of the children in a gymnastics club are normally distributed with mean 15.2 years and standard
deviation σ . Find the value of σ given that 30.5% of the children are less than 13.5 years of age.

10 The speeds, in km h⁻¹, of vehicles passing a particular point on a rural road are normally distributed with
mean µ and standard deviation 20. Find the value of µ and find what percentage of the vehicles are being
driven at under 80 km h⁻¹, given that 33% of the vehicles are being driven at over 100 km h⁻¹.

11 Coffee beans are packed into bags by the workers on a farm, and each bag claims to contain 200 g. The actual
mass of coffee beans in a bag is normally distributed with mean 210 g and standard deviation σ . The farm
owner informs the workers that they must repack any bag containing less than 200 g of coffee beans. Find the
value of σ , given that 0.5% of the bags must be repacked.

12 Colleen exercises at home every day. The length of time she does this is normally distributed with mean 12.8
minutes and standard deviation σ . She exercises for more than 15 minutes on 42 days in a year of 365 days.
a Calculate the value of σ .
b On how many days in a year would you expect Colleen to exercise for less than 10 minutes?

13 The times taken by 15-year-olds to solve a certain puzzle are normally distributed with mean µ and standard
deviation 7.42 minutes.
a Find the value of µ , given that three-quarters of all 15-year-olds take over 20 minutes to solve the puzzle.
b Calculate an estimate of the value of n, given that 250 children in a random sample of n 15-year-olds fail to
solve the puzzle in less than 30 minutes.

14 The lengths, X cm, of the leaves of a particular species of tree are normally distributed with mean µ and variance σ².

a Find P(µ – σ ≤ X < µ + σ).
b Find the probability that a randomly selected leaf from this species has a length that is more than
2 standard deviations from the mean.
c Find the value of µ and of σ , given that P( X < 7.5) = 0.75 and P( X < 8.5) = 0.90.

15 The time taken in seconds for Ginger’s computer to open a specific large document is normally distributed
with mean 9 and variance 5.91.
a Find the probability that it takes exactly 5 seconds or more to open the document.
b Ginger opens the document on her computer on n occasions. The probability that it fails to open in less
than exactly 5 seconds on at least one occasion is greater than 0.5. Find the least possible value of n.

16 The masses of all the different pies sold at a market are normally distributed with mean 400 g and standard
deviation 61 g. Find the probability that:
a the mass of a randomly selected pie is less than 425 g
b 4 randomly selected pies all have masses of less than 425 g
c exactly 7 out of 10 randomly selected pies have masses of less than 425 g.

17 The height of a female university student is normally distributed with mean 1.74 m and standard deviation
12.3 cm. Find the probability that:
a a randomly selected female student is between 1.71 and 1.80 metres tall
b 3 randomly selected female students are all between 1.71 and 1.80 m tall
c exactly 15 out of 50 randomly selected female students are between 1.71 and 1.80 metres tall.
