Lecture 05
Continuous Probability Distributions
Introduction
We learnt earlier that in statistics there are two broad categories of data: discrete
and continuous. In the last chapter we discussed how to handle discrete probability
distributions, along with some useful discrete distributions. In this chapter we will do
a similar exercise for continuous variables. We recognize one fact at the outset.
Because a discrete variable takes separate values, we performed a discrete summation
wherever one was needed. For a continuous variable a discrete summation is not
possible; we need a continuous summation, i.e. integration.
There is a second, and more important, difference between discrete and continuous
probability. A discrete variable can take a particular, fixed, specified value, such as
3, 27, 19, or 51. As a result, we ended up finding the probability of the random
variable having these particular values. The case of a continuous variable is a little
more interesting. Suppose we are investigating the height of people. We have found that
the heights range from 1.3 m to 1.7 m, with an average of 1.45 m. We choose one person
at random and ask: what is the probability that the height of the person is 1.45 m? To
answer this question, we use the classical definition of probability that we learnt in
lecture 3.
$$p(A) = \frac{\text{No. of ways the event can occur}}{\text{No. of ways the experiment can proceed}}$$
There is only one way the event can occur, i.e. the height of the person is exactly
1.45 m. When we say 1.45 m, we mean exactly 1.45 m, with no variation. In the
denominator, there are infinitely many possible heights between 1.3 m and 1.7 m.
Therefore, the probability turns out to be 0. In fact, it is a hallmark of continuous
probability that the probability of the random variable having a particular value is
always zero.
Keeping this in mind, to understand how to handle probability of continuous variable,
let us consider a distribution of a discrete variable, as shown in figure 1(a). The
grouping of the variable is natural in this case. But if the variable is continuous, we
can imagine the distribution to be a limiting case of the discrete distribution with the
width of the bar becoming narrower and narrower. In the limit, the distribution would
be as shown in figure 1(b).
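[Figure 1. (a) Distribution of a discrete variable; (b) the limiting distribution for a continuous variable.]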
Keeping both the figures in mind, we can develop a little different interpretation for
probability in case of a discrete variable that would be equally applicable for the case
of a continuous variable also. Let us assume that for any value of x = x0, p(x = x0) =
p0. This would correspond to one of the bars of the histogram. Since the width of the
bar has no physical significance, we can arbitrarily consider the width unity, and the
height p0. This means that the probability p0 is the area of the bar. In case of a
continuous variable, each number will be a valid value of the random variable; and
each number will correspond to a vertical line. We know from basic geometry that the
area of a line is 0. Therefore, once again, we establish that in case of a continuous
variable, probability of the random variable having a particular value is 0.
Properties 1 and 2 are necessary and sufficient conditions for the function f(x) to be a
continuous density. It is evident that the term ‘density’ in the continuous case is just
an extension of the word ‘distribution’ used in the discrete case, with summation
replaced by integration. This is an important notion, as it will allow us to define all
the other mathematical concepts.
Example 6.1
Lead is added to gasoline on purpose. It acts as a catalyst to accelerate the
combustion of the fuel. But lead in the atmosphere is a major pollutant. Therefore,
environmental regulators are paying extra attention to lead content in the gasoline.
The lead concentration in gasoline currently ranges from 0.1 gms per liter to 0.5 gms
per liter, with a density that increases linearly over this range. If a sample of
gasoline is chosen at random, what is the probability that the lead concentration will
be between 0.2 and 0.3 gms/l?
[Figure 2. Example 6.1: (a) the density f(x) against x; (b) the area under f(x) between x = 0.2 and 0.3.]
the function between 0.1 and 0.5 in such a way that the area under the distribution
is 1. With a little algebra, we can derive the following expression for f(x).
$$f(x) = \begin{cases} 12.5x - 1.25 & 0.1 < x < 0.5 \\ 0 & \text{otherwise} \end{cases}$$
We can verify conditions (1) and (2) above for this function. The function is never
negative: it is positive between x = 0.1 and x = 0.5 and zero elsewhere. We can also
verify condition (2) as follows.
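$$\begin{aligned}
\int_{-\infty}^{\infty} f(x)\,dx &= \int_{-\infty}^{0.1} f(x)\,dx + \int_{0.1}^{0.5} f(x)\,dx + \int_{0.5}^{\infty} f(x)\,dx \\
&= 0 + \int_{0.1}^{0.5} f(x)\,dx + 0 \\
&= \int_{0.1}^{0.5} (12.5x - 1.25)\,dx = \left[ 12.5\,\frac{x^2}{2} - 1.25x \right]_{0.1}^{0.5} = 1
\end{aligned}$$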
The details of integrations are skipped here. To find the probability that has been
asked for, we write,
$$p(0.2 < x < 0.3) = \int_{0.2}^{0.3} (12.5x - 1.25)\,dx = \left[ 12.5\,\frac{x^2}{2} - 1.25x \right]_{0.2}^{0.3} = 0.1875$$
This probability is the area under the function between x = 0.2 and 0.3, as shown in
figure 2(b).
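The two integrals of Example 6.1 can also be checked numerically. The sketch below is only an illustration: the helper `integrate` and its midpoint rule are our own choice, not part of the lecture, and any other quadrature rule would serve equally well.

```python
def f(x):
    """Density of Example 6.1: linear on (0.1, 0.5), zero elsewhere."""
    return 12.5 * x - 1.25 if 0.1 < x < 0.5 else 0.0


def integrate(func, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


total_area = integrate(f, 0.1, 0.5)   # condition (2): area under the density
prob = integrate(f, 0.2, 0.3)         # p(0.2 < x < 0.3)
print(round(total_area, 4), round(prob, 4))
```

Because the density is linear, the midpoint rule reproduces both areas up to floating-point rounding.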
We explain one point here. In the example above, we were not very careful about the
end points x = 0.2 and 0.3. While finding the probability, we never made clear
whether the end points were included or not. The reason is that it would not have
made any difference, because we have already established that in the case of continuous
probability, p(x = a) = 0. Therefore, p(a ≤ x ≤ b) = p(a ≤ x < b) = p(a < x ≤ b) =
p(a < x < b).
Cumulative distribution
Example 6.2
We will evaluate the cumulative distribution for the function given in example 6.1.
For this example the density function was defined as
$$f(x) = \begin{cases} 12.5x - 1.25 & 0.1 < x < 0.5 \\ 0 & \text{otherwise} \end{cases}$$
Cumulative distribution has been defined as
$$F(x) = p(t < x) = \int_{-\infty}^{x} f(t)\,dt$$
For x < 0.1, this integral is 0, as f(t) = 0. For 0.1 < x < 0.5,
$$F(x) = \int_{0.1}^{x} (12.5t - 1.25)\,dt = 6.25x^2 - 1.25x + 0.0625$$
For x > 0.5, F(x) = 1.0. We now compile the complete result.
$$F(x) = \begin{cases} 0.0 & x < 0.1 \\ 6.25x^2 - 1.25x + 0.0625 & 0.1 < x < 0.5 \\ 1.0 & x > 0.5 \end{cases}$$
In the previous problem, the probability p(0.2 < x < 0.3) was found by direct
integration. Once we have evaluated the cumulative function, finding the probability
becomes easy. The probability can be found as
p(0.2 < x < 0.3) = p(x < 0.3) – p(x < 0.2)
= F(0.3) – F(0.2)
By direct substitution
F(0.3) = 6.25 (0.3)² – 1.25(0.3) + 0.0625 = 0.2500
F(0.2) = 6.25 (0.2)² – 1.25(0.2) + 0.0625 = 0.0625
[Figure 3. Example 6.2: the density f(x) (left panel) and the cumulative distribution F(x) (right panel), each plotted against x.]
Therefore,
p(0.2 < x < 0.3) = 0.2500 – 0.0625 = 0.1875
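As an illustrative sketch, the cumulative distribution of Example 6.2 can be coded as a small function and reused for any interval. The function name `F` mirrors the notation of the example; the handling of the end points is our own choice and is immaterial for a continuous variable.

```python
def F(x):
    """Cumulative distribution of Example 6.2."""
    if x <= 0.1:
        return 0.0
    if x >= 0.5:
        return 1.0
    return 6.25 * x**2 - 1.25 * x + 0.0625


# Probability of an interval as a difference of two cumulative values.
p = F(0.3) - F(0.2)
print(F(0.3), F(0.2), p)
```

Once `F` is available, every interval probability costs two function evaluations instead of a fresh integration.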
Expected values
In case of a continuous density function f(x), the expected value of a function H(x) is
defined as
$$E[H(x)] = \int_{-\infty}^{\infty} H(x) f(x)\,dx$$
As in the discrete case, the mean, variance and the moment generating function are
defined as
$$\mu = E(x) = \int_{-\infty}^{\infty} x f(x)\,dx$$
$$\sigma^2 = E\left[(x - \mu)^2\right] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = E(x^2) - [E(x)]^2$$
$$m_x(t) = E\left(e^{tx}\right) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx$$
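The shortcut form of the variance is stated above without proof. A short derivation, using only the definitions of the mean and of a density (area equal to 1), runs as follows:

$$\begin{aligned}
\sigma^2 &= \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx \\
&= \int_{-\infty}^{\infty} x^2 f(x)\,dx - 2\mu \int_{-\infty}^{\infty} x f(x)\,dx + \mu^2 \int_{-\infty}^{\infty} f(x)\,dx \\
&= E(x^2) - 2\mu \cdot \mu + \mu^2 \cdot 1 = E(x^2) - [E(x)]^2
\end{aligned}$$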
Example 6.3
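We will find the mean and variance for the density function of example 6.1.
$$\begin{aligned}
\mu = E(x) &= \int_{-\infty}^{\infty} x f(x)\,dx = \int_{0.1}^{0.5} x(12.5x - 1.25)\,dx = \int_{0.1}^{0.5} (12.5x^2 - 1.25x)\,dx \\
&= \left[ 12.5\,\frac{x^3}{3} - 1.25\,\frac{x^2}{2} \right]_{0.1}^{0.5} = 0.3667 \text{ gms/l}
\end{aligned}$$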
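To find the variance, we first find E(x²).
$$\begin{aligned}
E(x^2) &= \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_{0.1}^{0.5} x^2 (12.5x - 1.25)\,dx = \int_{0.1}^{0.5} (12.5x^3 - 1.25x^2)\,dx \\
&= \left[ 12.5\,\frac{x^4}{4} - 1.25\,\frac{x^3}{3} \right]_{0.1}^{0.5} = 0.14333 \text{ (gms/l)}^2
\end{aligned}$$
Therefore,
$$\sigma^2 = E(x^2) - [E(x)]^2 = 0.14333 - 0.36667^2 = 0.00889 \text{ (gms/l)}^2$$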
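The moments of Example 6.3 can be cross-checked with exact rational arithmetic. This is a sketch under our own naming (`moment`, `c1`, `c0` are illustrative helpers, not notation from the lecture); it evaluates the antiderivative of x^k (12.5x − 1.25) at the two limits.

```python
from fractions import Fraction

a, b = Fraction(1, 10), Fraction(1, 2)       # limits 0.1 and 0.5
c1, c0 = Fraction(25, 2), Fraction(5, 4)     # f(x) = c1*x - c0 on (a, b)


def moment(k):
    """E(x^k): integral of x^k * (c1*x - c0) over (a, b), from the antiderivative."""
    return (c1 * (b ** (k + 2) - a ** (k + 2)) / (k + 2)
            - c0 * (b ** (k + 1) - a ** (k + 1)) / (k + 1))


mean = moment(1)                  # E(x)
variance = moment(2) - mean ** 2  # E(x^2) - [E(x)]^2
print(float(mean), float(moment(2)), float(variance))
```

Working with fractions avoids the loss of digits that occurs when the rounded values of E(x) and E(x²) are subtracted; the exact variance is 2/225, about 0.00889.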
Exercises
5.1 Suppose that f(x) = $e^{-x}$ for 0 < x. Determine the following probabilities:
(a) p(1 < X) (b) p(1 < X < 2.5)
(c) p(X = 3) (d) p(X < 4)
5.2 Suppose that f(x) = x/8 for 3 < x < 5. Determine the following probabilities:
(a) p(X < 4) (b) p(X > 3.5)
(c) p(4 < X < 5) (d) p(X < 4.5)
(e) p(X < 3.5 or X > 4.5)
5.3 Suppose that f(x) = $e^{-(x - 4)}$ for 4 < x. Determine the following probabilities:
(a) p(1 < X) (b) p(2 ≤ X < 5)
(c) p(5 < X) (d) p(8 < X < 12)
(e) Determine x such that p(X < x) = 0.90.
5.4 Suppose that f(x) = $1.5x^2$ for –1 < x < 1. Determine the following probabilities:
(a) p(0 < X) (b) p(0.5 < X)
(c) p(–0.5 < X < 0.5) (d) p(X < –2)
(e) p(X < 0 or X > –0.5)
(f) Determine x such that p(x < X) = 0.05.
5.6 The probability density function of the net weight in pounds of a packaged
chemical herbicide is for pounds.
(a) Determine the probability that a package weighs more than 50 pounds.
(b) How much chemical is contained in 90% of all packages?
5.8 Determine the cumulative distribution function for the distribution in Exercise
5.1.
5.9 Determine the cumulative distribution function for the distribution in Exercise
5.2.
5.10 Determine the cumulative distribution function for the distribution in Exercise
5.3.
5.11 Determine the cumulative distribution function for the distribution in Exercise
5.5. Use the cumulative distribution function to determine the probability that a
component lasts more than 3000 hours before failure.
Determine the probability density function for each of the following cumulative
distribution functions.
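5.12 $F(x) = 1 - e^{-2x}$ for x > 0
5.13 $$F(x) = \begin{cases} 0.0 & x < 0 \\ 0.2x & 0 < x < 4 \\ 0.04x + 0.64 & 4 < x < 9 \\ 1.0 & 9 < x \end{cases}$$
5.14 $$F(x) = \begin{cases} 0.0 & x < -2 \\ 0.25x + 0.5 & -2 < x < 1 \\ 0.5x + 0.25 & 1 < x < 1.5 \\ 1.0 & 1.5 < x \end{cases}$$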
5.15 The gap width is an important property of a magnetic recording head. In coded
units, if the width is a continuous random variable over the range from 0 < x < 2
with f(x) = 0.5x, determine the cumulative distribution function of the gap width.
5.16 Suppose f(x) = 0.125x for 0 < x < 4. Determine the mean and variance of X.
5.17 Suppose f(x) = $1.5x^2$ for –1 < x < 1. Determine the mean and variance of X.
5.19 Suppose that contamination particle size (in micrometers) can be modeled as f(x)
= $2x^{-3}$ for 1 < x. Determine the mean of X.
5.20 Integration by parts is required. The probability density function for the
diameter of a drilled hole in millimeters is $10e^{-10(x - 5)}$ for x > 5 mm. Although the
target diameter is 5 millimeters, vibrations, tool wear, and other nuisances
produce diameters larger than 5 millimeters.
(a) Determine the mean and variance of the diameter of the holes.
(b) Determine the probability that a diameter exceeds 5.1 millimeters.
5.21 Suppose the probability density function of the length of computer cables is f (x)
= 0.1 from 1200 to 1210 millimeters.
(a) Determine the mean and standard deviation of the cable length.
(b) If the length specifications are 1195 < x < 1205 millimeters, what proportion
of cables are within specifications?
Uniform distribution
A random variable x is said to be uniformly distributed over the interval (α, β) if its
probability density function is given by
$$f(x) = \begin{cases} \dfrac{1}{\beta - \alpha} & \alpha < x < \beta \\ 0 & \text{otherwise} \end{cases}$$
A graph of this function is given in figure 4. Note that the foregoing meets the
requirements of a continuous density function stated in conditions (1) and (2) above. A
uniform distribution arises in practice when we suppose that a particular random
variable is equally likely to be near any value in the interval (α, β). The mean,
variance and the moment generating function are
$$\mu = E(x) = \frac{\alpha + \beta}{2}$$
$$\sigma^2 = \frac{(\beta - \alpha)^2}{12}$$
$$m_x(t) = \frac{e^{t\beta} - e^{t\alpha}}{t(\beta - \alpha)}$$
[Figure 4. Uniform distribution: f(x) has constant height 1/(β − α) between x = α and x = β.]
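The formulas for the uniform distribution translate directly into a few helper functions. The sketch below is illustrative only: the interval (2, 8) is a made-up example, and the cumulative distribution (x − α)/(β − α) is not stated in the text above but follows from integrating the constant density.

```python
def uniform_mean(alpha, beta):
    return (alpha + beta) / 2


def uniform_variance(alpha, beta):
    return (beta - alpha) ** 2 / 12


def uniform_cdf(x, alpha, beta):
    """F(x) obtained by integrating the constant density 1/(beta - alpha)."""
    if x <= alpha:
        return 0.0
    if x >= beta:
        return 1.0
    return (x - alpha) / (beta - alpha)


alpha, beta = 2.0, 8.0   # hypothetical interval, chosen only for illustration
mean = uniform_mean(alpha, beta)
variance = uniform_variance(alpha, beta)
p_between = uniform_cdf(6.5, alpha, beta) - uniform_cdf(3.5, alpha, beta)
print(mean, variance, p_between)
```

The same three helpers cover most of the uniform-distribution exercises that follow, with α and β replaced by the limits given in each problem.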
Example 6.4
The current in a semiconductor diode is often modeled by the Shockley equation
$$I = I_0\left(e^{aV} - 1\right)$$
where V is the voltage across the diode; I0 is the reverse current; a is a constant; and I
is the resulting diode current. Find E(I) if a = 5, I0 = 10⁻⁶, and V is uniformly
distributed over (1, 3).
$$\begin{aligned}
E(I) &= E\left[I_0\left(e^{aV} - 1\right)\right] \\
&= I_0\,E\left(e^{aV} - 1\right) \\
&= I_0\left[E\left(e^{aV}\right) - 1\right] \\
&= 10^{-6}\left[\int_1^3 e^{5x}\,\frac{1}{2}\,dx - 1\right] \\
&= 10^{-6}\left[\frac{e^{15} - e^{5}}{10} - 1\right] \\
&= 0.3269
\end{aligned}$$
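The expectation in Example 6.4 is the moment generating function of the uniform distribution evaluated at t = a, so the arithmetic can be checked in a few lines. The variable names below are our own.

```python
import math

a, I0 = 5.0, 1e-6      # constants given in Example 6.4
lo, hi = 1.0, 3.0      # V is uniform over (lo, hi)

# E(e^{aV}) equals the uniform moment generating function m(t) at t = a.
mgf_at_a = (math.exp(a * hi) - math.exp(a * lo)) / (a * (hi - lo))
expected_current = I0 * (mgf_at_a - 1.0)
print(round(expected_current, 4))
```

Note how strongly the exponential weights the upper end of the voltage range: almost all of E(I) comes from the term with e¹⁵.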
Exercises
5.22 The net weight in pounds of a packaged chemical herbicide is uniform for
49.75 < x < 50.25 pounds.
(a) Determine the mean and variance of the weight of packages.
(b) Determine the cumulative distribution function of the weight of packages.
(c) Determine p(X < 50.1)
5.24 Suppose the time it takes a data collection operator to fill out an electronic form
for a database is uniformly distributed between 1.5 and 2.2 minutes.
(a) What is the mean and variance of the time it takes an operator to fill out the
form?
(b) What is the probability that it will take less than two minutes to fill out the
form?
(c) Determine the cumulative distribution function of the time it takes to fill out
the form.
5.25 The probability density function of the time it takes a hematology cell counter to
complete a test on a blood sample is f(x) = 0.04 for 50 < x < 75 seconds.
(a) What percentage of tests require more than 70 seconds to complete?
(b) What percentage of tests require less than one minute to complete?
(c) Determine the mean and variance of the time to complete a test on a sample.
5.27 The probability density function of the time required to complete an assembly
operation is f(x) = 0.1 for 30 < x < 40 seconds.
(a) Determine the proportion of assemblies that requires more than 35 seconds
to complete.
Exponential distribution
Example 6.6
Defective parts are produced at a factory on average every 5.3 minutes. If it is
assumed that the time between the production of two defective parts follows an
exponential distribution, find the probability that
(a) the time between the production of two defective items is less than 3.5
minutes,
(b) the time between the production of two defective items is more than 2
minutes,
(c) the time between the production of two defective items is between 2.5 and
4.5 minutes.
Since we have been told that the distribution is exponential with mean 5.3, we can
write the density as
$$f(x) = \frac{1}{5.3}\,e^{-x/5.3}$$
The cumulative distribution is
$$F(x) = 1 - e^{-x/5.3}$$
With these tools, we can start solving the problems.
(a) p(x < 3.5) = F(3.5) = $1 - e^{-3.5/5.3}$ = 1 – 0.5167 = 0.4833
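All three parts of Example 6.6 follow the same pattern: each probability is a value, a complement, or a difference of the cumulative distribution. A minimal sketch, with names of our own choosing:

```python
import math

beta = 5.3   # mean time between defective parts, in minutes


def F(x):
    """Exponential cumulative distribution with mean beta."""
    return 1.0 - math.exp(-x / beta) if x > 0 else 0.0


p_a = F(3.5)             # (a) less than 3.5 minutes
p_b = 1.0 - F(2.0)       # (b) more than 2 minutes
p_c = F(4.5) - F(2.5)    # (c) between 2.5 and 4.5 minutes
print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

Part (b) uses the complement because F gives the area to the left of a value, while the question asks for the area to the right.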
Example 6.7
The waiting time at gas stations is assumed to follow an exponential distribution. At
a particular gas station, the average waiting time is 3.8 minutes. Find the probability
of
(a) the waiting time would be more than 1 minute,
(b) the waiting time would be more than 2 minutes,
(c) the waiting time would be more than 3 minutes
(d) the waiting time would be more than 3 minutes, given that the waiting time is
more than 2 min.
The last result requires a little bit of interpretation. We were asked to find the
probability that the waiting time would be more than 3 minutes, given that we have
already waited for 2 minutes. This means that after a wait of 2 minutes, we have been
asked to find the probability of waiting for 1 more minute. This turns out to be exactly
the same as the probability of waiting for more than 1 minute when we have no
information about any earlier wait. It is as if the distribution has ‘forgotten’ that
2 minutes have already passed. This is called the ‘memoryless’ property of the
exponential distribution. In general, the memoryless property is
p(x > s + t | x > s) = p(x > t)
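The memoryless property can be verified numerically with the figures of Example 6.7 (mean waiting time 3.8 minutes). In this sketch the helper `survival` is our own name for p(x > t) = e^(−t/β).

```python
import math

beta = 3.8   # average waiting time in minutes


def survival(t):
    """p(x > t) for an exponential waiting time with mean beta."""
    return math.exp(-t / beta)


p1, p2, p3 = survival(1), survival(2), survival(3)

# Conditional probability p(x > 3 | x > 2) = p(x > 3) / p(x > 2).
p3_given_2 = p3 / p2
print(round(p1, 4), round(p3_given_2, 4))
```

The conditional probability and p(x > 1) agree because e^(−3/β) / e^(−2/β) = e^(−1/β); the same cancellation works for any s and t.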
The Poisson distribution and the exponential distribution are closely related. Both
are used to model queuing systems. The Poisson distribution models the rate at which
events happen, while the exponential distribution models the time between two
successive events. For example, we may model the rate at which customers arrive at a
service center using a Poisson distribution. In this case, the time between the arrivals
of two customers would follow an exponential distribution. It should be understood that
neither the Poisson process nor the exponential process is to be taken for granted.
What can be taken for granted is that if the arrival rate follows a Poisson distribution
with mean arrival rate λ, the inter-arrival time follows an exponential distribution
with mean inter-arrival time β = 1/λ. Conversely, if the inter-arrival time is
exponentially distributed with mean inter-arrival time β, the arrival rate follows a
Poisson distribution with mean arrival rate λ = 1/β.
Example 6.8
Cars arrive at a toll plaza at a rate of 6.5 cars/min, and the arrivals are assumed to
follow a Poisson distribution. Find the probability that (a) in any given minute 4 to 6
cars will arrive; (b) the gap between two successive arrivals is less than 15 secs.
(a) We have been asked to find p(4 ≤ x ≤ 6) = p(x = 4) + p(x = 5) + p(x = 6)
$$p = \frac{6.5^4 e^{-6.5}}{4!} + \frac{6.5^5 e^{-6.5}}{5!} + \frac{6.5^6 e^{-6.5}}{6!} = 0.4147$$
(b) This is a problem of exponential distribution. Here β = 1/6.5 = 0.1538 min =
9.23 sec. Therefore p(t < 15) = F(15) = $1 - e^{-15.0/9.23}$ = 0.8031
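Example 6.8 uses both sides of the Poisson/exponential relationship, which makes it a convenient case to script. The function name `poisson_pmf` is our own.

```python
import math

lam = 6.5   # mean arrival rate, cars per minute


def poisson_pmf(k, lam):
    """Probability of exactly k arrivals in one unit of time."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


# (a) 4, 5 or 6 cars in a given minute.
p_a = sum(poisson_pmf(k, lam) for k in (4, 5, 6))

# (b) mean gap beta = 1/lam minutes, converted to seconds.
beta_sec = 60.0 / lam
p_b = 1.0 - math.exp(-15.0 / beta_sec)
print(round(p_a, 4), round(beta_sec, 2), round(p_b, 4))
```

Part (a) counts events in a fixed interval (Poisson), while part (b) measures the time between events (exponential) using the same rate.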
Exercises
5.29 Suppose X has an exponential distribution with mean equal to 10. Determine the
following:
(a) p(X > 10)
(b) p(X > 20)
(c) p(X > 30)
(d) Find the value of x such that p(X < x) = 0.95.
5.30 Suppose the counts recorded by a Geiger counter follow a Poisson process with
an average of two counts per minute.
(a) What is the probability that there are no counts in a 30-second interval?
(b) What is the probability that the first count occurs in less than 10 seconds?
(c) What is the probability that the first count occurs between 1 and 2 minutes
after start-up?
5.31 Suppose that the log-ons to a computer network follow a Poisson process with
an average of 3 counts per minute.
(a) What is the mean time between counts?
(b) What is the standard deviation of the time between counts?
(c) Determine x such that the probability that at least one count occurs before
time x minutes is 0.95.
5.32 The time to failure (in hours) for a laser in a cytometry machine is modeled by
an exponential distribution with β = 25,000 hours.
(a) What is the probability that the laser will last at least 20,000 hours?
(b) What is the probability that the laser will last at most 30,000 hours?
(c) What is the probability that the laser will last between 20,000 and 30,000
hours?
5.34 The life of automobile voltage regulators has an exponential distribution with a
mean life of six years. You purchase an automobile that is six years old, with a
working voltage regulator, and plan to own it for six years.
(a) What is the probability that the voltage regulator fails during your
ownership?
(b) If your regulator fails after you own the automobile three years and it is
replaced, what is the mean time until the next failure?
5.35 The time to failure (in hours) of fans in a personal computer can be modeled by
an exponential distribution with β = 3,000 hours.
(a) What proportion of the fans will last at least 10,000 hours?
(b) What proportion of the fans will last at most 7000 hours?
5.36 The time between the arrivals of electronic messages at your computer is
exponentially distributed with a mean of two hours.
(a) What is the probability that you do not receive a message during a two-hour
period?
(b) If you have not had a message in the last four hours, what is the probability
that you do not receive a message in the next two hours?
(c) What is the expected time between your fifth and sixth messages?
(b) Suppose you have already been waiting for one hour for a taxi, what is the
probability that one arrives within the next 10 minutes?
(c) Determine x such that the probability that you wait more than x minutes is
0.10.
(d) Determine x such that the probability that you wait less than x minutes is
0.90.
(e) Determine x such that the probability that you wait less than x minutes is
0.50.
5.40 When a bus service reduces fares, a particular trip from New York City to
Albany, New York, is very popular. A small bus can carry four passengers. The
time between calls for tickets is exponentially distributed with a mean of 30
minutes. Assume that each call orders one ticket. What is the probability that the
bus is filled in less than 3 hours from the time of the fare reduction?
5.41 The time between arrivals of small aircraft at a county airport is exponentially
distributed with a mean of one hour.
(a) What is the probability that more than three aircraft arrive within an hour?
(b) If 30 separate one-hour intervals are chosen, what is the probability that no
interval contains more than three arrivals?
(c) Determine the length of an interval of time (in hours) such that the
probability that no arrivals occur during the interval is 0.10.
5.42 The time between calls to a corporate office is exponentially distributed with a
mean of 10 minutes.
(a) What is the probability that there are more than three calls in one-half hour?
(b) What is the probability that there are no calls within one-half hour?
(c) Determine x such that the probability that there are no calls within x hours is
0.01.
(d) What is the probability that there are no calls within a two-hour interval?
(e) If four non-overlapping one-half hour intervals are selected, what is the
probability that none of these intervals contains any call?
(f) Explain the relationship between the results in part (d) and (e).
5.43 If the random variable X has an exponential distribution with mean θ, determine
the following:
(a) p(X > θ) (b) p(X > 2θ)
(c) p(X > 3θ)
(d) How do the results depend on θ?
5.44 Assume that the flaws along a magnetic tape follow a Poisson distribution with a
mean of 0.2 flaw per meter. Let X denote the distance between two successive
flaws.
(a) What is the mean of X?
(b) What is the probability that there are no flaws in 10 consecutive meters of
tape?
(c) Does your answer to part (b) change if the 10 meters are not consecutive?
(d) How many meters of tape need to be inspected so that the probability that at
least one flaw is found is 90%?
(e) What is the probability that the first time the distance between two flaws
exceeds 8 meters is at the fifth flaw?
(f) What is the mean number of flaws before a distance between two flaws
exceeds 8 meters?
Example 6.5
Evaluate $\int_0^{\infty} z^3 e^{-z}\,dz$.
To evaluate this integral using the techniques of elementary calculus would require
successive applications of integration by parts. The integral can be evaluated very
quickly using the Gamma function.
$$\int_0^{\infty} z^3 e^{-z}\,dz = \int_0^{\infty} z^{4-1} e^{-z}\,dz = \Gamma(4) = 3! = 6$$
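The identification of the integral in Example 6.5 with Γ(4) can be checked against Python's built-in `math.gamma`. The numerical integrator below is a rough sketch of our own: it truncates the infinite upper limit at 60, where the integrand is negligible, and uses the midpoint rule.

```python
import math


def gamma_integral(alpha, upper=60.0, n=60_000):
    """Midpoint-rule estimate of the integral of z**(alpha-1) * exp(-z) over (0, upper)."""
    h = upper / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h
        total += z ** (alpha - 1) * math.exp(-z)
    return total * h


numeric = gamma_integral(4)    # brute-force value of the integral
closed_form = math.gamma(4)    # Gamma(4) = 3! = 6
print(round(numeric, 6), closed_form)
```

For integer α the Gamma function reduces to a factorial, Γ(α) = (α − 1)!, which is why no integration by parts is needed here.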
[Figure 5. Gamma distribution for different values of α (α = 1, 2, 3, 4, 5); f(x) plotted against x.]
[Figure 6. Gamma distribution for different values of β (α = 1 and α = 3, each with β = 1, 2, 3); f(x) plotted against x.]
The function Γx(α) is called the incomplete gamma function, and is defined as
$$\Gamma_x(\alpha) = \int_0^x z^{\alpha - 1} e^{-z}\,dz$$
The shape of the gamma distribution is shown in figure 5 for different values of α,
and in figure 6 for different values of β. With this graphical representation, we are in
a better position to interpret the parameters. The quantity α is called the shape
parameter, which sets the overall shape of the distribution, and the quantity β is
called the scale parameter, which stretches the distribution along the x axis.
We can see that exponential distribution is a special case of Gamma distribution with
α = 1.
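The special case α = 1 can be checked numerically. The sketch below assumes the usual shape/scale form of the gamma density, f(x) = x^(α−1) e^(−x/β) / (Γ(α) β^α) for x > 0; this form is an assumption on our part, chosen to be consistent with β acting as a scale parameter, and the function names are our own.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma density in the shape/scale form (assumed parameterization)."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)


def exponential_pdf(x, beta):
    """Exponential density with mean beta."""
    return math.exp(-x / beta) / beta if x > 0 else 0.0


beta = 5.3   # any positive scale works; 5.3 is just a sample value
points = [0.5, 1.0, 2.5, 7.0]
max_gap = max(abs(gamma_pdf(x, 1, beta) - exponential_pdf(x, beta)) for x in points)
print(max_gap)
```

With α = 1 the factor x^(α−1) equals 1 and Γ(1) = 1, so the two densities coincide term by term.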
Gamma distribution with integer values of α is also called Erlang distribution. Erlang
distribution is used very widely in Teletraffic engineering.
Chi-square distribution
The gamma distribution gives rise to another important family of random variables,
namely, the chi-squared family. This distribution is used extensively in statistics.
Among other things, it provides the basis for making inferences about the variance of
a population based on a sample. We will just state the definition of this distribution
here, and leave the issue of showing its application for later.
Definition. Let x be a gamma random variable with β = 2 and α = γ/2, for γ a positive
integer. Then x is said to have a chi-squared distribution with γ degrees of freedom. We
denote this variable by $x_\gamma^2$.
We will end the discussion of this distribution with the statement that, with this
distribution, we will normally find the probability $p(x_\gamma^2 > \chi_r^2)$.
Normal distribution
The normal distribution is a distribution that underlies many of the statistical methods
used in data analysis. This distribution was first described by de Moivre in 1733 as
being the limiting form of the binomial distribution with the number of trials
approaching infinity. It was described again by Laplace and Gauss about half a century
later as they were trying to model errors in astronomical measurements. This
distribution is often referred to as the ‘Gaussian’ distribution.
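A random variable x with density
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \qquad -\infty < x < \infty$$
is said to have a normal distribution with mean µ and standard deviation σ.
[Figure 7. The normal distribution, centered at the mean µ.]
Lecture Notes on Probability and Statistics, Independent University, Bangladesh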
$$\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}\,dx = 1$$
The moment generating function is
$$m_x(t) = e^{\mu t + \frac{\sigma^2 t^2}{2}}$$
Standard normal
It can be observed that the normal distribution depends upon the mean, µ, and the
standard deviation, σ, of the random variable. There are infinitely many possible
values of both the mean and the standard deviation. This means that it would be
necessary to perform a separate integration for each pair of mean and standard
deviation. To avoid this situation, we perform a transformation on the normal
distribution using
$$z = \frac{x - \mu}{\sigma}$$
This transformation converts every normal distribution into the same normal distribution
with mean 0 and standard deviation 1. The new variable z is called the standard
normal. The standard normal expresses the deviation from the mean in units of the
standard deviation. For example, if a particular value x0 of the random variable x
corresponds to a standard normal z0, this means that the value x0 is z0
standard deviations away from the mean of the random variable x.
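The effect of the transformation can be illustrated in code. In the sketch below the mean, standard deviation and x0 are made-up numbers, `phi` is our own helper that writes the standard normal area in terms of the error function `math.erf`, and `statistics.NormalDist` from the Python standard library serves as an independent check.

```python
import math
from statistics import NormalDist


def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mu, sigma = 10.0, 2.0     # hypothetical mean and standard deviation
x0 = 13.0                 # hypothetical value of the random variable
z0 = (x0 - mu) / sigma    # x0 lies z0 standard deviations above the mean

p_via_z = phi(z0)                          # standardize, then use one curve
p_direct = NormalDist(mu, sigma).cdf(x0)   # same area without standardizing
print(z0, round(p_via_z, 4), round(p_direct, 4))
```

Both routes give the same area, which is why a single table of the standard normal is enough for every choice of µ and σ.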
The area under the standard normal curve is usually calculated numerically, and the
values appear in tabular form. An example is shown in Table 1.
Table 1. Tabulated values of the standard normal, $\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-z^2/2}\,dz$.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-2.9 0.00187 0.00181 0.00176 0.00170 0.00165 0.00159 0.00154 0.00149 0.00145 0.00140
-2.8 0.00256 0.00249 0.00241 0.00234 0.00226 0.00219 0.00213 0.00206 0.00200 0.00193
-2.7 0.00348 0.00337 0.00327 0.00318 0.00308 0.00299 0.00290 0.00281 0.00273 0.00264
-2.6 0.00468 0.00454 0.00441 0.00428 0.00416 0.00404 0.00392 0.00380 0.00369 0.00358
-2.5 0.00623 0.00605 0.00588 0.00572 0.00556 0.00540 0.00525 0.00510 0.00495 0.00481
For the purpose of solving problems, values are read directly from the table.
Example 6.9
The Transient Voltage Suppressor diode 1.5KE6V8A is supposed to have a Reverse
Standoff Voltage (VRWM) of 5.8 V. A random sample of this diode collected from
different manufacturers shows a mean of 5.83 V with a standard deviation of 0.18 V.
If the reverse standoff voltage is known to follow a normal distribution, find the
probability that, for a diode selected at random, VRWM would be found to be between
5.7 V and 5.9 V.
We have been given a problem with mean 5.83 and standard deviation 0.18. We
have been asked to find p(5.7 < x < 5.9). To find the probability, we will convert the
variable x into the standard normal z. In terms of the standard normal, we have to find
$$p\left(\frac{5.7 - 5.83}{0.18} < z < \frac{5.9 - 5.83}{0.18}\right)$$
= p(–0.72 < z < 0.39)
= p(z < 0.39) – p(z < –0.72)
From standard normal table, this can be found to be
= 0.6517 – 0.2358 = 0.4159
Therefore, p(5.7 < x < 5.9) = 0.4159
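The table look-up of Example 6.9 can be reproduced with `statistics.NormalDist` from the Python standard library. The sketch computes the probability twice: once with z rounded to two decimals, as a printed table forces us to do, and once without rounding. Variable names are our own.

```python
from statistics import NormalDist

mu, sigma = 5.83, 0.18
std = NormalDist()   # standard normal: mean 0, standard deviation 1

# Table-style calculation: round z to two decimals before the look-up.
z_lo = round((5.7 - mu) / sigma, 2)
z_hi = round((5.9 - mu) / sigma, 2)
p_table = std.cdf(z_hi) - std.cdf(z_lo)

# Same probability without rounding z.
vrwm = NormalDist(mu, sigma)
p_unrounded = vrwm.cdf(5.9) - vrwm.cdf(5.7)
print(z_lo, z_hi, round(p_table, 4), round(p_unrounded, 4))
```

The two answers differ only in the fourth decimal place, which shows how little is lost by rounding z to the resolution of the table.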
It is extremely instructive to pay attention to the values in the standard normal table.
This table can have many variations — all related to each other. Because the normal
curve is symmetrical, the z values may start from 0.0 instead of some negative value.
In this case, the area would start with 0.5.
Sometimes, it may also be necessary to read the z value from the table.
Example 6.10
In very large classes, the grades of the students are known to follow normal
distribution. In a particular class of more than 500 students, the mean marks is 64
with standard deviation 9.1. If the professor has decided that he will give the top
15% the grade ‘A’, and the bottom 15% the grade ‘F’, what are the cut-off marks for
‘A’ and ‘F’ grades?
Here we are required to find the z values, separately, so that the area on the left and
the area on the right under the curve are each 0.15.
For the area on the left, we look for 0.15 within the body of the table, and we find the
corresponding z to be –1.04. Therefore $\frac{x - \mu}{\sigma} = -1.04$. We have been given µ = 64
and σ = 9.1. Therefore, we obtain x = 64 – 1.04 × 9.1 ≈ 54.5 for the cut-off marks for
the ‘F’ grade.
Similarly, by symmetry, considering $\frac{x - \mu}{\sigma} = 1.04$ leads us to x ≈ 73.5 as the cut-off
marks for the ‘A’ grade.
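Reading z from the body of the table is an inverse look-up, which `statistics.NormalDist.inv_cdf` from the Python standard library performs directly. A sketch for the class of Example 6.10 (mean 64, standard deviation 9.1), with variable names of our own:

```python
from statistics import NormalDist

marks = NormalDist(mu=64, sigma=9.1)

f_cutoff = marks.inv_cdf(0.15)   # 15% of the area lies to the left
a_cutoff = marks.inv_cdf(0.85)   # 15% of the area lies to the right
z_left = NormalDist().inv_cdf(0.15)
print(round(z_left, 2), round(f_cutoff, 1), round(a_cutoff, 1))
```

The two cut-offs are symmetric about the mean, so their average returns 64.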
Example 6.11
For the Transient Voltage Suppressor diode 1.5KE6V8A we saw in example 6.9,
suppose that the Maximum Peak Pulse Current (IPP) ranges from 140 A to 145 A.
Find the probability that for a diode selected at random, IPP would be more than
143 A.
We will assume that the Maximum Peak Pulse Current follows a normal distribution. From
the earlier discussion on the range of a random variable and its relationship
with the mean and standard deviation, we can find
Mean, $\mu = \frac{140 + 145}{2} = 142.5$
Standard deviation, $\sigma = \frac{145 - 140}{6} = 0.8333$
The standard normal corresponding to x = 143 is $z = \frac{143 - 142.5}{0.8333} = 0.6$
From the normal table, p(z > 0.6) = 0.2743
Example 6.12
In Example 4.1 of lecture 4 we saw the maximum and minimum temperature in July
of Dhaka from 1930 till 1990. The minimum maximum-temperature was 29.8°C and
the maximum maximum-temperature was 32.4°C. What percentage of years had
temperature more than 32°C?
We assume the behavior of the maximum temperature over the years will follow a
normal distribution. The mean and the standard deviation can be estimated as
Mean, $\mu = \frac{29.8 + 32.4}{2} = 31.1$
Standard deviation, $\sigma = \frac{32.4 - 29.8}{6} = 0.433$
Therefore, the standard normal for x = 32 is $z = \frac{32 - 31.1}{0.433} = 2.08$
From the standard normal table p(z > 2.08) = 0.0188
Inspecting the table shows us that we have data for 54 years between 1931 and 1990.
Therefore the number of years with temperature more than (or equal to) 32°C
would be approximately 0.0188 × 54 = 1.01, or approximately 1 year.
Conclusion: Our fundamental premise that the behavior of the maximum temperature
over the years will follow a normal distribution is incorrect!
Example 6.13
A study is performed to investigate the effect of stormy weather on the quality of cell
phone audio. A sample of audio collected during stormy weather shows that 45% of
the sample have more noise than the accepted level. In a randomly collected set of 30
samples during stormy weather, what is the probability that more than 10 samples
would have noise more than the accepted level?
In this problem, the probability that a sample collected during stormy weather would
have more noise than the accepted level is p = 0.45. The number of samples is n = 30,
and the value of the random variable is x = 10. We have to find p(x > 10). We will use
the normal approximation to the binomial distribution.
First, though, let us establish that this is, in fact, a problem of binomial
distribution. A randomly collected sample of audio during stormy weather would be
either clean or noisy. Therefore, the quality check is a Bernoulli trial. Hence this is
a problem of binomial distribution.
To use the normal distribution for this problem, the mean is µ = 0.45 × 30 = 13.5 and
the standard deviation is $\sigma = \sqrt{30 \times 0.45 \times (1 - 0.45)} = 2.725$. For the value x = 10, the
standard normal is $z = \frac{x - \mu}{\sigma} = \frac{10 - 13.5}{2.725} = -1.28$. Therefore, p(x > 10) = p(z > –1.28)
= 1 – 0.1003 = 0.8997.
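With only 30 trials the binomial probability of Example 6.13 can also be summed exactly, which gives a feel for the quality of the normal approximation. This comparison is a sketch of our own and goes slightly beyond the worked solution; `math.comb` requires Python 3.8 or later.

```python
import math
from statistics import NormalDist

n, p = 30, 0.45
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# Normal approximation as used in the example: p(x > 10) with x treated as normal.
p_normal = 1.0 - NormalDist(mu, sigma).cdf(10)

# Exact binomial probability of more than 10 noisy samples, i.e. 11 to 30.
p_exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
              for k in range(11, n + 1))
print(round(sigma, 3), round(p_normal, 4), round(p_exact, 4))
```

The exact sum comes out at about 0.865, somewhat below the approximate value, so the approximation should be read as a quick estimate rather than an exact answer.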
Exercises
5.45 Use the standard normal table to determine the following probabilities for the
standard normal random variable Z:
(a) p(Z < 1.32) (b) p(Z < 3.0)
(c) p(Z > 1.45) (d) p(Z > −2.15)
(e) p(−2.34 < Z < 1.76)
5.46 Use the standard normal table to determine the following probabilities for the
standard normal random variable Z:
(a) p(−1 < Z < 1) (b) p(−2 < Z < 2)
(c) p(−3 < Z < 3) (d) p(Z > 3)
(e) p(0 < Z < 1)
5.47 Assume Z has a standard normal distribution. Use the standard normal table to
determine the value for z that solves each of the following:
(a) p(Z < z) = 0.9 (b) p(Z < z) = 0.5
(c) p(Z > z) = 0.1 (d) p(Z > z) = 0.9
(e) p(−1.24 < Z < z) = 0.8
5.48 Assume Z has a standard normal distribution. Use the standard normal table to
determine the value for z that solves each of the following:
(a) p(−z < Z < z) = 0.95 (b) p(−z < Z < z) = 0.99
(c) p(−z < Z < z) = 0.68 (d) p(−z < Z < z) = 0.9973
5.54 The tensile strength of paper is modeled by a normal distribution with a mean of
35 pounds per square inch and a standard deviation of 2 pounds per square inch.
(a) What is the probability that the strength of a sample is less than 40 lb/in²?
(b) If the specifications require the tensile strength to exceed 30 lb/in², what
proportion of the samples is scrapped?
5.56 The fill volume of an automated filling machine used for filling cans of
carbonated beverage is normally distributed with a mean of 12.4 fluid ounces
and a standard deviation of 0.1 fluid ounce.
(a) What is the probability a fill volume is less than 12 fluid ounces?
(b) If all cans less than 12.1 or greater than 12.6 ounces are scrapped, what
proportion of cans is scrapped?
(c) Determine specifications that are symmetric about the mean that include
99% of all cans.
5.57 The time it takes a cell to divide (called mitosis) is normally distributed with an
average time of one hour and a standard deviation of 5 minutes.
(a) What is the probability that a cell divides in less than 45 minutes?
(b) What is the probability that it takes a cell more than 65 minutes to divide?
(c) What is the time that it takes approximately 99% of all cells to complete
mitosis?
5.58 In Exercise 5.56, suppose that the mean of the filling operation can be
adjusted easily, but the standard deviation remains at 0.1 ounce.
(a) At what value should the mean be set so that 99.9% of all cans exceed 12
ounces?
(b) At what value should the mean be set so that 99.9% of all cans exceed 12
ounces if the standard deviation can be reduced to 0.05 fluid ounce?
5.59 The reaction time of a driver to visual stimulus is normally distributed with a
mean of 0.4 seconds and a standard deviation of 0.05 seconds.
(a) What is the probability that a reaction requires more than 0.5 seconds?
(b) What is the probability that a reaction requires between 0.4 and 0.5 seconds?
(c) What is the reaction time that is exceeded 90% of the time?
5.60 The speed of a file transfer from a server on campus to a personal computer at a
student’s home on a weekday evening is normally distributed with a mean of 60
kilobits per second and a standard deviation of 4 kilobits per second.
(a) What is the probability that the file will transfer at a speed of 70 kilobits per
second or more?
(b) What is the probability that the file will transfer at a speed of less than 58
kilobits per second?
(c) If the file is 1 megabyte, what is the average time it will take to transfer the
file? (Assume eight bits per byte.)
5.61 The length of an injection-molded plastic case that holds magnetic tape is
normally distributed with a mean length of 90.2 millimeters and a standard
deviation of 0.1 millimeter.
(a) What is the probability that a part is longer than 90.3 millimeters or shorter
than 89.7 millimeters?
(b) What should the process mean be set at to obtain the greatest number of
parts between 89.7 and 90.3 millimeters?
(c) If parts that are not between 89.7 and 90.3 millimeters are scrapped, what is
the yield for the process mean that you selected in part (b)?
5.62 In the previous exercise assume that the process is centered so that the mean is
90 millimeters and the standard deviation is 0.1 millimeter. Suppose that 10
cases are measured, and they are assumed to be independent.
(a) What is the probability that all 10 cases are between 89.7 and 90.3
millimeters?
(b) What is the expected number of the 10 cases that are between 89.7 and 90.3
millimeters?
(c) If three lasers are used in a product and they are assumed to fail
independently, what is the probability that all three are still operating after
7000 hours?
5.65 The diameter of the dot produced by a printer is normally distributed with a
mean diameter of 0.002 inch and a standard deviation of 0.0004 inch.
(a) What is the probability that the diameter of a dot exceeds 0.0026 inch?
(b) What is the probability that a diameter is between 0.0014 and 0.0026 inch?
(c) What standard deviation of diameters is needed so that the probability in part
(b) is 0.995?
5.66 The weight of a sophisticated running shoe is normally distributed with a mean
of 12 ounces and a standard deviation of 0.5 ounce.
(a) What is the probability that a shoe weighs more than 13 ounces?
(b) What must the standard deviation of weight be in order for the company to
state that 99.9% of its shoes are less than 13 ounces?
(c) If the standard deviation remains at 0.5 ounce, what must the mean weight
be in order for the company to state that 99.9% of its shoes are less than 13
ounces?
5.67 Suppose that X is a binomial random variable with n = 200 and p = 0.4.
(a) Approximate the probability that X is less than or equal to 70.
(b) Approximate the probability that X is greater than 70 and less than 90.
5.68 Suppose that X is a binomial random variable with n = 100 and p = 0.1.
(a) Compute the exact probability that X is less than 4.
(b) Approximate the probability that X is less than 4 and compare to the result in
part (a).
(c) Approximate the probability that 8 < X < 12.
5.71 An electronic office product contains 5000 electronic components. Assume that
the probability that each component operates without failure during the useful
life of the product is 0.999, and assume that the components fail independently.
Approximate the probability that 10 or more of the original 5000 components
fail during the useful life of the product.
5.73 A corporate Web site contains errors on 50 of 1000 pages. If 100 pages are
sampled randomly, without replacement, approximate the probability that at
least 1 of the pages in error is in the sample.
5.74 Hits to a high-volume Web site are assumed to follow a Poisson distribution
with a mean of 10,000 per day. Approximate each of the following:
(a) The probability of more than 20,000 hits in a day
(b) The probability of less than 9900 hits in a day
(c) The value such that the probability that the number of hits in a day exceeds
the value is 0.01
(d) Approximate the expected number of days in a year (365 days) that exceed
10,200 hits.
(e) Approximate the probability that over a year (365 days) more than 15 days
each have more than 10,200 hits.
5.75 The percentage of people exposed to a bacterium who become ill is 20%. Assume
that people are independent. Assume that 1000 people are exposed to the
bacterium. Approximate each of the following.
(a) The probability that more than 225 become ill
(b) The probability that between 175 and 225 become ill
(c) The value such that the probability that the number of people that become ill
exceeds the value is 0.01
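Answers obtained from the standard normal table in the exercises above can be cross-checked numerically. The following is a minimal sketch, assuming Python 3.8 or later, where `statistics.NormalDist` is available. The helper names (`prob_below`, `prob_above`, `prob_between`, `z_for_left_area`, `standardize`) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def prob_below(z):
    """p(Z < z): the left-tail area that a standard normal table lists."""
    return Z.cdf(z)

def prob_above(z):
    """p(Z > z) = 1 - p(Z < z)."""
    return 1 - Z.cdf(z)

def prob_between(a, b):
    """p(a < Z < b) = p(Z < b) - p(Z < a)."""
    return Z.cdf(b) - Z.cdf(a)

def z_for_left_area(area):
    """Inverse lookup: the z for which p(Z < z) equals the given area."""
    return Z.inv_cdf(area)

def standardize(x, mu, sigma):
    """Convert a value x of a normal variable to a standard normal z."""
    return (x - mu) / sigma

# Check against the table value used in Example 6.13: p(Z < -1.28).
print(round(prob_below(-1.28), 4))   # 0.1003
```

For a normal variable that is not standard, `standardize` converts the value first, and the result is then passed to one of the probability helpers, mirroring the table-based procedure.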
Other distributions
Exercises
5.77 Suppose that f(x) = 0.5x – 1 for 2 < x < 4. Determine the following:
(a) p(X < 2.5)
(b) p(X > 3)
(c) p(2.5 < X < 3.5)
(d) Determine the cumulative distribution function of the random variable.
5.78 The time between calls is exponentially distributed with a mean time between
calls of 10 minutes.
(a) What is the probability that the time until the first call is less than 5 minutes?
(b) What is the probability that the time until the first call is between 5 and 15
minutes?
(c) Determine the length of an interval of time such that the probability of at
least one call in the interval is 0.90.
(d) If there has not been a call in 10 minutes, what is the probability that the
time until the next call is less than 5 minutes?
(e) What is the probability that there are no calls in the intervals from 10:00 to
10:05, from 11:30 to 11:35, and from 2:00 to 2:05?
(f) What is the probability that the time until the third call is greater than 30
minutes?
(g) What is the mean time until the fifth call?
5.79 The CPU of a personal computer has a lifetime that is exponentially distributed
with a mean lifetime of six years. You have owned this CPU for three years.
What is the probability that the CPU fails in the next three years? Assume that
your corporation has owned 10 CPUs for three years, and assume that the CPUs
fail independently. What is the probability that at least one fails within the next
three years?
5.80 Asbestos fibers in a dust sample are identified by an electron microscope after
sample preparation. Suppose that the number of fibers is a Poisson random
variable and the mean number of fibers per square centimeter of surface dust is
100. A sample of 800 square centimeters of dust is analyzed. Assume a
particular grid cell under the microscope represents 1/160,000 of the sample.
(a) What is the probability that at least one fiber is visible in the grid cell?
(b) What is the mean of the number of grid cells that need to be viewed to
observe 10 that contain fibers?
(c) What is the standard deviation of the number of grid cells that need to be
viewed to observe 10 that contain fibers?
5.81 Without an automated irrigation system, the height of plants two weeks after
germination is normally distributed with a mean of 2.5 centimeters and a
standard deviation of 0.5 centimeters.
(a) What is the probability that a plant’s height is greater than 2.25 centimeters?
(b) What is the probability that a plant’s height is between 2.0 and 3.0
centimeters?
(c) What height is exceeded by 90% of the plants?
(d) With an automated irrigation system, a plant grows to a height of 3.5
centimeters two weeks after germination. What is the probability of
obtaining a plant of this height or greater?
5.82 The thickness of a laminated covering for a wood surface is normally distributed
with a mean of 5 millimeters and a standard deviation of 0.2 millimeter.
(a) What is the probability that a covering thickness is greater than 5.5
millimeters?
(b) If the specifications require the thickness to be between 4.5 and 5.5
millimeters, what proportion of coverings do not meet specifications?
(c) The covering thickness of 95% of samples is below what value?
5.83 The diameter of the dot produced by a printer is normally distributed with a
mean diameter of 0.002 inch.
(a) Suppose that the specifications require the dot diameter to be between
0.0014 and 0.0026 inch. If the probability that a dot meets specifications is
to be 0.9973, what standard deviation is needed?
(b) Assume that the standard deviation of the size of a dot is 0.0004 inch. If the
probability that a dot meets specifications is to be 0.9973, what
specifications are needed? Assume that the specifications are to be chosen
symmetrically around the mean of 0.002 inch.
5.86 An airline makes 200 reservations for a flight that holds 185 passengers. The
probability that a passenger arrives for the flight is 0.9 and the passengers are
assumed to be independent.
(a) Approximate the probability that all the passengers that arrive can be seated.
(b) Approximate the probability that there are empty seats.
(c) Approximate the number of reservations that the airline should make so that
the probability that everyone who arrives can be seated is 0.95. [Hint:
Successively try values for the number of reservations.]
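Exercises 5.78 and 5.79 ask for probabilities of an exponentially distributed time, including probabilities conditional on the time already elapsed. The following is a minimal sketch of how such a calculation could be set up. The mean of 4 minutes and the times `s` and `t` are hypothetical values chosen for illustration, not taken from any exercise, and `expo_cdf` is an illustrative helper name.

```python
from math import exp

def expo_cdf(t, mean):
    """p(T < t) for an exponential variable with the given mean (rate = 1/mean)."""
    if t <= 0:
        return 0.0
    return 1 - exp(-t / mean)

mean = 4.0        # hypothetical mean waiting time, in minutes
s, t = 3.0, 2.0   # hypothetical elapsed time s and additional time t

# Unconditional probability that the wait is shorter than t.
p_uncond = expo_cdf(t, mean)

# Conditional probability that the wait ends within the next t minutes,
# given that s minutes have already passed without the event:
# p(T < s + t | T > s) = (F(s + t) - F(s)) / (1 - F(s)).
p_cond = (expo_cdf(s + t, mean) - expo_cdf(s, mean)) / (1 - expo_cdf(s, mean))

print(f"p(T < t)             = {p_uncond:.4f}")
print(f"p(T < s + t | T > s) = {p_cond:.4f}")
```

The two printed values agree, which illustrates the lack-of-memory property of the exponential distribution: the time already waited does not change the probability for the remaining wait.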