Random Variables and Distributions

Uploaded by

killian.larde

Ch. 4 – Random Variables
Content
4.1 Random Variables
4.2 Discrete Random Variables
4.3 Expected Value
4.4 Expectation of a Function of a Random Variable
4.5 Variance
4.6 The Bernoulli and Binomial Random Variables
4.7 The Poisson Random Variable
4.8 Other Discrete Probability Distributions
4.9 Expected Value of Sums of Random Variables
4.10 Properties of the Cumulative Distribution function

Random Variables
Introduction
Random variable: “A (mathematical) function mapping events in a sample space S onto the
real numbers.”
Random variables are denoted by capital letters (often X, Y and Z). Therefore by definition:
𝑋: 𝑆 → ℝ
Particular values of random variables are denoted by lowercase letters (e.g. X = x)
We are often interested in some function of the outcome rather than in the outcome itself.
For example, we may be interested in the total number of heads appearing in 10 throws of a
coin, not in the actual order of the heads-and-tails sequence. Such quantities, which are
real-valued functions defined on S, are known as random variables.
The value of a random variable is determined by the outcome of the experiment and as a
result we can assign probabilities to the possible values of the random variable.
Types of Random Variables
The distinction between the 2 types is based on the range of values that they can assume.

Discrete Random Variable


• X values form a finite or countably infinite set of real numbers.
o i.e. countable number of distinct values.
• We calculate probabilities through summation.

Continuous Random Variable


• X values form a continuous interval of real numbers.
o i.e. uncountable values within an interval.
• We calculate probabilities through integration.

Examples of Random Variables


When a numerical value is assigned to an outcome, a random variable (X) is defined.

Discrete Random Variables


1. Roll a 6-sided die
𝑆 = {1, 2, 3, 4, 5, 6}
Let X = number rolled, then
𝑿 ∈ {𝟏, 𝟐, 𝟑, 𝟒, 𝟓, 𝟔}

2. Two items are selected from a production line and inspected to determine whether they are
defective or not. D = defective and N = non-defective.
𝑆 = {(𝐷, 𝐷), (𝐷, 𝑁), (𝑁, 𝐷), (𝑁, 𝑁)}
Let X = number of defective items, then
𝑿 ∈ {𝟎, 𝟏, 𝟐}

Continuous Random Variables


3. Let X = the time (minutes) taken by a student to complete a 60-minute exam. Then:
0 ≤ 𝑋 ≤ 60

Note:
• The probability distribution of a discrete random variable consists of the values which it
can assume, together with the probability of each of those values occurring.
Introductory Example
Suppose we have an unfair, biased coin, for which P(Heads) = 0.4, and therefore P(Tails) =
0.6. Our experiment consists of tossing this coin twice, and we will define the random
variable X = the number of heads obtained.
The sample space for this experiment is: (if we write H for Heads and T for Tails).
𝑺 = {𝑯𝑯; 𝑯𝑻; 𝑻𝑯; 𝑻𝑻}
The possible values that X can take are therefore 0, 1, or 2.
Outcome        HH                 HT                 TH                 TT
Nr of Heads    2                  1                  1                  0
Probability    0.4 × 0.4 = 0.16   0.4 × 0.6 = 0.24   0.6 × 0.4 = 0.24   0.6 × 0.6 = 0.36

The probability distribution of X can be written down as:


P(X = 0) = 0.36
P(X = 1) = 0.24 + 0.24 = 0.48
P(X = 2) = 0.16
The probabilities in the distribution must sum to 1: 0.36 + 0.48 + 0.16 = 1.
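The table above can be reproduced by enumerating the sample space and adding up the probabilities of the outcomes that give the same number of heads. The following is a minimal Python sketch, assuming independent tosses; the function name heads_pmf is illustrative and not part of the notes.

```python
from collections import defaultdict
from itertools import product

P_HEADS = 0.4  # biased coin from the introductory example

def heads_pmf(n_tosses, p_heads):
    """Return {x: P(X = x)} for X = number of heads in n independent tosses."""
    pmf = defaultdict(float)
    for outcome in product("HT", repeat=n_tosses):
        prob = 1.0
        for face in outcome:
            prob *= p_heads if face == "H" else 1 - p_heads
        # Outcomes with the same number of heads map to the same value of X.
        pmf[outcome.count("H")] += prob
    return dict(pmf)

pmf = heads_pmf(2, P_HEADS)
for x in sorted(pmf):
    print(x, round(pmf[x], 2))
```

The same function handles any number of tosses, although the enumeration grows as 2^n.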
Textbook Examples
Example 4.1a
Suppose that our experiment consists of tossing 3 fair coins. If we let X denote the number of
heads that appear, then what values can the random variable X take on?
Solution:
Let X = random variable that denotes the number of heads appearing
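X takes on one of the values 0, 1, 2, 3 with respective probabilities:
\[ P(X = 0) = P\{(T,T,T)\} = \tfrac{1}{2} \times \tfrac{1}{2} \times \tfrac{1}{2} = \tfrac{1}{8} \]
\[ P(X = 1) = P\{(H,T,T), (T,H,T), (T,T,H)\} = \left(\tfrac{1}{2}\right)^3 + \left(\tfrac{1}{2}\right)^3 + \left(\tfrac{1}{2}\right)^3 = \tfrac{3}{8} \]
\[ P(X = 2) = P\{(H,H,T), (T,H,H), (H,T,H)\} = \left(\tfrac{1}{2}\right)^3 + \left(\tfrac{1}{2}\right)^3 + \left(\tfrac{1}{2}\right)^3 = \tfrac{3}{8} \]
\[ P(X = 3) = P\{(H,H,H)\} = \tfrac{1}{2} \times \tfrac{1}{2} \times \tfrac{1}{2} = \tfrac{1}{8} \]
Since X must take on one of the values 0 through 3, we must have:
\[ P\left(\bigcup_{i=0}^{3} \{X = i\}\right) = \sum_{i=0}^{3} P\{X = i\} = 1 \]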
Example 4.1a extended
a) What is P(0<X≤2)?

Solution:
\[ P(0 < X \le 2) = P(X = 1) + P(X = 2) = \tfrac{3}{8} + \tfrac{3}{8} = \tfrac{3}{4} \]

b) Create a general formula for the probability distribution:

Solution:
\[ P(X = x) = \binom{3}{x} \left(\tfrac{1}{2}\right)^{3}, \quad x = 0, 1, 2, 3 \]

c) Represent the distribution graphically

Solution:

Example 4.1c
Four balls are to be randomly selected, without replacement, from an urn that contains 20
balls numbered 1 through 20. If X is the largest numbered ball selected, then X is a random
variable. Which values can X take on, and with what probabilities?
Solution:
X takes on one of the values 4, 5, …, 20. Because each of the \binom{20}{4} possible
selections of 4 of the 20 balls is equally likely, the probability that X takes on each of
its possible values is:
\[ P(X = i) = \frac{\binom{i-1}{3}}{\binom{20}{4}}, \quad i = 4, 5, \ldots, 20 \]
This is so because the number of selections that result in X = i is the number of selections
that result in the ball numbered i and three of the balls numbered 1 through i – 1 being
selected. As there are \binom{1}{1}\binom{i-1}{3} such selections, the preceding equation
follows.
In this equation i is the largest numbered ball chosen, with 3 balls chosen that are numbered
less than i and no balls with a number larger than i were selected.

Another important thing to note is that the probabilities of all possible values of X sum to 1:
\[ \sum_{i=4}^{20} P(X = i) = 1 \]

Suppose now that we want to determine P(X > 10). One way, of course, is to just use the
preceding to obtain:
\[ P(X > 10) = P(X = 11) + P(X = 12) + \cdots + P(X = 20) = \sum_{i=11}^{20} P(X = i) \]
\[ \sum_{i=11}^{20} P(X = i) = \sum_{i=11}^{20} \frac{\binom{i-1}{3}}{\binom{20}{4}} = \frac{\binom{10}{3} + \binom{11}{3} + \cdots + \binom{19}{3}}{\binom{20}{4}} \]
However, a more direct approach for determining P(X > 10) would be to use:
\[ P(X > 10) = 1 - P(X \le 10) = 1 - \sum_{i=4}^{10} \frac{\binom{i-1}{3}}{\binom{20}{4}} = 1 - \frac{\binom{10}{4}}{\binom{20}{4}} \]
The last step follows by repeatedly applying the identity
\[ \binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r} \]
It can also be seen directly, because X will be less than or equal to 10 exactly when the 4
balls chosen are among balls numbered 1 through 10.
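Both routes to P(X > 10) can be checked with exact arithmetic. The sketch below is a minimal Python illustration using math.comb and fractions.Fraction; the helper name p_largest is an assumption made for this example.

```python
from fractions import Fraction
from math import comb

N_BALLS, N_DRAWN = 20, 4

def p_largest(i):
    """P(X = i): ball i is drawn together with 3 of the i - 1 smaller balls."""
    return Fraction(comb(i - 1, N_DRAWN - 1), comb(N_BALLS, N_DRAWN))

# The pmf over i = 4, ..., 20 should sum to 1.
total = sum(p_largest(i) for i in range(N_DRAWN, N_BALLS + 1))

# P(X > 10) by direct summation and by the complement rule.
tail_by_sum = sum(p_largest(i) for i in range(11, N_BALLS + 1))
tail_by_complement = 1 - Fraction(comb(10, N_DRAWN), comb(N_BALLS, N_DRAWN))

print(total, tail_by_sum, tail_by_complement)
```

Using Fraction avoids rounding, so the two tail computations can be compared for exact equality.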
Example 4.1d
Independent trials consisting of the flipping of a coin having probability p of coming up heads
are continually performed until either a head occurs or a total of n flips is made. If we let X
denote the number of times the coin is flipped, then X is a random variable taking on one of
the values 1, 2, 3, …, n with respective probabilities.
Solution:
\[ P(X = 1) = P\{(H)\} = p \]
\[ P(X = 2) = P\{(T,H)\} = (1-p)p \]
\[ P(X = 3) = P\{(T,T,H)\} = (1-p)^2 p \]
\[ \vdots \]
\[ P(X = n-1) = P\{(T,T,\ldots,T,H)\} = (1-p)^{n-2} p \]
\[ P(X = n) = P\{(T,T,\ldots,T,T), (T,T,\ldots,T,H)\} = (1-p)^{n-1} \]
Note that:
\[
\begin{aligned}
P\left(\bigcup_{i=1}^{n} \{X = i\}\right) &= \sum_{i=1}^{n} P\{X = i\} \\
&= \sum_{i=1}^{n-1} p(1-p)^{i-1} + (1-p)^{n-1} \\
&= p\left[\frac{1 - (1-p)^{n-1}}{1 - (1-p)}\right] + (1-p)^{n-1} \\
&= 1 - (1-p)^{n-1} + (1-p)^{n-1} \\
&= 1
\end{aligned}
\]

Cumulative Distribution Function of a Random Variable


Definition
The cumulative distribution function (cdf) is defined by:
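\[ F(b) = P\{X \le b\}, \quad \text{for all real values of } b \]

Important Properties of F(.):

1) F is a non-decreasing function; that is, if a < b, then F(a) ≤ F(b)
2) \lim_{b \to \infty} F(b) = 1
3) \lim_{b \to -\infty} F(b) = 0
4) F is continuous from the right at every b; that is, for any decreasing sequence b_n, n ≥ 1,
that converges to b, \lim_{n \to \infty} F(b_n) = F(b).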
Proofs
Property 1
Because for a < b, the event {X ≤ a} is contained in the event {X ≤ b} and so cannot have a
larger probability. Thus, P(X ≤ a) ≤ P(X ≤ b), therefore: F(a) ≤ F(b)
Property 2
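If b_n increases to infinity, then the events {X ≤ b_n}, n ≥ 1, are increasing events whose
union is the event {X < ∞}. By the continuity property of probabilities:
\[ \lim_{n \to \infty} P\{X \le b_n\} = P\{X < \infty\} = 1 \]
Property 3 follows in the same way, using a sequence b_n that decreases to −∞.
Property 4
If b_n decreases to b, then the events {X ≤ b_n}, n ≥ 1, are decreasing events whose
intersection is the event {X ≤ b}. The continuity property then gives:
\[ \lim_{n \to \infty} P\{X \le b_n\} = P\{X \le b\} \]
which verifies property 4.
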


All probability questions about X can be answered in terms of the c.d.f., F.
𝑃{𝑎 < 𝑋 ≤ 𝑏} = 𝐹(𝑏) − 𝐹(𝑎), 𝑓𝑜𝑟 𝑎𝑙𝑙 𝑎 < 𝑏
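This follows by writing {X ≤ b} as the union of the mutually exclusive events {X ≤ a} and
{a < X ≤ b}:
\[ \{X \le b\} = \{X \le a\} \cup \{a < X \le b\} \]
Therefore:
\[ P\{X \le b\} = P\{X \le a\} + P\{a < X \le b\} \]
To compute the probability that X is strictly less than b, we again use the continuity property:
\[
\begin{aligned}
P\{X < b\} &= P\left(\lim_{n \to \infty} \left\{X \le b - \tfrac{1}{n}\right\}\right) \\
&= \lim_{n \to \infty} P\left(X \le b - \tfrac{1}{n}\right) \\
&= \lim_{n \to \infty} F\left(b - \tfrac{1}{n}\right)
\end{aligned}
\]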
Summary
𝑃{𝑎 < 𝑋 ≤ 𝑏} = 𝐹(𝑏) − 𝐹(𝑎), 𝑓𝑜𝑟 𝑎𝑙𝑙 𝑎 < 𝑏
𝑃{𝑋 > 𝑏} = 1 − 𝐹(𝑏)
𝑃{𝑋 < 𝑎} = 𝐹(𝑎) − 𝑃{𝑋 = 𝑎}
𝑃{𝑋 ≥ 𝑏} = 1 − 𝐹(𝑏) + 𝑃{𝑋 = 𝑏}
Textbook Example
Example 4.10a
The distribution function of the random variable is given by:

Compute:
a) P{X < 3}

Solution:
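\[ P\{X < 3\} = \lim_{n \to \infty} P\left\{X \le 3 - \tfrac{1}{n}\right\} = \lim_{n \to \infty} F\left(3 - \tfrac{1}{n}\right) = \tfrac{11}{12} \]
b) P{X = 1}
Solution:
\[
\begin{aligned}
P\{X = 1\} &= P\{X \le 1\} - P\{X < 1\} \\
&= F(1) - \lim_{n \to \infty} F\left(1 - \tfrac{1}{n}\right) \\
&= \tfrac{2}{3} - \tfrac{1}{2} = \tfrac{1}{6}
\end{aligned}
\]
c) P{X > 0.5}
Solution:
\[ P\{X > 0.5\} = 1 - P\{X \le 0.5\} = 1 - F(0.5) = \tfrac{3}{4} \]
d) P{2 < X ≤ 4}
Solution:
\[ P\{2 < X \le 4\} = F(4) - F(2) = 1 - \tfrac{11}{12} = \tfrac{1}{12} \]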
Discrete Random Variables
Probability Mass Function of a Random Variable
Definition
The probability mass function (pmf) of a discrete random variable X is defined as:
𝑝(𝑎) = 𝑃{𝑋 = 𝑎}, 𝑓𝑜𝑟 𝑒𝑣𝑒𝑟𝑦 𝑣𝑎𝑙𝑢𝑒 𝒂

Important Properties of p(x):


Note that p(x) ≥ 0 for all values of x. If {x1, x2,…} represents the values which X assumes:
• p(x_i) ≥ 0 for all x_i, where i = 1, 2, …
• p(x) = 0 for any other value x

We also have:
\[ \sum_{i=1}^{\infty} p(x_i) = 1 \]

This can be represented graphically as follows:

Now consider the probability mass function of the random variable representing the sum
when two dice are rolled:
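The pmf of the sum of two dice can be tabulated by enumerating the 36 equally likely ordered outcomes. The following is a minimal Python sketch, assuming two fair, independent six-sided dice; the name dice_sum_pmf is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Each of the 36 ordered outcomes of two fair dice has probability 1/36.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
dice_sum_pmf = {s: Fraction(n, 36) for s, n in sorted(counts.items())}

for s, prob in dice_sum_pmf.items():
    print(s, prob)
```

The resulting probabilities rise from the value 2 up to the value 7 and then fall symmetrically to the value 12.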
Textbook Examples
Example 4.2a
The probability mass function of a random variable X is given by
\[ p(i) = \frac{c\lambda^{i}}{i!}, \quad i = 0, 1, 2, \ldots \]
where λ is some positive value. Find:
a) P{X = 0}

Solution:
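Since \sum_{i=0}^{\infty} p(i) = 1, we have:
\[ c \sum_{i=0}^{\infty} \frac{\lambda^{i}}{i!} = 1 \]
which, because e^{x} = \sum_{i=0}^{\infty} x^{i}/i!, implies that
\[ c e^{\lambda} = 1 \quad \text{or} \quad c = e^{-\lambda} \]
Hence:
\[ P\{X = 0\} = \frac{e^{-\lambda}\lambda^{0}}{0!} = e^{-\lambda} \]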
b) P{X > 2}
Solution:
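\[
\begin{aligned}
P\{X > 2\} &= 1 - P\{X \le 2\} \\
&= 1 - P\{X = 0\} - P\{X = 1\} - P\{X = 2\} \\
&= 1 - e^{-\lambda} - \lambda e^{-\lambda} - \frac{\lambda^{2} e^{-\lambda}}{2}
\end{aligned}
\]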
The cumulative distribution function of a discrete random variable X can also be expressed
in terms of the probability mass function:

\[ F(a) = \sum_{x \le a} p(x), \quad \text{for all values of } a \]
Additional Examples
Example 4.1
The probability mass function of a discrete random variable X is given by:
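\[ p(x) = c\left(\tfrac{1}{3}\right)^{x-1}, \quad x = 1, 2, \ldots \]
Determine:
a) The value of the constant c

Solution:
Since \sum_{x=1}^{\infty} p(x) = 1, we have:
\[
\begin{aligned}
c \sum_{x=1}^{\infty} \left(\tfrac{1}{3}\right)^{x-1} &= 1 \\
\Rightarrow c\left[1 + \tfrac{1}{3} + \tfrac{1}{9} + \cdots\right] &= 1 \\
\Rightarrow c\left[\frac{1}{1 - \tfrac{1}{3}}\right] &= 1 \\
\Rightarrow c &= \tfrac{2}{3}
\end{aligned}
\]
b) P{X = 1}

Solution:
\[ P(X = 1) = p(1) = \tfrac{2}{3}\left(\tfrac{1}{3}\right)^{1-1} = \tfrac{2}{3} \]
c) P{X > 2}
Solution:
\[
\begin{aligned}
P(X > 2) &= p(3) + p(4) + \cdots \\
&= 1 - P(X \le 2) \\
&= 1 - p(1) - p(2) \\
&= 1 - \tfrac{2}{3} - \tfrac{2}{9} \\
&= \tfrac{1}{9}
\end{aligned}
\]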
Example 4.2
The cumulative distribution function of a discrete random variable X is given by:
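\[
F(x) = \begin{cases}
0, & x < 0 \\
\tfrac{1}{4}, & 0 \le x < 2 \\
\tfrac{2}{3}, & 2 \le x < 3 \\
1, & x \ge 3
\end{cases}
\]
Determine:
a) The pmf of X

Solution:
The pmf is given by the sizes of the jumps of F:
\[ p(0) = \tfrac{1}{4} \]
\[ p(2) = \tfrac{2}{3} - \tfrac{1}{4} = \tfrac{5}{12} \]
\[ p(3) = 1 - \tfrac{2}{3} = \tfrac{1}{3} \]
Therefore:
\[
p(x) = \begin{cases}
\tfrac{1}{4}, & x = 0 \\
\tfrac{5}{12}, & x = 2 \\
\tfrac{1}{3}, & x = 3 \\
0, & \text{otherwise}
\end{cases}
\]
b) P{0 < X ≤ 2}
Solution:
\[ P(0 < X \le 2) = F(2) - F(0) = \tfrac{2}{3} - \tfrac{1}{4} = \tfrac{5}{12} \]
c) P{2 < X ≤ 3}
Solution:
\[ P(2 < X \le 3) = F(3) - F(2) = 1 - \tfrac{2}{3} = \tfrac{1}{3} \]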
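The step from a cdf to a pmf in this example amounts to reading off the jump sizes of F. The following is a minimal Python sketch of that idea, assuming the cdf is supplied as a list of (jump point, value of F from that point on) pairs; the names cdf_steps, cdf and pmf_from_cdf are illustrative.

```python
from fractions import Fraction

# Jump points of the step cdf and the value F takes from each point onward.
cdf_steps = [(0, Fraction(1, 4)), (2, Fraction(2, 3)), (3, Fraction(1))]

def cdf(x):
    """F(x) for the step cdf above (0 to the left of the first jump)."""
    value = Fraction(0)
    for point, level in cdf_steps:
        if x >= point:
            value = level
    return value

def pmf_from_cdf(steps):
    """p(x) at each jump point equals the size of the jump of F there."""
    pmf, previous = {}, Fraction(0)
    for point, level in steps:
        pmf[point] = level - previous
        previous = level
    return pmf

pmf = pmf_from_cdf(cdf_steps)
print(pmf)
print(cdf(2) - cdf(0))  # P(0 < X <= 2)
print(cdf(3) - cdf(2))  # P(2 < X <= 3)
```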
Expected Value
The expectation of a random variable and its variance allow us to describe a random variable
in terms of its location and its spread.
The expected value of a random variable is a weighted average of its possible values; it can
also be motivated by the relative frequency interpretation of probabilities.
The variance is a measure of how widely the values are dispersed around this central value.

Expected Value
Definition
The expected value of a discrete random variable X with pmf p(x) is defined as:
\[ E(X) = \sum_{x:\, p(x) > 0} x\, p(x) \]

In words, the expected value of X is a weighted average of the possible values that X can
take on, each value being weighted by the probability that X assumes it.
Example 1
Pmf of X is given by:
\[ p(0) = \tfrac{1}{2} = p(1) \]
Then
\[ E[X] = 0\left(\tfrac{1}{2}\right) + 1\left(\tfrac{1}{2}\right) = \tfrac{1}{2} \]
is just the ordinary average of the 2 possible values, 0 and 1, that X can assume.
Example 2
Pmf of X is given by:
\[ p(0) = \tfrac{1}{3}, \quad p(1) = \tfrac{2}{3} \]
Then
\[ E[X] = 0\left(\tfrac{1}{3}\right) + 1\left(\tfrac{2}{3}\right) = \tfrac{2}{3} \]
is a weighted average of the 2 possible values, 0 and 1, where the value 1 is given twice as
much weight as the value 0, since p(1) = 2p(0).
Example 3
Data from example 4.1a
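x    P(X = x)
0    1/8
1    3/8
2    3/8
3    1/8

The expected value is thus:
\[
\begin{aligned}
E(X) &= \sum_{x:\, p(x) > 0} x\, p(x) \\
&= 0\left(\tfrac{1}{8}\right) + 1\left(\tfrac{3}{8}\right) + 2\left(\tfrac{3}{8}\right) + 3\left(\tfrac{1}{8}\right) \\
&= \tfrac{3}{2} = 1.5
\end{aligned}
\]
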

The frequency interpretation of probabilities offers an alternative way of looking at the
expected value. This interpretation assumes that if an infinite sequence of independent
replications of an experiment is performed, then, for any event E, the proportion of time that
E occurs will be P(E). Consider a random variable X that must take on one of the values x1,
x2, …, xn with respective probabilities p(x1), p(x2), …, p(xn), and think of X as representing
our winnings in a single game of chance. That is, with probability p(xi), we shall win xi
units, i = 1, 2, …, n. By the frequency interpretation, if we play this game continually, then
the proportion of time that we win xi will be p(xi). Since this is true for all i, it follows
that our average winnings per game will be:
\[ \sum_{i=1}^{n} x_i\, p(x_i) = E[X] \]
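The frequency interpretation can be illustrated by simulating many plays of a game and comparing the average winnings with E[X]. The sketch below is a minimal Python example that reuses the pmf of Example 4.1a as the "game"; the seed and the number of games are arbitrary choices made only for this illustration.

```python
import random

values = [0, 1, 2, 3]                 # heads in 3 fair coin tosses (Example 4.1a)
probs = [1 / 8, 3 / 8, 3 / 8, 1 / 8]

# Exact expected value from the definition.
expected = sum(x * p for x, p in zip(values, probs))

rng = random.Random(0)                # fixed seed so the run is repeatable
n_games = 100_000
winnings = rng.choices(values, weights=probs, k=n_games)
average = sum(winnings) / n_games     # long-run average winnings per game

print(expected, average)
```

With this many games the simulated average should land close to the exact value, although any finite run will differ slightly.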
Textbook examples
Example 4.3a
Find E(X), where X is the outcome when we roll a fair die.
Solution:
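Since p(1) = p(2) = p(3) = p(4) = p(5) = p(6) = 1/6, we obtain:
\[ E(X) = 1\left(\tfrac{1}{6}\right) + 2\left(\tfrac{1}{6}\right) + 3\left(\tfrac{1}{6}\right) + 4\left(\tfrac{1}{6}\right) + 5\left(\tfrac{1}{6}\right) + 6\left(\tfrac{1}{6}\right) = \tfrac{7}{2} \]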
Example 4.3b
We say that I is an indicator variable for the event A if:
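\[
I = \begin{cases}
1, & \text{if } A \text{ occurs} \\
0, & \text{if } A^{c} \text{ occurs}
\end{cases}
\]
Find E(I).
Solution:
Since p(1) = P(A), p(0) = 1 − P(A), we have:
\[ E(I) = 1 \cdot P(A) + 0 \cdot (1 - P(A)) = P(A) \]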
Example 4.3d
A school class of 120 students is driven in 3 buses to a symphonic performance. There are 36
students in one of the buses, 40 in another, and 44 in the third bus. When the buses arrive,
one of the 120 students is randomly chosen. Let X denote the number of students on the bus
of that randomly chosen student, and find E(X).
Solution:
Since a randomly chosen student is equally likely to be any of the 120 students, it follows that
\[ P\{X = 36\} = \tfrac{36}{120} = \tfrac{3}{10}, \quad P\{X = 40\} = \tfrac{40}{120} = \tfrac{1}{3}, \quad P\{X = 44\} = \tfrac{44}{120} = \tfrac{11}{30} \]
Hence,
\[ E(X) = 36\left(\tfrac{3}{10}\right) + 40\left(\tfrac{1}{3}\right) + 44\left(\tfrac{11}{30}\right) = \tfrac{1208}{30} = 40.2667 \]
• The average number of students on a bus is 120/3 = 40, but the more students there are on a
bus, the more likely it is that a randomly chosen student was on that bus. As a result, the
buses with more students are given more weight than those with fewer.
Example 4.3d (adapted)
Consider three containers, containing 36, 40 and 44 items respectively. One item is selected
randomly. Let X = the number of items in the container from which the chosen item comes.
Calculate E(X). Now define Y = the number of items in a randomly selected container. What
is E(Y)?
Solution:
Probability distribution of X:
x     P(X = x)
36    36/120
40    40/120
44    44/120

Therefore:
\[ E(X) = 36\left(\tfrac{3}{10}\right) + 40\left(\tfrac{1}{3}\right) + 44\left(\tfrac{11}{30}\right) = \tfrac{1208}{30} = 40.2667 \]
Probability distribution of Y:
y     P(Y = y)
36    1/3
40    1/3
44    1/3

Therefore:
\[ E(Y) = 36\left(\tfrac{1}{3}\right) + 40\left(\tfrac{1}{3}\right) + 44\left(\tfrac{1}{3}\right) = 40 \]
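The difference between E(X) and E(Y) comes entirely from the weights: X weights each container by its share of the items, while Y weights each container equally. A minimal Python sketch of the two calculations, using exact fractions (the variable names are illustrative):

```python
from fractions import Fraction

sizes = [36, 40, 44]
total = sum(sizes)

# X: size of the container holding a randomly chosen item (size-weighted).
e_x = sum(n * Fraction(n, total) for n in sizes)

# Y: size of a randomly chosen container (each container equally likely).
e_y = sum(n * Fraction(1, len(sizes)) for n in sizes)

print(e_x, float(e_x))
print(e_y)
```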
Expectation of a Function of a Random Variable
If you want to compute the expected value of some function g(X) of a discrete random variable
X, one approach is to first determine the pmf of g(X) and then compute E[g(X)] using the
definition of expected value.
Textbook Examples
Example 4.4a
Let X denote a random variable that takes on any of the values -1, 0, and 1 with respective
probabilities:
𝑃{𝑋 = −1} = 0.2, 𝑃{𝑋 = 0} = 0.5, 𝑃{𝑋 = 1} = 0.3
Compute E(X^2).
Solution:
Let Y = X^2. Then the probability mass function of Y is given by:
\[ P\{Y = 1\} = P\{X = -1\} + P\{X = 1\} = 0.2 + 0.3 = 0.5 \]
\[ P\{Y = 0\} = P\{X = 0\} = 0.5 \]
Hence,
\[ E(X^{2}) = E(Y) = 1(0.5) + 0(0.5) = 0.5 \]
Note:
• 0.5 = E[X^2] ≠ (E[X])^2 = 0.01, since E[X] = −1(0.2) + 0(0.5) + 1(0.3) = 0.1

Proposition 4.1
If X is a discrete random variable that takes on one of the values xi, i ≥ 1, with respective
probabilities p(xi), then, for any real-valued function g,

\[ E[g(X)] = \sum_{i} g(x_i)\, p(x_i) \]

Applying it to the previous example we get:
\[
\begin{aligned}
E(X^{2}) &= (-1)^{2}(0.2) + 0^{2}(0.5) + 1^{2}(0.3) \\
&= 1(0.2 + 0.3) + 0(0.5) \\
&= 0.5
\end{aligned}
\]
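Proposition 4.1 translates directly into a one-line sum over the pmf. The following is a minimal Python sketch; the helper name expectation and the dictionary representation of the pmf are assumptions made for this illustration.

```python
pmf_x = {-1: 0.2, 0: 0.5, 1: 0.3}   # pmf from Example 4.4a

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x) * p(x) over the support of X."""
    return sum(g(x) * p for x, p in pmf.items())

e_x = expectation(pmf_x)                      # E[X]
e_x2 = expectation(pmf_x, lambda x: x ** 2)   # E[X^2]

# E[X^2] and (E[X])^2 are generally different quantities.
print(e_x2, e_x ** 2)
```

Passing g as a parameter means the pmf of g(X) never has to be constructed explicitly.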

Corollary 4.1
If a and b are constants then:
𝐸[𝑎𝑋 + 𝑏] = 𝑎𝐸[𝑋] + 𝑏
Proof

\[
\begin{aligned}
E[aX + b] &= \sum_{x:\, p(x) > 0} (ax + b)\, p(x) \\
&= a \sum_{x:\, p(x) > 0} x\, p(x) + b \sum_{x:\, p(x) > 0} p(x) \\
&= aE[X] + b
\end{aligned}
\]

nth Non-central moment


Definition
The nth non-central moment of a random variable X is:

\[ \mu_n = E(X^{n}) = \sum_{x:\, p(x) > 0} x^{n}\, p(x) \]

Note:
• The expected value of X = the 1st non-central moment of X.
• The expected value of a random variable is referred to as the mean of the random variable
and is often written as μ.

Variance
Distribution of a Random Variable
Properties of distribution that should be attended to:
• Location
o Around what central point are the values of X distributed.
o Measure of location = expected value E(X), of the random variable X.
• Spread
o How widely are the x-values dispersed around a central value.
o Measure of spread = standard deviation and variance

The expected value therefore describes the location of a distribution but says nothing about its spread.
Example
x     P(X = x)
−4    0.2
2     0.6
8     0.2

\[ E(X) = -4(0.2) + 2(0.6) + 8(0.2) = 2 \]

y      P(Y = y)
−10    0.2
2      0.6
14     0.2

\[ E(Y) = -10(0.2) + 2(0.6) + 14(0.2) = 2 \]


While the random variables X and Y have the same expected value, they clearly are different
distributions as can be seen from the graphical representations of their distributions:

We expect a random variable to take on values around its mean, and therefore it makes
sense to look at how far X is from its mean, on average.

Definition: Variance
If X is a random variable with mean μ, then the variance of X, denoted by Var(X) is defined
by:
\[ Var(X) = E[(X - \mu)^{2}] \]
We generally choose to consider the squared difference; that is, E[(X − μ)^2].
In the above example therefore, with μ = 2 for both X and Y, we can calculate:
\[ E[(X - \mu)^{2}] = (-4 - 2)^{2}(0.2) + (2 - 2)^{2}(0.6) + (8 - 2)^{2}(0.2) = 14.4 = Var(X) \]
\[ E[(Y - \mu)^{2}] = (-10 - 2)^{2}(0.2) + (2 - 2)^{2}(0.6) + (14 - 2)^{2}(0.2) = 57.6 = Var(Y) \]
Since Var(Y) > Var(X), it indicates that Y is more “spread out” than X.
Definition: Standard Deviation
The standard deviation of a random variable X is the square root of the variance. It is
denoted by:

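\[ SD(X) = \sqrt{Var(X)} \]
An alternative formula for Var(X) is given by:
\[ Var(X) = E[X^{2}] - (E[X])^{2} \]
Derivation of this formula:
\[
\begin{aligned}
Var(X) &= E[(X - \mu)^{2}] && \text{(by definition)} \\
&= \sum_{x} (x - \mu)^{2}\, p(x) && \text{(proposition 4.1)} \\
&= \sum_{x} (x^{2} - 2x\mu + \mu^{2})\, p(x) \\
&= \underbrace{\sum_{x} x^{2}\, p(x)}_{E(X^{2})} - 2\mu \underbrace{\sum_{x} x\, p(x)}_{E(X) = \mu} + \mu^{2} \underbrace{\sum_{x} p(x)}_{= 1 \text{ (property of pmf)}} \\
&= E(X^{2}) - 2\mu \cdot \mu + \mu^{2} \\
&= E(X^{2}) - 2\mu^{2} + \mu^{2} \\
&= E(X^{2}) - \mu^{2} \\
&= E(X^{2}) - [E(X)]^{2}
\end{aligned}
\]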

Identity
Var(aX + b) = a^2 Var(X), for constants a and b and a random variable X.
E[X + Y] = E[X] + E[Y], for any random variables X and Y (independence is not required)
Var[X + Y] = Var[X] + Var[Y], if X and Y are independent
Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y), if independence does not hold
Proof
Since the constant b is independent of aX:
\[ Var(aX + b) = Var(aX) + Var(b) = a^{2}\, Var(X) + 0 \]
Note:
• Var[c] = 0 for any constant c
• E[c] = c for any constant c
• E[aX + b] = E[aX] + E[b] = aE[X] + b
Textbook Examples
Example 4.5a
Calculate Var(X) if X represents the outcome when a fair die is rolled.
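x    P(X = x)
1    1/6
2    1/6
3    1/6
4    1/6
5    1/6
6    1/6

\[ E[X] = 1\left(\tfrac{1}{6}\right) + 2\left(\tfrac{1}{6}\right) + 3\left(\tfrac{1}{6}\right) + 4\left(\tfrac{1}{6}\right) + 5\left(\tfrac{1}{6}\right) + 6\left(\tfrac{1}{6}\right) = \tfrac{7}{2} \]
\[ E[X^{2}] = 1^{2}\left(\tfrac{1}{6}\right) + 2^{2}\left(\tfrac{1}{6}\right) + 3^{2}\left(\tfrac{1}{6}\right) + 4^{2}\left(\tfrac{1}{6}\right) + 5^{2}\left(\tfrac{1}{6}\right) + 6^{2}\left(\tfrac{1}{6}\right) = \tfrac{91}{6} \]
\[ Var(X) = E[X^{2}] - (E[X])^{2} = \tfrac{91}{6} - \left(\tfrac{7}{2}\right)^{2} = \tfrac{35}{12} \]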
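The die calculation, and the identity Var(aX + b) = a^2 Var(X), can be checked with exact fractions. The following is a minimal Python sketch; the helper names mean, variance and affine, and the constants a = 3 and b = 5, are arbitrary choices made for this illustration.

```python
from fractions import Fraction

die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(pmf):
    """E[X] from the definition."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[X^2] - (E[X])^2."""
    return sum(x ** 2 * p for x, p in pmf.items()) - mean(pmf) ** 2

def affine(pmf, a, b):
    """pmf of aX + b (a != 0, so distinct x map to distinct values)."""
    return {a * x + b: p for x, p in pmf.items()}

print(mean(die_pmf), variance(die_pmf))

a, b = 3, 5  # arbitrary constants, chosen only for illustration
print(variance(affine(die_pmf, a, b)), a ** 2 * variance(die_pmf))
```

The shift b changes the mean but drops out of the variance, which is why only a^2 appears in the identity.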
Additional Problem
Example 1
A salesman has scheduled two appointments to sell encyclopedias. His first appointment will
lead to a sale with probability 0.3, and his second will lead independently to a sale with
probability 0.6. Any sale made is equally likely to be either for the deluxe model, which costs
$1000, or the standard model, which costs $500. Determine the probability mass function
of X, the total dollar value of all sales.
Solution:
Appointment    Sale              Value ($)    Probability
1              No sale           0            0.7
               Standard model    500          0.3(0.5) = 0.15
               Deluxe model      1 000        0.3(0.5) = 0.15
2              No sale           0            0.4
               Standard model    500          0.6(0.5) = 0.3
               Deluxe model      1 000        0.6(0.5) = 0.3
The distribution of X is therefore as follows:
P(X = 0) = (0.7)(0.4) = 0.28
P(X = 500) = (0.15)(0.4) + (0.7)(0.3) = 0.27
P(X = 1 000) = (0.15)(0.4) + (0.7)(0.3) + (0.15)(0.3) = 0.315
P(X = 1 500) = (0.15)(0.3) + (0.15)(0.3) = 0.09
P(X = 2 000) = (0.15)(0.3) = 0.045
\[
p(x) = \begin{cases}
0.280, & x = 0 \\
0.270, & x = 500 \\
0.315, & x = 1\,000 \\
0.090, & x = 1\,500 \\
0.045, & x = 2\,000
\end{cases}
\]
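Because the two appointments are independent, the pmf of the total is obtained by multiplying the probabilities of every pair of appointment outcomes and adding up the pairs with the same total. A minimal Python sketch of that calculation (the function name sale_pmf is illustrative):

```python
from collections import defaultdict
from itertools import product

def sale_pmf(p_sale):
    """Value of one appointment: no sale, standard ($500) or deluxe ($1000)."""
    return {0: 1 - p_sale, 500: p_sale * 0.5, 1000: p_sale * 0.5}

first, second = sale_pmf(0.3), sale_pmf(0.6)

total_pmf = defaultdict(float)
for (v1, p1), (v2, p2) in product(first.items(), second.items()):
    total_pmf[v1 + v2] += p1 * p2  # independence: joint probability is the product

for value in sorted(total_pmf):
    print(value, round(total_pmf[value], 3))
```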
Example 2
Four independent flips of a fair coin are made. Let X denote the number of heads obtained.
Plot the probability mass function of the random variable X – 2.
Solution:
The sample space consists of the 2^4 = 16 equally likely sequences of heads and tails:
{(T,T,T,T), (H,T,T,T), (T,H,T,T), (T,T,H,T), (T,T,T,H), (H,H,T,T), (H,T,H,T), (H,T,T,H),
(T,H,H,T), (T,H,T,H), (T,T,H,H), (H,H,H,T), (H,H,T,H), (H,T,H,H), (T,H,H,H), (H,H,H,H)}
X    Y = X – 2    P(Y = y)
0    -2           1/16
1    -1           4/16
2    0            6/16
3    1            4/16
4    2            1/16
Example 3
Two coins are to be flipped. The first coin will land on heads with probability 0.6, the second
with probability 0.7. Assume that the results of the flips are independent, and let X equal the
total number of heads that result.
a) Find P(X = 1)

Solution:
\[ P(X = 1) = P\{(H,T)\} + P\{(T,H)\} = (0.6)(0.3) + (0.4)(0.7) = 0.46 \]
b) Determine E(X)

Solution:
\[ P(X = 0) = P\{(T,T)\} = (0.4)(0.3) = 0.12 \]
\[ P(X = 1) = P\{(H,T)\} + P\{(T,H)\} = 0.46 \]
\[ P(X = 2) = P\{(H,H)\} = (0.6)(0.7) = 0.42 \]
\[ E(X) = 0(0.12) + 1(0.46) + 2(0.42) = 1.3 \]
Example 4
A sample of 3 items is selected at random from a box containing 20 items of which 4 are
defective. Find the expected number of defective items in the sample.
Solution:
Let X = number of defective items in the sample
The distribution of X is as follows:
x    Calculation             p(x)
0    (4C0)(16C3) / 20C3      560/1 140 = 0.491
1    (4C1)(16C2) / 20C3      480/1 140 = 0.421
2    (4C2)(16C1) / 20C3      96/1 140 = 0.084
3    (4C3)(16C0) / 20C3      4/1 140 = 0.0035

\[ E(X) = 0(0.491) + 1(0.421) + 2(0.084) + 3(0.0035) = 0.6 \]
Or, working with the exact fractions:
\[ E(X) = \sum_{x=0}^{3} x\, p(x) = \frac{0(560) + 1(480) + 2(96) + 3(4)}{1\,140} = \frac{684}{1\,140} = 0.6 \]
Example 5
A box contains 5 red and 5 blue marbles. Two marbles are withdrawn randomly. If they are
the same colour, then you win $1.10; if they are different colours, then you win -$1.00. (That
is, you lose $1.00.) Calculate
(a) The expected value of the amount you win

Solution:
Let X = amount you win
The distribution of X is as follows:
Outcome    Gain = x    Calculation                         Probability
RR/BB      1.10        [(5C2)(5C0) + (5C0)(5C2)] / 10C2    4/9
RB/BR      -1.00       (5C1)(5C1) / 10C2                   5/9

\[ E(X) = (1.10)\left(\tfrac{4}{9}\right) + (-1.00)\left(\tfrac{5}{9}\right) = -0.0667 \]
(b) The variance of the amount you win.

Solution:
Let X = amount you win
\[ E(X^{2}) = (1.10)^{2}\left(\tfrac{4}{9}\right) + (-1.00)^{2}\left(\tfrac{5}{9}\right) = 1.0933 \]
\[
\begin{aligned}
Var(X) &= E(X^{2}) - [E(X)]^{2} \\
&= 1.0933 - (-0.0667)^{2} \\
&= 1.0889
\end{aligned}
\]
Note:
• If X has distribution function F_X, the distribution function of Y = e^X is obtained as
follows, for y > 0:
\[ F_X(x) = P(X \le x) \]
\[
\begin{aligned}
F_Y(y) &= P(Y \le y) \\
&= P(e^{X} \le y) \\
&= P(X \le \ln(y)) \\
&= F_X(\ln(y))
\end{aligned}
\]
Distributions
We will consider 5 specific discrete probability distributions
In each case we will especially consider the following aspects:
(i) Under what circumstances, i.e. in what situations, can the distribution be used?
(ii) The pmf of the distribution
(iii) The parameters of the distribution, and interpretation of them in a practical situation
(iv) What relationships are there between the distributions?
(v) What are the expected value and the variance of the distribution?
(vi) Special properties of the distribution

Binomial Distribution: 𝑿~𝒃𝒊𝒏𝒐𝒎(𝒏, 𝒑)


When can the binomial distribution be used?
• We have an experiment with two possible outcomes, called success and failure.
• This experiment is repeated n times under the same conditions.
• The repetitions (trials) are independent.
• P(success) = p remains constant throughout all the repetitions.

If these requirements are met, the random variable X is defined as X = the number of
successes in the n repetitions. The probability distribution of X is a binomial distribution with
parameters n and p. This is denoted by 𝑿~𝒃𝒊𝒏𝒐𝒎(𝒏, 𝒑)
If n = 1 we call the binomial distribution a Bernoulli distribution.
The pmf of the binomial distribution is given by:
\[ p(x) = \binom{n}{x} p^{x} (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n \]
Note:
\[ \sum_{x=0}^{n} \binom{n}{x} p^{x} (1-p)^{n-x} = 1 \]

Derivation of the pmf of a binomial distribution:


Textbook Examples
Example 4.6a
Five fair coins are flipped. If the outcomes are assumed independent, find the probability
mass function of the number of heads obtained.
Solution:
Let X = number of heads obtained (successes), where X can be 0, 1, 2, 3, 4, 5
Then X is a binomial random variable with parameters (n = 5, p = 0.5): 𝑋~𝑏𝑖𝑛𝑜𝑚(5, 0.5)
We obtain the probability of obtaining exactly x heads in the 5 flips as:
(the number of ways to get x heads in 5 coin flips) × P(x heads) × P(5 − x tails)
x    p(x)
0    (5C0)(1/2)^0 (1/2)^5 = 1/32
1    (5C1)(1/2)^1 (1/2)^4 = 5/32
2    (5C2)(1/2)^2 (1/2)^3 = 10/32
3    (5C3)(1/2)^3 (1/2)^2 = 10/32
4    (5C4)(1/2)^4 (1/2)^1 = 5/32
5    (5C5)(1/2)^5 (1/2)^0 = 1/32

Now let’s consider this example as a binomial distribution:


• There are 2 possible outcomes (i.e. heads or tails)
• There is a fixed number of repetitions, 5 (i.e. n = 5)
• The repetitions are independent (i.e. the outcome of one coin flip does not influence the
outcome of any other coin flip)
• P(success) = P(heads) = 0.5

Therefore the pmf for X is:


\[ P(X = x) = \binom{5}{x} 0.5^{x} (0.5)^{5-x}, \quad x = 0, 1, 2, 3, 4, 5 \]
Example 4.6b
It is known that screws produced by a certain company will be defective with probability 0.01,
independently of one another. The company sells the screws in packages of 10 and offers a
money-back guarantee that at most 1 of the 10 screws is defective. What proportion of
packages sold must the company replace?
Solution:
Check if this is a binomial experiment:
• There are 2 possible outcomes (i.e. defective or not defective)
• There is a fixed number of repetitions, 10 (i.e. n = 10)
• The repetitions are independent (the question states this explicitly)
• P(success) = P(defective) = 0.01

Therefore X = number of defective screws (number of successes in this example)


Therefore 𝑋~𝑏𝑖𝑛𝑜𝑚(10, 0.01)
Thus:
𝑃(𝑝𝑎𝑐𝑘𝑎𝑔𝑒 𝑖𝑠 𝑟𝑒𝑝𝑙𝑎𝑐𝑒𝑑) = 𝑃(𝑚𝑜𝑟𝑒 𝑡ℎ𝑎𝑛 1 𝑑𝑒𝑓𝑒𝑐𝑡𝑖𝑣𝑒 𝑠𝑐𝑟𝑒𝑤 𝑖𝑛 𝑝𝑎𝑐𝑘𝑎𝑔𝑒)
= 𝑃(𝑋 > 1)
= 1 − 𝑃(𝑋 ≤ 1)
= 1 − [𝑃(𝑋 = 0) + 𝑃(𝑋 = 1)]

\[ = 1 - \left[\binom{10}{0} 0.01^{0} (0.99)^{10} + \binom{10}{1} 0.01^{1} (0.99)^{9}\right] \]
\[ = 1 - [0.9044 + 0.0914] = 0.0042 \]
Example 4.6b extension
Suppose someone buys 3 packages of screws. What is the probability that exactly 1 package
is brought back for replacement?
Solution:
In this case, we can see the purchase of 3 packages as a (different) binomial experiment.
If we define Y = the number of packages returned (out of the 3 purchased), then Y follows a
binomial distribution with n = 3. Success in this case is seen as the return of a package, which
means that P(success) = P(package is returned/replaced) = 0.0042.
Therefore, 𝑌 ~ 𝑏𝑖𝑛𝑜𝑚(3; 0.0042).
Thus:
\[ P(Y = 1) = \binom{3}{1} 0.0042^{1} (1 - 0.0042)^{3-1} = 0.0125 \]
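Both binomial calculations in Example 4.6b can be reproduced from the pmf. The sketch below is a minimal Python illustration; the helper name binom_pmf is an assumption. Because the code carries the unrounded replacement probability forward, its results may differ from the hand calculation above in the fourth decimal place.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ binom(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Example 4.6b: a package is replaced if more than 1 of its 10 screws is defective.
p_replace = 1 - binom_pmf(0, 10, 0.01) - binom_pmf(1, 10, 0.01)

# Extension: exactly 1 of 3 purchased packages is replaced.
p_one_of_three = binom_pmf(1, 3, p_replace)

print(round(p_replace, 4), round(p_one_of_three, 4))
```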
Additional Examples
Example 4.1
On a multiple-choice exam with 5 possible answers for each of the 15 questions on the exam,
what is the probability that a student will pass the exam if he/she randomly guesses the
answers?
Let X = number of correct guesses
Then 𝑋 ~ 𝑏𝑖𝑛𝑜𝑚(𝑛 = 15; 𝑝 = 0.2)
The pmf of X is given by:
\[ p(x) = \binom{15}{x} 0.2^{x} (0.8)^{15-x}, \quad x = 0, 1, 2, \ldots, 15 \]
To pass the exam, they need to get at least 50% which implies at least 8 correct answers.
Therefore:
𝑃(𝑝𝑎𝑠𝑠𝑒𝑠) = 𝑃(𝑋 ≥ 8)
= 𝑃(𝑋 = 8) + 𝑃(𝑋 = 9) + 𝑃(𝑋 = 10) + ⋯ + 𝑃(𝑋 = 13) + 𝑃(𝑋 = 14) + 𝑃(𝑋 = 15)
Since that would require a lot of tedious work, we make use of tables for the binomial
distribution, which allow us to look up the probabilities we require. In these tables we
make use of n (number of repetitions), a (number of successes) and π (probability of success).
In this example, n = 15 and π = 0.2, and we thus look in the section for n = 15 and the column
for π = 0.2.

The table gives cumulative probabilities. If we look at the row where a = 7: 𝑃(𝑋 ≤ 7)
Therefore for 𝑋 ~ 𝑏𝑖𝑛𝑜𝑚(𝑛 = 15; 𝑝 = 0.2), 𝑃(𝑋 ≤ 7) = 0.9958 and thus:
𝑃(𝑋 ≥ 8) = 1 − 𝑃(𝑋 ≤ 7)
= 1 − 0.9958
= 𝟎. 𝟎𝟎𝟒𝟐
Example 4.2
Suppose we have a biased coin, for which the probability of obtaining heads is 0.4. Suppose
this coin is tossed 20 times, and let X = number of heads in the 20 tosses of the coin.
X then follows a binomial distribution: X ~ binom(n = 20; p = 0.4)
Calculate:
(a) P(at most 6 heads)

𝑃(𝑋 ≤ 6) = 𝟎. 𝟐𝟓𝟎𝟎
This is taken from the binomial cumulative probability table

(b) P(at least 9 heads)

𝑃(𝑋 ≥ 9) = 1 − 𝑃(𝑋 ≤ 8)
= 1 − 0.5956
= 𝟎. 𝟒𝟎𝟒𝟒

(c) P(more than 7 heads)

𝑃(𝑋 > 7) = 1 − 𝑃(𝑋 ≤ 7)


= 1 − 0.4159
= 𝟎. 𝟓𝟖𝟒𝟏
(d) P(obtaining less than 12 heads given that less than 14 heads were obtained)

\[
\begin{aligned}
P(X < 12 \mid X < 14) &= \frac{P(X < 12 \cap X < 14)}{P(X < 14)} \\
&= \frac{P(X < 12)}{P(X < 14)} \\
&= \frac{P(X \le 11)}{P(X \le 13)} \\
&= \frac{0.9435}{0.9935} \\
&= 0.9497
\end{aligned}
\]

(e) P(at most 8 tails)

Solution 1: P(at most 8 tails) = P(at least 12 heads)


𝑃(𝑋 ≥ 12) = 1 − 𝑃(𝑋 ≤ 11)
= 1 − 0.9435
= 𝟎. 𝟎𝟓𝟔𝟓
Solution 2: Define new random variable Y = number of tails in 20 coin tosses
𝑌 ~ 𝑏𝑖𝑛𝑜𝑚(𝑛 = 20; 𝑝 = 0.6)
The formula at the top of the binomial tables is:
\[ F_X(a) = \sum_{x=0}^{a} \binom{n}{x} \pi^{x} (1-\pi)^{n-x} \]
For π > 0.5:
\[ \sum_{x=0}^{a} \binom{n}{x} \pi^{x} (1-\pi)^{n-x} = 1 - \sum_{x=0}^{n-a-1} \binom{n}{x} (1-\pi)^{x} \pi^{n-x} \]
Therefore for our example, P(Y ≤ 8) is:
\[
\begin{aligned}
\sum_{y=0}^{8} \binom{20}{y} 0.6^{y} (1-0.6)^{20-y} &= 1 - \sum_{y=0}^{20-8-1} \binom{20}{y} (1-0.6)^{y} (0.6)^{20-y} \\
&= 1 - \sum_{y=0}^{11} \binom{20}{y} (0.4)^{y} (0.6)^{20-y}
\end{aligned}
\]
Using this formula, the remaining sum is the cumulative probability up to 11 for a success
probability of 0.4 (instead of 0.6). From the table, this probability is 0.9435, so
P(Y ≤ 8) = 1 − 0.9435 = 0.0565, the same answer as in Solution 1.

Expected Value & Variance of Binomial Distribution


The expected value E(X) is the first non-central moment, and E(X^2), which is used to
calculate the variance, is the second non-central moment.
\[ E(X) = np \]
\[ Var(X) = np(1-p) = npq, \quad \text{where } q = 1 - p \]
Proof
Combinatorial Identity
Required to prove: (RTP)
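\[ x\binom{n}{x} = n\binom{n-1}{x-1} \]
Proof:
\[
\begin{aligned}
LHS = x\binom{n}{x} &= x \cdot \frac{n!}{x!\,(n-x)!} \\
&= \frac{x \cdot n!}{x(x-1)!\,(n-x)!} \\
&= \frac{n!}{(x-1)!\,(n-x)!} \\
&= \frac{n(n-1)!}{(x-1)!\,(n-x)!} \\
&= n \cdot \frac{(n-1)!}{(n-x)!\,(x-1)!} \\
&= n \cdot \frac{(n-1)!}{[n-1-(x-1)]!\,(x-1)!} \\
&= n\binom{n-1}{x-1} = RHS
\end{aligned}
\]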
Non-central moments of X ~ binomial (n, p)
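\[
\begin{aligned}
E(X^{k}) &= \sum_{x=0}^{n} x^{k} \binom{n}{x} p^{x} (1-p)^{n-x} \\
&= 0^{k} \binom{n}{0} p^{0} (1-p)^{n-0} + \sum_{x=1}^{n} x^{k} \binom{n}{x} p^{x} (1-p)^{n-x} \\
&= \sum_{x=1}^{n} x^{k} \binom{n}{x} p^{x} (1-p)^{n-x} \\
&= \sum_{x=1}^{n} x^{k-1} \cdot x \binom{n}{x} p^{x} (1-p)^{n-x} && (x^{k-1} \cdot x = x^{k}) \\
&= \sum_{x=1}^{n} x^{k-1} \cdot n \binom{n-1}{x-1} p^{x} (1-p)^{n-x} && \left(x\binom{n}{x} = n\binom{n-1}{x-1}\right) \\
&= \sum_{x=1}^{n} x^{k-1} \cdot n \binom{n-1}{x-1} p^{x-1} \cdot p\, (1-p)^{n-x} && (p^{x-1} \cdot p = p^{x}) \\
&= np \sum_{x=1}^{n} x^{k-1} \binom{n-1}{x-1} p^{x-1} (1-p)^{n-x}
\end{aligned}
\]

Let:
\[ y = x - 1 \Rightarrow x = y + 1 \]
Therefore:
\[
\begin{aligned}
E(X^{k}) &= np \sum_{y=0}^{n-1} (y+1)^{k-1} \binom{n-1}{y} p^{y} (1-p)^{n-y-1} \\
&= np \cdot E[(Y+1)^{k-1}]
\end{aligned}
\]
where
\[ Y \sim binomial(n-1, p) \]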
Expected value and Variance of X ~ binomial (n, p)
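\[ E(X^{k}) = np \cdot E[(Y+1)^{k-1}], \quad \text{where } Y \sim binomial(n-1, p) \]

For k = 1:
\[
\begin{aligned}
E(X) &= np \cdot E[(Y+1)^{1-1}] \\
&= np \cdot E[(Y+1)^{0}] \\
&= np \cdot E(1) \\
&= np
\end{aligned}
\]
For k = 2:
\[
\begin{aligned}
E(X^{2}) &= np \cdot E[(Y+1)^{2-1}] \\
&= np \cdot E[(Y+1)^{1}] \\
&= np \cdot [E(Y) + E(1)] \\
&= np \cdot [(n-1)p + 1] && (\text{since } Y \sim binomial(n-1, p)) \\
&= n^{2}p^{2} - np^{2} + np
\end{aligned}
\]
\[
\begin{aligned}
Var(X) &= E(X^{2}) - [E(X)]^{2} \\
&= [n^{2}p^{2} - np^{2} + np] - [np]^{2} \\
&= np - np^{2} \\
&= np(1-p)
\end{aligned}
\]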
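The formulas E(X) = np and Var(X) = np(1 − p), and the recursion E(X^k) = np·E[(Y + 1)^(k−1)] used to derive them, can be checked numerically by summing over the pmf. The following is a minimal Python sketch; the parameters n = 20 and p = 0.4 and the helper names are arbitrary choices made for this illustration.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ binom(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_moment(k, n, p):
    """E(X^k) for X ~ binom(n, p), summed directly from the pmf."""
    return sum(x ** k * binom_pmf(x, n, p) for x in range(n + 1))

n, p = 20, 0.4  # illustrative parameters
mean = binom_moment(1, n, p)
var = binom_moment(2, n, p) - mean ** 2

# Recursion for k = 2: E(X^2) = n p E[Y + 1] with Y ~ binom(n - 1, p).
rhs = n * p * sum((y + 1) * binom_pmf(y, n - 1, p) for y in range(n))

print(mean, n * p)
print(var, n * p * (1 - p))
print(binom_moment(2, n, p), rhs)
```

Each printed pair should agree up to floating-point rounding.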
Poisson Distribution: 𝑿~𝑷𝒐(𝝀)
When can the Poisson distribution be used?
• It provides a good approximation to the binomial distribution if the number of
repetitions (n) becomes large and the probability of obtaining a success (p) becomes small,
in such a way that np remains of moderate size.
o λ = np
o λ is a rate parameter
o binomial(n, p) → Poisson(λ) as n → ∞ with np = λ held fixed
• If events occur in time in such a way that the following requirements are satisfied:
o The probability of one event in a short interval of length h is proportional to h, i.e.
it is approximately λh.
o The probability of 2 or more events in a short interval is negligible (essentially zero).
o The numbers of events occurring in disjoint time intervals are independent random
variables.

Formally, suppose X = the number of events occurring per unit of time according to a Poisson
distribution. We denote this by 𝑿~𝑷𝒐(𝝀) where λ is the parameter of the distribution.
The pmf of X is given by:

\[ p(x) = P(X = x) = e^{-\lambda} \cdot \frac{\lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
The interpretation of the parameter λ is the average number of events occurring per unit of
time (or space).

Expected Value & Variance of Poisson Distribution


𝐸(𝑋) = 𝝀
𝑉𝑎𝑟(𝑋) = 𝝀
Proof
Expected Value

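$$
p(x) = P(X = x) = e^{-\lambda} \cdot \frac{\lambda^{x}}{x!}, \quad \text{for } x = 0, 1, 2, \dots
$$

$$
\begin{aligned}
E(X) &= \sum_{x:\,p(x)>0} x \cdot p(x) \\
&= \sum_{x=0}^{\infty} x \left(e^{-\lambda} \cdot \frac{\lambda^{x}}{x!}\right) \\
&= 0 \cdot e^{-\lambda} \cdot \frac{\lambda^{0}}{0!} + \sum_{x=1}^{\infty} x \left(e^{-\lambda} \cdot \frac{\lambda^{x}}{x!}\right) \\
&= \sum_{x=1}^{\infty} x \left(e^{-\lambda} \cdot \frac{\lambda^{x}}{x!}\right) \\
&= \sum_{x=1}^{\infty} \left(e^{-\lambda} \cdot \frac{\lambda^{x} \cdot x}{x\,(x-1)!}\right) \\
&= \sum_{x=1}^{\infty} \left(e^{-\lambda} \cdot \frac{\lambda^{x}}{(x-1)!}\right) \\
&= \sum_{x=1}^{\infty} \left(e^{-\lambda} \cdot \frac{\lambda \cdot \lambda^{x-1}}{(x-1)!}\right) \\
&= e^{-\lambda} \cdot \lambda \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!} \\
&= e^{-\lambda} \cdot \lambda \sum_{y=0}^{\infty} \frac{\lambda^{y}}{y!} && (\text{let } y = x - 1) \\
&= e^{-\lambda} \cdot \lambda \cdot e^{\lambda} && (\text{Maclaurin series expansion of } e^{\lambda}) \\
&= \lambda
\end{aligned}
$$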
Variance

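$$
\begin{aligned}
E(X^{2}) &= \sum_{x=0}^{\infty} x^{2} \left(e^{-\lambda} \cdot \frac{\lambda^{x}}{x!}\right) \\
&= 0^{2} \cdot e^{-\lambda} \cdot \frac{\lambda^{0}}{0!} + \sum_{x=1}^{\infty} x^{2} \left(e^{-\lambda} \cdot \frac{\lambda^{x}}{x!}\right) \\
&= \sum_{x=1}^{\infty} x^{2} \left(e^{-\lambda} \cdot \frac{\lambda^{x}}{x!}\right) \\
&= \sum_{x=1}^{\infty} x \left(e^{-\lambda} \cdot \frac{\lambda^{x} \cdot x}{x\,(x-1)!}\right) \\
&= \sum_{x=1}^{\infty} x \left(e^{-\lambda} \cdot \frac{\lambda^{x}}{(x-1)!}\right) \\
&= \sum_{x=1}^{\infty} x \left(e^{-\lambda} \cdot \frac{\lambda \cdot \lambda^{x-1}}{(x-1)!}\right) \\
&= \lambda \sum_{x=1}^{\infty} \frac{x \cdot e^{-\lambda} \cdot \lambda^{x-1}}{(x-1)!} \\
&= \lambda \sum_{y=0}^{\infty} \frac{(y+1) \cdot e^{-\lambda} \cdot \lambda^{y}}{y!} && (\text{let } y = x - 1) \\
&= \lambda \left[\sum_{y=0}^{\infty} \frac{y \cdot e^{-\lambda} \cdot \lambda^{y}}{y!} + \sum_{y=0}^{\infty} \frac{1 \cdot e^{-\lambda} \cdot \lambda^{y}}{y!}\right] \\
&= \lambda(\lambda + 1) && (\text{first sum} = \lambda \text{ (expected value)}, \text{ second sum} = 1 \text{ (pmf sums to 1)})
\end{aligned}
$$

$$
\begin{aligned}
Var(X) &= E(X^{2}) - [E(X)]^{2} \\
&= \lambda(\lambda + 1) - \lambda^{2} \\
&= \lambda^{2} + \lambda - \lambda^{2} \\
&= \lambda
\end{aligned}
$$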
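As a sanity check on the two proofs, the infinite sums can be truncated and evaluated numerically. This is a minimal sketch: the function names, the truncation at 100 terms and the test value λ = 3.2 are assumptions made for illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_mean_var(lam, terms=100):
    """Approximate E(X) and Var(X) by truncating the sums after `terms` terms.

    The truncation is an assumption: it is adequate for small lam because
    the Poisson tail decays faster than geometrically.
    """
    mean = sum(x * poisson_pmf(x, lam) for x in range(terms))
    second = sum(x * x * poisson_pmf(x, lam) for x in range(terms))
    return mean, second - mean**2

mean, var = poisson_mean_var(3.2)
print(mean, var)  # both should be close to lam
```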
Textbook Examples
Example 4.7a
Suppose that the number of typographical errors on a single page of this book has a Poisson
distribution with parameter λ = 0.5. Calculate the probability:
(a) of at least 1 typographical error occurring on a single page.

Let: 𝑋 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑒𝑟𝑟𝑜𝑟𝑠 𝑜𝑛 𝑎 𝑝𝑎𝑔𝑒


Then: 𝑿~𝑷𝒐(𝝀 = 𝟎. 𝟓)
Where: 𝝀 = 𝑡ℎ𝑒 𝑎𝑣𝑒𝑟𝑎𝑔𝑒 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑒𝑟𝑟𝑜𝑟𝑠 𝑜𝑐𝑐𝑢𝑟𝑟𝑖𝑛𝑔 𝑝𝑒𝑟 𝑢𝑛𝑖𝑡 (𝑠𝑝𝑎𝑐𝑒 𝑖𝑛 𝑡ℎ𝑖𝑠 𝑐𝑎𝑠𝑒)
Therefore the pmf of X is:

$$
p(x) = e^{-0.5} \cdot \frac{0.5^{x}}{x!}, \quad \text{for } x = 0, 1, 2, \dots
$$
Thus to answer the question:
$$
\begin{aligned}
P(X \ge 1) &= 1 - P(X < 1) \\
&= 1 - P(X = 0) \\
&= 1 - e^{-0.5} \cdot \frac{0.5^{0}}{0!} \\
&= 1 - e^{-0.5} \\
&= 1 - 0.6065 \\
&= \mathbf{0.3935}
\end{aligned}
$$
(b) of finding a total of 4 typographical errors on 3 pages.

Let: 𝑋 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑒𝑟𝑟𝑜𝑟𝑠 𝑜𝑛 3 𝑝𝑎𝑔𝑒𝑠


Then: 𝑿~𝑷𝒐(𝝀 = 𝟎. 𝟓 × 𝟑 = 𝟏. 𝟓)
Where: 𝝀 = 𝑡ℎ𝑒 𝑎𝑣𝑒𝑟𝑎𝑔𝑒 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑒𝑟𝑟𝑜𝑟𝑠 𝑜𝑐𝑐𝑢𝑟𝑟𝑖𝑛𝑔 𝑝𝑒𝑟 3 𝑝𝑎𝑔𝑒𝑠
Therefore the pmf of X is:

$$
p(x) = e^{-1.5} \cdot \frac{1.5^{x}}{x!}, \quad \text{for } x = 0, 1, 2, \dots
$$
Thus to answer the question:

$$
P(X = 4) = e^{-1.5} \cdot \frac{1.5^{4}}{4!} = \mathbf{0.0471}
$$
(c) of finding at least 1 error on 6 out of the next 10 pages.

In this case we can view every page as a “repetition” in a binomial distribution. From part
(a), we know that P(at least 1 error per page) = 0.3935
Let: 𝑌 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑝𝑎𝑔𝑒𝑠 (𝑜𝑢𝑡 𝑜𝑓 10) 𝑐𝑜𝑛𝑡𝑎𝑖𝑛𝑖𝑛𝑔 𝑎𝑡 𝑙𝑒𝑎𝑠𝑡 1 𝑒𝑟𝑟𝑜𝑟
Then: 𝒀~𝒃𝒊𝒏𝒐𝒎(𝒏 = 𝟏𝟎; 𝒑 = 𝟎. 𝟑𝟗𝟑𝟓)
Therefore the pmf of Y is:
$$
p(y) = \binom{10}{y} (0.3935)^{y} (1 - 0.3935)^{10-y}, \quad \text{for } y = 0, 1, 2, \dots, 10
$$
Thus to answer the question:
$$
P(Y = 6) = \binom{10}{6} (0.3935)^{6} (1 - 0.3935)^{10-6} = \mathbf{0.1055}
$$
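The three parts of Example 4.7a can be reproduced with a few lines of Python. The sketch below uses only the standard library; the helper names are illustrative. Part (c) carries the unrounded probability from part (a), so its last digit can differ slightly from a hand calculation that uses 0.3935.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def binom_pmf(x, n, p):
    """P(X = x) for X ~ binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

p_a = 1 - poisson_pmf(0, 0.5)   # (a) at least 1 error on one page
p_b = poisson_pmf(4, 0.5 * 3)   # (b) exactly 4 errors on 3 pages, lam = 1.5
p_c = binom_pmf(6, 10, p_a)     # (c) exactly 6 of 10 pages with at least 1 error

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```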
Example 4.7b
Suppose that the probability that an item produced by a certain machine will be defective is
0.1. Find the probability that a sample of 10 items will contain at most 1 defective item.
Solution:
Binomial distribution
Let: 𝑋 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑑𝑒𝑓𝑒𝑐𝑡𝑖𝑣𝑒 𝑖𝑡𝑒𝑚𝑠 𝑖𝑛 𝑎 𝑠𝑎𝑚𝑝𝑙𝑒 𝑜𝑓 10 𝑖𝑡𝑒𝑚𝑠
Then: 𝑿~𝑩𝒊𝒏𝒐𝒎(𝒏 = 𝟏𝟎, 𝒑 = 𝟎. 𝟏)
$$
\begin{aligned}
P(X \le 1) &= P(X = 0) + P(X = 1) \\
&= \binom{10}{0} (0.1)^{0} (0.9)^{10-0} + \binom{10}{1} (0.1)^{1} (0.9)^{10-1} \\
&= \mathbf{0.7361}
\end{aligned}
$$
Poisson approximation
Let: X = number of defective items in a sample of 10 items
Then, approximately: X~Po(λ = np = 10 × 0.1 = 1)
Where: λ = the average number of defective items per unit (sample of 10)
Therefore, the pmf of X is:

$$
p(x) = e^{-1} \cdot \frac{1^{x}}{x!}, \quad \text{for } x = 0, 1, 2, \dots
$$
Thus to answer the question:
$$
P(X \le 1) = P(X = 0) + P(X = 1) = e^{-1} \cdot \frac{1^{0}}{0!} + e^{-1} \cdot \frac{1^{1}}{1!} = \mathbf{0.7358}
$$
Example 4.7c
Consider an experiment that consists of counting the number of α particles given off in a
1-second interval by 1 gram of radioactive material. If we know from past experience that on
the average, 3.2 such α particles are given off, what is a good approximation to the
probability that no more than 2 α particles will appear?
Solution:
If we think of the gram of radioactive material as consisting of a large number n of atoms,
each of which has probability 3.2/n of disintegrating and sending off an α particle during the
second considered, then we see that, to a very close approximation, the number of α particles
given off will be a Poisson random variable with parameter λ = 3.2. Hence, the desired
probability is:
$$
\begin{aligned}
P(X \le 2) &= P(X = 0) + P(X = 1) + P(X = 2) \\
&= e^{-3.2} + 3.2\,e^{-3.2} + \frac{(3.2)^{2}}{2}\,e^{-3.2} = \mathbf{0.3799}
\end{aligned}
$$
Additional Examples
Example 1
A certain defect occurs in 1% of the items manufactured in a factory. Consider 1 000 items
selected randomly from the factory production.
Let: 𝑋 = 𝑡ℎ𝑒 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑑𝑒𝑓𝑒𝑐𝑡𝑖𝑣𝑒 𝑖𝑡𝑒𝑚𝑠 𝑖𝑛 𝑡ℎ𝑒 𝑠𝑎𝑚𝑝𝑙𝑒.

(a) According to the binomial distribution, what is the probability mass function of X?

$$
p(x) = \binom{1\,000}{x} (0.01)^{x} (0.99)^{1\,000-x}, \quad \text{for } x = 0, 1, 2, \dots, 1\,000
$$
(b) Use the binomial probability mass function and calculate the probability of finding 8
defective items in the sample.
$$
\begin{aligned}
P(X = 8) &= p(8) \\
&= \binom{1\,000}{8} (0.01)^{8} (0.99)^{1\,000-8} \\
&= \mathbf{0.1128}
\end{aligned}
$$
(c) According to the Poisson distribution, what is the probability mass function of X?

λ = np = (1 000)(0.01) = 10
That is, n is large and p is small, with np of moderate size, so we can use the Poisson
distribution as an approximation to the binomial distribution. Therefore,
$$
p(x) = e^{-10} \cdot \frac{10^{x}}{x!}, \quad \text{for } x = 0, 1, 2, \dots
$$
(d) Use the Poisson probability mass function and calculate the probability of finding 8
defective items in the sample.

$$
\begin{aligned}
P(X = 8) &= p(8) \\
&= e^{-10} \cdot \frac{10^{8}}{8!} \\
&= \mathbf{0.1126}
\end{aligned}
$$
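The closeness of the exact binomial probability and its Poisson approximation in this example can be seen directly. A minimal sketch, with illustrative variable names, assuming the same n = 1 000 and p = 0.01:

```python
from math import comb, exp, factorial

n, p, x = 1000, 0.01, 8
lam = n * p  # Poisson parameter used for the approximation

exact_binomial = comb(n, x) * p**x * (1 - p)**(n - x)
poisson_approx = exp(-lam) * lam**x / factorial(x)

print(round(exact_binomial, 4), round(poisson_approx, 4))
```

The two values differ only in the fourth decimal place, which is what the rule of thumb (n large, p small) leads one to expect.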

Example 2
Suppose the number of claims at a short-term insurer can be described by a Poisson
distribution with average 5 per day.
Let: 𝑋 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑐𝑙𝑎𝑖𝑚𝑠 𝑝𝑒𝑟 𝑑𝑎𝑦
Then: 𝑋~𝑃𝑜(𝜆 = 5)
(a) Write down the probability mass function of the number of claims received per day.

$$
p(x) = e^{-5} \cdot \frac{5^{x}}{x!}, \quad \text{for } x = 0, 1, 2, \dots
$$
(b) Determine the probability that at most 4 claims are received on any given day.

$$
\begin{aligned}
P(X \le 4) &= p(0) + p(1) + p(2) + p(3) + p(4) = e^{-5} \left(1 + 5 + \frac{25}{2} + \frac{125}{6} + \frac{625}{24}\right) \\
&= \mathbf{0.4405}
\end{aligned}
$$
One can also use cumulative Poisson probability tables to find these values.
(c) Determine the probability that at most 20 claims are received in a 4-day week.

An average of 5 claims per day implies an average of (5 × 4) = 20 claims in 4 days.


Let: 𝑌 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑐𝑙𝑎𝑖𝑚𝑠 𝑖𝑛 4 𝑑𝑎𝑦𝑠
Then: 𝒀 ~ 𝑷𝒐(𝝀 = 𝟐𝟎).
Thus by using the table we find:
𝑃(𝑌 ≤ 20) = 𝟎. 𝟓𝟓𝟗𝟏

(d) Determine the probability that at least 7 claims are received in a period of 3 days.

An average of 5 claims per day implies an average of (5 × 3) = 15 claims in 3 days.


Let: 𝑍 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑐𝑙𝑎𝑖𝑚𝑠 𝑖𝑛 3 𝑑𝑎𝑦𝑠
Then: 𝑍 ~ 𝑃𝑜(𝜆 = 15).
Thus by using the table we find:
𝑃(𝑍 ≥ 7) = 1 − 𝑃(𝑍 ≤ 6)
= 1 − 0.0076 = 𝟎. 𝟗𝟗𝟐𝟒
(e) Determine the probability that at least 4 claims are received on at most 3 of the 5
working days of a given week.

𝑃(𝑎𝑡 𝑙𝑒𝑎𝑠𝑡 4 𝑐𝑙𝑎𝑖𝑚𝑠 𝑝𝑒𝑟 𝑑𝑎𝑦) = 𝑃(𝑋 ≥ 4)


= 1 − 𝑃(𝑋 ≤ 3)
= 1 − 0.2650
= 0.7350

Let: 𝐻 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑑𝑎𝑦𝑠 𝑜𝑢𝑡 𝑜𝑓 5 𝑜𝑛 𝑤ℎ𝑖𝑐ℎ 𝑎𝑡 𝑙𝑒𝑎𝑠𝑡 4 𝑐𝑙𝑎𝑖𝑚𝑠 𝑎𝑟𝑒 𝑟𝑒𝑐𝑒𝑖𝑣𝑒𝑑.


Then: 𝑯 ~𝒃𝒊𝒏𝒐𝒎(𝒏 = 𝟓; 𝒑 = 𝟎. 𝟕𝟑𝟓).
Therefore:
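$$
\begin{aligned}
P(H \le 3) &= p(0) + p(1) + p(2) + p(3) \\
&= \binom{5}{0} (0.735)^{0} (1 - 0.735)^{5-0} + \binom{5}{1} (0.735)^{1} (1 - 0.735)^{5-1} \\
&\quad + \binom{5}{2} (0.735)^{2} (1 - 0.735)^{5-2} + \binom{5}{3} (0.735)^{3} (1 - 0.735)^{5-3} \\
&= \mathbf{0.3988}
\end{aligned}
$$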
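Where tables are not at hand, the cumulative probabilities used in parts (b) to (e) can be computed by summing the pmf. The sketch below is one way to do this with the standard library; the helper names are illustrative. Part (e) uses the unrounded daily probability, so its fourth decimal can differ from the hand calculation based on 0.735.

```python
from math import comb, exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam), by summing the pmf."""
    return sum(exp(-lam) * lam**x / factorial(x) for x in range(k + 1))

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ binomial(n, p), by summing the pmf."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

p_b = poisson_cdf(4, 5)           # (b) at most 4 claims in a day
p_c = poisson_cdf(20, 5 * 4)      # (c) at most 20 claims in 4 days
p_d = 1 - poisson_cdf(6, 5 * 3)   # (d) at least 7 claims in 3 days
p_day = 1 - poisson_cdf(3, 5)     # P(at least 4 claims on a given day)
p_e = binom_cdf(3, 5, p_day)      # (e) at most 3 such days out of 5

print(round(p_b, 4), round(p_c, 4), round(p_d, 4), round(p_day, 4))
```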
Geometric Distribution: 𝑿~𝒈𝒆𝒐𝒎𝒆𝒕𝒓𝒊𝒄(𝒑)
When can the geometric distribution be used?
• We have an experiment with two possible outcomes, called success and failure. This
experiment is repeated under the same conditions.
• The repetitions (trials) are independent.
• P(success) = p remains constant throughout all the repetitions. (The number of successes
is fixed at 1; the number of trials is random.)

If these requirements are met, the random variable X is defined as X = the number of
repetitions required to obtain the first success. The probability distribution of X is a
geometric distribution with parameter p. This is denoted by 𝑿~𝒈𝒆𝒐𝒎𝒆𝒕𝒓𝒊𝒄(𝒑)
The pmf of X is given by:
𝒑(𝒏) = 𝑷(𝑿 = 𝒏) = (𝟏 − 𝒑)𝒏−𝟏 𝒑, 𝒇𝒐𝒓 𝒏 = 𝟏, 𝟐, 𝟑 …
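Derivation of the pmf of a geometric distribution:
The event X = n occurs exactly when the first n − 1 trials are failures and the nth trial is a
success. Because the trials are independent, these probabilities multiply:
$$
p(n) = P(X = n) = (1 - p)^{n-1} p, \quad \text{for } n = 1, 2, 3, \dots
$$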

Relationship between Binomial distribution & Geometric distribution


Binomial distribution | Geometric distribution
A fixed number of repetitions | A fixed number of successes (i.e. 1)
The number of successes is random | The number of trials is random
X = the number of successes in n trials | X = the number of trials until the 1st success
Textbook Examples
Example 4.8a
An urn contains N white and M black balls. Balls are randomly selected, one at a time, until a
black one is obtained. If we assume that each ball selected is replaced before the next one is
drawn, what is the probability that:
a) Exactly n draws are needed?

Solution:
Let: 𝑋 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑑𝑟𝑎𝑤𝑠 𝑛𝑒𝑒𝑑𝑒𝑑 𝑡𝑜 𝑠𝑒𝑙𝑒𝑐𝑡 𝑎 𝑏𝑙𝑎𝑐𝑘 𝑏𝑎𝑙𝑙
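Thus: $P(\text{success}) = P(\text{black ball}) = p = \dfrac{M}{M+N}$ and $1 - p = \dfrac{N}{M+N}$
$$
P(X = n) = (1 - p)^{n-1} \cdot p = \left(\frac{N}{M+N}\right)^{n-1} \cdot \left(\frac{M}{M+N}\right) = \frac{M N^{n-1}}{(M+N)^{n}}
$$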
b) At least k draws are needed?

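$$
\begin{aligned}
P(X \ge k) &= \left(\frac{M}{M+N}\right) \sum_{n=k}^{\infty} \left(\frac{N}{M+N}\right)^{n-1} \\
&= \frac{\left(\dfrac{M}{M+N}\right) \left(\dfrac{N}{M+N}\right)^{k-1}}{1 - \dfrac{N}{M+N}} && (\text{sum of a geometric series}) \\
&= \left(\frac{N}{M+N}\right)^{k-1} \\
&= (1 - p)^{k-1}
\end{aligned}
$$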
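The closed form for P(X ≥ k) can be compared with a direct (truncated) summation of the geometric pmf. In this sketch the urn contents M = 3 and N = 7, the cut-off of 2 000 terms and the function names are all assumptions made purely for illustration.

```python
def geom_pmf(n, p):
    """P(X = n) for X ~ geometric(p), n = 1, 2, 3, ..."""
    return (1 - p)**(n - 1) * p

def geom_tail_by_sum(k, p, terms=2000):
    """Approximate P(X >= k) by summing the pmf over a long finite range."""
    return sum(geom_pmf(n, p) for n in range(k, k + terms))

M, N = 3, 7          # hypothetical numbers of black and white balls
p = M / (M + N)      # probability of drawing a black ball

k = 4
by_sum = geom_tail_by_sum(k, p)
closed_form = (1 - p)**(k - 1)
print(by_sum, closed_form)
```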

Expected Value & Variance of Geometric Distribution


$$
E(X) = \frac{1}{p}
$$
and
$$
Var(X) = \frac{1-p}{p^{2}}
$$

Memoryless property of the geometric distribution


If 𝑿~𝒈𝒆𝒐𝒎𝒆𝒕𝒓𝒊𝒄(𝒑), then:
𝑷(𝑿 = 𝒏 + 𝒌 | 𝑿 > 𝒏) = 𝑷(𝑿 = 𝒌)
We say that the geometric distribution is memoryless
Proof
𝑿~𝒈𝒆𝒐𝒎𝒆𝒕𝒓𝒊𝒄(𝒑) ⇒ 𝑝(𝑛) = 𝑝(1 − 𝑝)𝑛−1
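$$
\begin{aligned}
P(X > n) &= P(X \ge n + 1) \\
&= \sum_{x=n+1}^{\infty} p (1-p)^{x-1} \\
&= p \sum_{x=n+1}^{\infty} (1-p)^{x-1} \\
&= p \left[(1-p)^{n} + (1-p)^{n+1} + \cdots\right] \\
&= p (1-p)^{n} \cdot \frac{1}{1 - (1-p)} \\
&= p (1-p)^{n} \cdot \frac{1}{p} \\
&= (1-p)^{n}
\end{aligned}
$$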

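$$
\begin{aligned}
P(X = n + k \mid X > n) &= \frac{P(X = n + k \text{ AND } X > n)}{P(X > n)} \\
&= \frac{P(X = n + k)}{P(X > n)} && (\text{for } k \ge 1,\ X = n + k \text{ implies } X > n) \\
&= \frac{p (1-p)^{n+k-1}}{(1-p)^{n}} \\
&= \frac{p (1-p)^{n} \cdot (1-p)^{k-1}}{(1-p)^{n}} \\
&= p (1-p)^{k-1} \\
&= P(X = k)
\end{aligned}
$$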

Example
Let’s suppose k = 2 and n = 3. Then: 𝑃(𝑋 = 5 | 𝑋 > 3) = 𝑃(𝑋 = 2)
X > 3 means that the first 3 trials were all failures. So, the required conditional probability is
the probability that the 5th trial is the first success, given that the first 3 trials were all failures.
However, due to the independence of the trials, we can treat the 4th trial as a new
“starting point” (the 3 failures that have already occurred have no effect on future
outcomes). We therefore only have to calculate the probability that the 2nd trial counted
from this new starting point is the first success, which is P(X = 2).
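The worked case k = 2, n = 3 can be verified numerically. The value p = 0.25 and the function names below are hypothetical choices for illustration; the identity holds for any 0 < p < 1.

```python
def geom_pmf(n, p):
    """P(X = n) for X ~ geometric(p)."""
    return (1 - p)**(n - 1) * p

def geom_survival(n, p):
    """P(X > n) = (1 - p)^n, as derived in the proof."""
    return (1 - p)**n

def conditional_pmf(n, k, p):
    """P(X = n + k | X > n) for k >= 1."""
    return geom_pmf(n + k, p) / geom_survival(n, p)

p = 0.25  # hypothetical success probability
lhs = conditional_pmf(3, 2, p)  # P(X = 5 | X > 3)
rhs = geom_pmf(2, p)            # P(X = 2)
print(lhs, rhs)
```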
Negative Binomial Distribution: 𝑿~𝒏𝒆𝒈𝒂𝒕𝒊𝒗𝒆 𝒃𝒊𝒏𝒐𝒎𝒊𝒂𝒍(𝒓, 𝒑)
When can the negative binomial distribution be used?
• We have an experiment with two possible outcomes, called success and failure. This
experiment is repeated under the same conditions.
• The repetitions (trials) are independent.
• P(success) = p remains constant throughout all the repetitions.

If these requirements are met, the random variable X is defined as X = the number of
repetitions required to obtain the rth success. The probability distribution of X is a negative
binomial distribution with parameters r & p. This is denoted by 𝑿~𝒏𝒆𝒈𝒂𝒕𝒊𝒗𝒆 𝒃𝒊𝒏𝒐𝒎(𝒓, 𝒑)
Note:
• The geometric distribution is a special case of negative binomial distribution (i.e. r = 1)
o Geometric distribution describes: number of trials for 1st success
o Negative binomial distribution describes: number of trials for rth (r ≥1) success

The pmf of X is given by:


$$
p(n) = P(X = n) = \binom{n-1}{r-1} (1-p)^{n-r} p^{r}, \quad \text{for } n = r, r+1, r+2, \dots
$$
Derivation of the pmf of a negative binomial distribution:
Note:
• X~negative binom(r, p) ⇒ X = Y1 + Y2 + ⋯ + Yr, where Y1, Y2, …, Yr are
independent and identically distributed random variables, each having a geometric(p)
distribution.

Negative Binomial as a Sum of Geometric

Relationship between Binomial & the Negative Binomial distribution


Binomial distribution | Negative Binomial distribution
A fixed number of repetitions | A fixed number of successes, namely r
The number of successes is random | The number of trials is random
X = the number of successes in n trials | X = the number of trials until the rth success

P(more than n trials required to obtain r successes)
= P(fewer than r successes in n trials)
So if X~negative binom(r, p) and Y~binom(n, p) then:
P(X > n) = P(Y < r)
Link between Binomial & Negative Binomial

$$
\sum_{i=n+1}^{\infty} P(X = i) = \sum_{i=0}^{r-1} P(Y = i) \quad \therefore \quad \sum_{i=n+1}^{\infty} \binom{i-1}{r-1} p^{r} (1-p)^{i-r} = \sum_{i=0}^{r-1} \binom{n}{i} p^{i} (1-p)^{n-i}
$$

Expected Value & Variance of Negative Binomial Distribution


$$
E(X) = \frac{r}{p}
$$
$$
Var(X) = \frac{r(1-p)}{p^{2}}
$$
Textbook Examples
Example 4.8g
Find the expected value and the variance of the number of times one must throw a die until
the outcome 1 has occurred 4 times.
Solution:
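$$
X \sim \text{negative binom}\left(r = 4,\ p = \tfrac{1}{6}\right)
$$
$$
E(X) = \frac{4}{1/6} = \mathbf{24}
$$
$$
Var(X) = \frac{4\left(1 - \frac{1}{6}\right)}{(1/6)^{2}} = \mathbf{120}
$$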
Example 4.8d
If independent trials, each resulting in a success with probability p, are performed, what is
the probability of r successes occurring before m failures?
Solution:
First consider a special case: say with r = 3 and m = 5, i.e. the required probability is the
probability of 3 successes before 5 failures.

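$$
\begin{aligned}
\therefore\ P(3 \text{ successes before } 5 \text{ failures}) &= P(3\text{rd success on 3rd or 4th or } \dots \text{ or 7th trial}) \\
&= P(X \le 7), \quad \text{where } X \sim \text{negative binom}(r = 3, p) \\
&= \sum_{x=3}^{7} \binom{x-1}{2} p^{3} (1-p)^{x-3}
\end{aligned}
$$
$$
\therefore\ P(r \text{ successes before } m \text{ failures}) = \sum_{x=r}^{r+m-1} \binom{x-1}{r-1} p^{r} (1-p)^{x-r}
$$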
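The general formula can be evaluated for concrete values and cross-checked against an equivalent binomial statement: r successes occur before m failures exactly when at least r successes occur in the first r + m − 1 trials. The sketch below assumes illustrative values of r, m and p; the function names are not from the notes.

```python
from math import comb

def r_successes_before_m_failures(r, m, p):
    """Sum of the negative binomial pmf from x = r to x = r + m - 1."""
    return sum(comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)
               for x in range(r, r + m))

def at_least_r_in_fixed_trials(r, m, p):
    """P(at least r successes in r + m - 1 trials), a binomial cross-check."""
    n = r + m - 1
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(r, n + 1))

via_negbinom = r_successes_before_m_failures(3, 5, 0.4)   # illustrative p
via_binomial = at_least_r_in_fixed_trials(3, 5, 0.4)
symmetric = r_successes_before_m_failures(4, 4, 0.5)      # 0.5 by symmetry
print(via_negbinom, via_binomial, symmetric)
```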
Additional Examples
Example 1
A coin has P(heads) = 0.4. The coin is tossed repeatedly under identical circumstances.
(a) Write down the pmf of the number of heads obtained in 20 tosses of the coin.

Solution:
Let: 𝑋 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 ℎ𝑒𝑎𝑑𝑠 𝑖𝑛 20 𝑡𝑜𝑠𝑠𝑒𝑠
Thus: 𝑿~𝒃𝒊𝒏𝒐𝒎(𝒏 = 𝟐𝟎, 𝒑 = 𝟎. 𝟒)
$$
p(x) = \binom{20}{x} (0.4)^{x} (0.6)^{20-x}, \quad \text{for } x = 0, 1, 2, \dots, 20
$$

(b) Calculate the probability of at most 7 heads in 20 tosses.

Solution:
𝑃(𝑎𝑡 𝑚𝑜𝑠𝑡 7 ℎ𝑒𝑎𝑑𝑠 𝑖𝑛 20 𝑡𝑜𝑠𝑠𝑒𝑠) = 𝑃(𝑋 ≤ 7)
= 𝟎. 𝟒𝟏𝟓𝟗

(c) Write down the pmf of the number of tosses required to obtain the first heads.

Solution:
Let 𝑌 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑡𝑜𝑠𝑠𝑒𝑠 𝑢𝑛𝑡𝑖𝑙 1𝑠𝑡 ℎ𝑒𝑎𝑑
Thus: 𝒀~𝒈𝒆𝒐𝒎𝒆𝒕𝒓𝒊𝒄(𝒑 = 𝟎. 𝟒)
𝒑(𝒚) = (𝟎. 𝟒)(𝟎. 𝟔)𝒚−𝟏 , 𝒇𝒐𝒓 𝒚 = 𝟏, 𝟐, 𝟑, …

(d) Calculate the probability that at least 3 tosses are required to obtain the first heads.

Solution:
𝑃(𝑎𝑡 𝑙𝑒𝑎𝑠𝑡 3 𝑡𝑜𝑠𝑠𝑒𝑠 𝑢𝑛𝑡𝑖𝑙 1𝑠𝑡 ℎ𝑒𝑎𝑑𝑠) = 𝑃(𝑌 ≥ 3)
= 1 − 𝑃(𝑌 ≤ 2)
= 1 − 𝑝(1) − 𝑝(2)
= 1 − (0.4)(0.6)1−1 − (0.4)(0.6)2−1
= 𝟎. 𝟑𝟔
(e) Write down the pmf of the number of tosses required to obtain 5 heads.

Solution:
Let 𝑍 = 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑡𝑜𝑠𝑠𝑒𝑠 𝑢𝑛𝑡𝑖𝑙 5𝑡ℎ ℎ𝑒𝑎𝑑𝑠
Thus: 𝒁~𝒏𝒆𝒈𝒂𝒕𝒊𝒗𝒆 𝒃𝒊𝒏𝒐𝒎(𝒓 = 𝟓, 𝒑 = 𝟎. 𝟒)
$$
p(z) = \binom{z-1}{5-1} (0.4)^{5} (0.6)^{z-5}, \quad \text{for } z = 5, 6, 7, \dots
$$

(f) Calculate the probability that at most 10 tosses are required to obtain 5 heads.

Solution 1: Negative binomial distribution (𝑍~𝑛𝑒𝑔𝑎𝑡𝑖𝑣𝑒 𝑏𝑖𝑛𝑜𝑚(𝑟 = 5, 𝑝 = 0.4))


𝑃(𝑎𝑡 𝑚𝑜𝑠𝑡 10 𝑡𝑜𝑠𝑠𝑒𝑠 𝑢𝑛𝑡𝑖𝑙 5𝑡ℎ ℎ𝑒𝑎𝑑𝑠) = 𝑃(𝑍 ≤ 10)
$$
= \sum_{z=5}^{10} \binom{z-1}{5-1} (0.4)^{5} (0.6)^{z-5} = \mathbf{0.3669}
$$
Solution 2: Link between binomial and negative binomial (𝑃(𝑋 > 𝑛) = 𝑃(𝑌 < 𝑟))

𝒁~𝒏𝒆𝒈𝒂𝒕𝒊𝒗𝒆 𝒃𝒊𝒏𝒐𝒎(𝒓 = 𝟓, 𝒑 = 𝟎. 𝟒)
𝑾~𝒃𝒊𝒏𝒐𝒎(𝒏 = 𝟏𝟎, 𝒑 = 𝟎. 𝟒)

𝑃(𝑎𝑡 𝑚𝑜𝑠𝑡 10 𝑡𝑜𝑠𝑠𝑒𝑠 𝑢𝑛𝑡𝑖𝑙 5𝑡ℎ ℎ𝑒𝑎𝑑𝑠) = 𝑃(𝑍 ≤ 10)


= 1 − 𝑃(𝑍 > 10)
= 1 − 𝑃(𝑊 < 5)
= 1 − 𝑃(𝑊 ≤ 4)
= 1 − 0.6331
= 𝟎. 𝟑𝟔𝟔𝟗
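Both solutions to part (f) can be computed side by side, which also illustrates the link P(X > n) = P(Y < r). A minimal sketch with illustrative helper names:

```python
from math import comb

def negbinom_cdf(n, r, p):
    """P(Z <= n) for Z ~ negative binom(r, p): trials until the r-th success."""
    return sum(comb(z - 1, r - 1) * p**r * (1 - p)**(z - r)
               for z in range(r, n + 1))

def binom_cdf(k, n, p):
    """P(W <= k) for W ~ binom(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

direct = negbinom_cdf(10, 5, 0.4)           # Solution 1
via_binomial = 1 - binom_cdf(4, 10, 0.4)    # Solution 2: 1 - P(W <= 4)
print(round(direct, 4), round(via_binomial, 4))
```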
Hypergeometric Distribution: 𝑿~𝒉𝒚𝒑𝒆𝒓𝒈𝒆𝒐𝒎𝒆𝒕𝒓𝒊𝒄(𝑵, 𝒏, 𝒎)
When can the hypergeometric distribution be used?
• Consider a population of N elements consisting of m successes and N – m failures.
• A sample of n elements is selected randomly and without replacement from this
population.

If these requirements are met, the random variable X is defined as X = the number of
successes obtained in the sample. The probability distribution of X is a hypergeometric
distribution with parameters N, n & m. This is denoted by 𝑿~𝒉𝒚𝒑𝒆𝒓𝒈𝒆𝒐𝒎𝒆𝒕𝒓𝒊𝒄(𝑵, 𝒏, 𝒎)
Note:
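• If the sample is selected with replacement, then X~binomial(n, m/N) exactly. If the sample
is selected without replacement but the population is much larger than the sample, the same
binomial distribution is a good approximation. With p = m/N:
$$
P(X = i) = \frac{\binom{m}{i} \binom{N-m}{n-i}}{\binom{N}{n}} \approx \binom{n}{i} p^{i} (1-p)^{n-i}
$$
In general, since sampling is without replacement, the pmf of X is given by:
$$
P(X = i) = \frac{\binom{m}{i} \binom{N-m}{n-i}}{\binom{N}{n}}, \quad \text{for } i = 0, 1, 2, \dots, n
$$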
Derivation of the pmf of a hypergeometric distribution:
Additional Example
Example 2
Suppose that a box contains 10 black balls and 20 red balls. If 5 balls are selected at random
without replacement, what is the probability that 2 black balls will be obtained?
Solution:
Let: 𝑋 = 𝑡ℎ𝑒 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑏𝑙𝑎𝑐𝑘 𝑏𝑎𝑙𝑙𝑠 𝑠𝑒𝑙𝑒𝑐𝑡𝑒𝑑
Thus: 𝑿~𝒉𝒚𝒑𝒆𝒓𝒈𝒆𝒐𝒎𝒆𝒕𝒓𝒊𝒄(𝑵 = 𝟑𝟎, 𝒏 = 𝟓, 𝒎 = 𝟏𝟎)
Therefore:
𝑃(2 𝑏𝑙𝑎𝑐𝑘 𝑏𝑎𝑙𝑙𝑠 𝑎𝑟𝑒 𝑜𝑏𝑡𝑎𝑖𝑛𝑒𝑑) = 𝑃(𝑋 = 2)
$$
= \frac{\binom{10}{2} \binom{30-10}{5-2}}{\binom{30}{5}} = \frac{\binom{10}{2} \binom{20}{3}}{\binom{30}{5}} = \frac{950}{2\,639} \approx \mathbf{0.3600}
$$
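The hypergeometric probability in this example can be computed exactly with rational arithmetic, alongside the with-replacement (binomial) value for comparison. The sketch uses only the standard library; `hypergeom_pmf` is an illustrative name.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(i, N, n, m):
    """P(X = i): i successes in a sample of n drawn without replacement
    from N elements of which m are successes."""
    return Fraction(comb(m, i) * comb(N - m, n - i), comb(N, n))

# 10 black and 20 red balls, 5 drawn, 2 black wanted
p_without_replacement = hypergeom_pmf(2, N=30, n=5, m=10)

# Same question if the balls were drawn with replacement: binomial(5, 1/3)
p_with_replacement = comb(5, 2) * Fraction(1, 3)**2 * Fraction(2, 3)**3

print(p_without_replacement, float(p_without_replacement))
print(p_with_replacement, float(p_with_replacement))
```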
Note:
• If we were selecting balls with replacement in this example, we would have:

$$
X \sim \text{binomial}\left(n = 5,\ p = \tfrac{10}{30} = \tfrac{1}{3}\right)
$$
Textbook Example
Example 4.8i
A purchaser of electrical components buys them in lots of size 10. It is his policy to inspect 3
components randomly from a lot and to accept the lot only if all 3 are non-defective. If 30
percent of the lots have 4 defective components and 70 percent have only 1, what
proportion of lots does the purchaser reject?
Solution:
Let: 𝐴 = 𝑡ℎ𝑒 𝑒𝑣𝑒𝑛𝑡 𝑡ℎ𝑎𝑡 𝑡ℎ𝑒 𝑝𝑢𝑟𝑐ℎ𝑎𝑠𝑒𝑟 𝑎𝑐𝑐𝑒𝑝𝑡𝑠 𝑎 𝑙𝑜𝑡
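$$
\begin{aligned}
P(A) &= P(A \mid \text{lot has 4 defective}) \left(\frac{3}{10}\right) + P(A \mid \text{lot has 1 defective}) \left(\frac{7}{10}\right) \\
&= \frac{\binom{4}{0} \binom{6}{3}}{\binom{10}{3}} \left(\frac{3}{10}\right) + \frac{\binom{1}{0} \binom{9}{3}}{\binom{10}{3}} \left(\frac{7}{10}\right) = \frac{54}{100}
\end{aligned}
$$
Hence, 46% of the lots are rejected.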
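The conditioning argument in Example 4.8i translates directly into code. This is a small sketch with illustrative names; it uses exact fractions so that the result can be compared with 54/100 without rounding.

```python
from fractions import Fraction
from math import comb

def accept_prob(defectives, lot_size=10, sample=3):
    """P(all sampled components are non-defective | number of defectives in the lot)."""
    return Fraction(comb(lot_size - defectives, sample), comb(lot_size, sample))

# 30% of lots have 4 defectives, 70% have 1 defective
p_accept = Fraction(3, 10) * accept_prob(4) + Fraction(7, 10) * accept_prob(1)
p_reject = 1 - p_accept
print(p_accept, p_reject)
```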
Expected Value & Variance of Hypergeometric Distribution
$$
E(X) = \frac{nm}{N}
$$
and
$$
Var(X) = \frac{nm}{N} \left[\frac{(n-1)(m-1)}{N-1} + 1 - \frac{nm}{N}\right]
$$
Letting $p = \dfrac{m}{N}$:
$$
Var(X) = np(1-p) \left[1 - \frac{n-1}{N-1}\right]
$$
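The two variance expressions can be checked against each other, and against a direct summation over the pmf, using exact fractions. The parameters N = 30, n = 5, m = 10 are borrowed from the ball example purely as an illustration; the function names are assumptions.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(i, N, n, m):
    """Hypergeometric pmf as an exact fraction."""
    return Fraction(comb(m, i) * comb(N - m, n - i), comb(N, n))

def mean_var_by_summation(N, n, m):
    """E(X) and Var(X) computed directly from the pmf."""
    mean = sum(i * hypergeom_pmf(i, N, n, m) for i in range(n + 1))
    second = sum(i * i * hypergeom_pmf(i, N, n, m) for i in range(n + 1))
    return mean, second - mean**2

N, n, m = 30, 5, 10  # illustrative parameters
mean, var = mean_var_by_summation(N, n, m)

p = Fraction(m, N)
var_form1 = Fraction(n * m, N) * (Fraction((n - 1) * (m - 1), N - 1) + 1 - Fraction(n * m, N))
var_form2 = n * p * (1 - p) * (1 - Fraction(n - 1, N - 1))
print(mean, var, var_form1, var_form2)
```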
Note:
• For X~binomial(n, p): $E[(1-p)^{X}] = (1 - p^{2})^{n}$
• For X~binomial(n, p): $E\left[\dfrac{1}{X+1}\right] = \dfrac{1 - (1-p)^{n+1}}{(n+1)p}$
• For X~Po(λ): $E(X^{n}) = \lambda E[(X+1)^{n-1}]$

Additional Examples
Suppose the number of calls handled by a switchboard is Poisson distributed with a mean of
2 calls per minute.
Let Xt = number of calls handled in t minutes, t > 0.
Then 𝑋𝑡 ~𝑃𝑜𝑖𝑠𝑠𝑜𝑛(2𝑡)
(a) Write down the probability mass function of the number of calls that are handled in a
5-minute period.

Solution:
𝑋5 ~𝑃𝑜𝑖𝑠𝑠𝑜𝑛(𝜆 = 2(5) = 10)

$$
p(x) = P(X_{5} = x) = e^{-10} \cdot \frac{10^{x}}{x!}, \quad \text{for } x = 0, 1, 2, \dots
$$

(b) Calculate the probability that 15 calls have to be handled in a period of 6 minutes.

Solution:
𝑋6 ~𝑃𝑜𝑖𝑠𝑠𝑜𝑛(𝜆 = 2(6) = 12)

$$
p(15) = P(X_{6} = 15) = e^{-12} \cdot \frac{12^{15}}{15!} = \mathbf{0.0724}
$$
(c) Calculate the probability that at least 12 calls have to be handled in a period of 5
minutes.

Solution:
𝑋5 ~𝑃𝑜𝑖𝑠𝑠𝑜𝑛(𝜆 = 2(5) = 10)
𝑷(𝑿𝟓 ≥ 𝟏𝟐) = 𝟏 − 𝑷(𝑿𝟓 ≤ 𝟏𝟏)
= 𝟏 − 𝟎. 𝟔𝟗𝟔𝟖
= 𝟎. 𝟑𝟎𝟑𝟐

(d) Calculate the probability that at most 7 calls have to be handled in at least 3 of 5
consecutive 3-minute periods.

Solution:
𝑋3 ~𝑃𝑜𝑖𝑠𝑠𝑜𝑛(𝜆 = 2(3) = 6)
𝑷(𝑿𝟑 ≤ 𝟕) = 𝟎. 𝟕𝟒𝟒𝟎

Let: 𝑌 = # 𝑜𝑓 3 − 𝑚𝑖𝑛𝑢𝑡𝑒 𝑖𝑛𝑡𝑒𝑟𝑣𝑎𝑙𝑠 𝑑𝑢𝑟𝑖𝑛𝑔 𝑤ℎ𝑖𝑐ℎ 𝑎𝑡 𝑚𝑜𝑠𝑡 7 𝑐𝑎𝑙𝑙𝑠 𝑤𝑒𝑟𝑒 ℎ𝑎𝑛𝑑𝑙𝑒𝑑


Then: 𝑌~𝐵𝑖𝑛𝑜𝑚(𝑛 = 5, 𝑝 = 0.744)
𝑃(𝑌 ≥ 3) = 𝑃(𝑌 = 3) + 𝑃(𝑌 = 4) + 𝑃(𝑌 = 5)
$$
= \sum_{y=3}^{5} \binom{5}{y} (0.744)^{y} (0.256)^{5-y} = \mathbf{0.8901}
$$
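The switchboard calculations scale the Poisson parameter with the length of the period and then, in part (d), feed a Poisson probability into a binomial. A sketch of that chain follows; the helper names are illustrative, and part (d) uses the unrounded Poisson probability, so its last digit can differ slightly from the value obtained with 0.744.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(x, lam) for x in range(k + 1))

rate = 2  # calls per minute, so X_t ~ Poisson(rate * t)

p_b = poisson_pmf(15, rate * 6)        # (b) exactly 15 calls in 6 minutes
p_c = 1 - poisson_cdf(11, rate * 5)    # (c) at least 12 calls in 5 minutes
p_quiet = poisson_cdf(7, rate * 3)     # at most 7 calls in one 3-minute period
# (d) at least 3 of 5 periods with at most 7 calls: binomial(5, p_quiet)
p_d = sum(comb(5, y) * p_quiet**y * (1 - p_quiet)**(5 - y) for y in range(3, 6))

print(round(p_b, 4), round(p_c, 4), round(p_quiet, 4))
```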
Textbook Exercises
Question 74: Geometric distribution
Consider a roulette wheel consisting of 38 numbers 1 through 36, 0, and double 0. If Smith
always bets that the outcome will be one of the numbers 1 through 12, what is the probability
that
(a) Smith will lose his first 5 bets

Solution:
Let: $p = P(\text{he wins a single bet}) = \dfrac{12}{38}$
$$
P(\text{loses first 5 bets}) = (1-p)^{5} = \left(\frac{26}{38}\right)^{5} \approx \mathbf{0.1500}
$$
(b) his first win will occur on his fourth bet?

Solution:
Let 𝑋 = #𝑔𝑎𝑚𝑒𝑠 𝑢𝑛𝑡𝑖𝑙 1𝑠𝑡 𝑤𝑖𝑛
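Thus: $X \sim \text{geometric}\left(p = \dfrac{12}{38}\right)$
$$
p(n) = P(X = n) = (1-p)^{n-1} p, \quad \text{for } n = 1, 2, 3, \dots
$$
$$
p(4) = P(X = 4) = \left(1 - \frac{12}{38}\right)^{4-1} \left(\frac{12}{38}\right) = \mathbf{0.10115}
$$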
Question 77: Negative Binomial Distribution
An interviewer is given a list of people she can interview. If the interviewer needs to interview
5 people, and if each person (independently) agrees to be interviewed with probability 2/3,
what is the probability that her list of people will enable her to obtain her necessary number
of interviews if the list consists of (a) 5 people and (b) 8 people? For part (b), what is the
probability that the interviewer will speak to exactly (c) 6 people and (d) 7 people on the
list?
Solution:
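Let: X = number of people she has to approach until she obtains the 5th interview
Then: $X \sim \text{negative binom}\left(r = 5,\ p = \dfrac{2}{3}\right)$
$$
p(n) = P(X = n) = \binom{n-1}{5-1} \left(1 - \frac{2}{3}\right)^{n-5} \left(\frac{2}{3}\right)^{5} = \binom{n-1}{4} \left(\frac{1}{3}\right)^{n-5} \left(\frac{2}{3}\right)^{5}, \quad \text{for } n = 5, 6, 7, \dots
$$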
a) 5 people

Solution:
$$
P(X = 5) = \binom{5-1}{4} \left(\frac{1}{3}\right)^{5-5} \left(\frac{2}{3}\right)^{5} = \left(\frac{2}{3}\right)^{5} = \mathbf{0.1317}
$$
b) 8 people

Solution:
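$$
\begin{aligned}
P(X \le 8) &= \sum_{x=5}^{8} \binom{x-1}{4} \left(\frac{1}{3}\right)^{x-5} \left(\frac{2}{3}\right)^{5} \\
&= \left(\frac{2}{3}\right)^{5} \left[1 + 5\left(\frac{1}{3}\right)^{1} + 15\left(\frac{1}{3}\right)^{2} + 35\left(\frac{1}{3}\right)^{3}\right] \\
&= \mathbf{0.7414}
\end{aligned}
$$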
c) Exactly 6 people

Solution:
$$
P(X = 6) = \binom{6-1}{4} \left(\frac{1}{3}\right)^{6-5} \left(\frac{2}{3}\right)^{5} = 5 \left(\frac{1}{3}\right)^{1} \left(\frac{2}{3}\right)^{5} = \mathbf{0.2195}
$$
d) Exactly 7 people

Solution:
$$
P(X = 7) = \binom{7-1}{4} \left(\frac{1}{3}\right)^{7-5} \left(\frac{2}{3}\right)^{5} = 15 \left(\frac{1}{3}\right)^{2} \left(\frac{2}{3}\right)^{5} = \mathbf{0.2195}
$$
Question 78: Negative Binomial Distribution
During assembly, a product is equipped with 5 control switches, each of which has probability
0.04 of being defective. What is the probability that 2 defective switches are encountered
before 5 non-defective ones?
Solution:
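Here a "success" is a non-defective switch, so p = 0.96 and r = 5.
Let: Y = number of trials needed to obtain r successes. Thus: Y~negative binomial(r = 5, p = 0.96)
Let: X = number of failures before the rth success. Thus: X = Y − r
$$
\therefore\ p(x) = \binom{x+r-1}{r-1} p^{r} (1-p)^{x}, \quad \text{for } x = 0, 1, 2, \dots
$$
$$
\therefore\ p(2) = \binom{2+5-1}{5-1} (0.96)^{5} (0.04)^{2} = \binom{6}{4} (0.96)^{5} (0.04)^{2} = \mathbf{0.01957}
$$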
Question 82: Hypergeometric Distribution
Suppose that a class of 50 students has appeared for a test. 41 students have passed this test
while the remaining 9 students have failed. Find the probability that in a group of 10 students
selected at random
(a) none have failed the test

Solution:
Let: 𝑋 = #𝑠𝑡𝑢𝑑𝑒𝑛𝑡𝑠 𝑖𝑛 𝑠𝑎𝑚𝑝𝑙𝑒 𝑤ℎ𝑜 𝑓𝑎𝑖𝑙𝑒𝑑
Thus: 𝑋~ℎ𝑦𝑝𝑒𝑟𝑔𝑒𝑜𝑚𝑒𝑡𝑟𝑖𝑐(𝑁 = 50, 𝑚 = 9, 𝑛 = 10)
$$
P(X = 0) = \frac{\binom{9}{0} \binom{41}{10}}{\binom{50}{10}} = \mathbf{0.10914}
$$
(b) at least 3 students have failed the test.

Solution:
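$$
\begin{aligned}
P(X \ge 3) &= 1 - P(X \le 2) \\
&= 1 - p(0) - p(1) - p(2) \\
&= 1 - \frac{\binom{9}{0} \binom{41}{10} + \binom{9}{1} \binom{41}{9} + \binom{9}{2} \binom{41}{8}}{\binom{50}{10}} \\
&= \mathbf{0.24905}
\end{aligned}
$$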
Question 85: Hypergeometric or Binomial
An automotive manufacturing company produces brake pads in lots of 100. This company
inspects 15 brake pads from each lot and accepts the whole lot only if all 15 brake pads pass
the inspection test. Each brake pad is, independently of the others, faulty with probability
0.09. What proportion of the lots does the company reject?
Solution 1: Hypergeometric & Binomial
Let: 𝑋 = #𝑑𝑒𝑓𝑒𝑐𝑡𝑖𝑣𝑒 𝑖𝑡𝑒𝑚𝑠 𝑖𝑛 𝑎 𝑙𝑜𝑡 𝑜𝑓 100
Thus: 𝑋~𝑏𝑖𝑛𝑜𝑚(𝑛 = 100, 𝑝 = 0.09)
Let: 𝑌 = #𝑑𝑒𝑓𝑒𝑐𝑡𝑖𝑣𝑒 𝑖𝑡𝑒𝑚𝑠 𝑖𝑛 𝑎 𝑠𝑎𝑚𝑝𝑙𝑒 𝑜𝑓 15 𝑖𝑛𝑠𝑝𝑒𝑐𝑡𝑒𝑑
Thus: (𝑌|𝑋 = 𝑘)~ℎ𝑦𝑝𝑒𝑟𝑔𝑒𝑜𝑚𝑒𝑡𝑟𝑖𝑐(𝑁 = 100, 𝑛 = 15, 𝑚 = 𝑘)
𝑃(𝑎 𝑙𝑜𝑡 𝑖𝑠 𝑟𝑒𝑗𝑒𝑐𝑡𝑒𝑑) = 1 − 𝑃(𝑎 𝑙𝑜𝑡 𝑖𝑠 𝑎𝑐𝑐𝑒𝑝𝑡𝑒𝑑)
= 1 − 𝑃(𝑎𝑙𝑙 𝑖𝑛𝑠𝑝𝑒𝑐𝑡𝑒𝑑 𝑖𝑡𝑒𝑚𝑠 𝑎𝑟𝑒 𝑛𝑜𝑡 𝑑𝑒𝑓𝑒𝑐𝑡𝑖𝑣𝑒)
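$$
\begin{aligned}
&= 1 - \sum_{k=0}^{85} P(X = k \text{ AND } Y = 0) \\
&= 1 - \sum_{k=0}^{85} P(X = k) \cdot P(Y = 0 \mid X = k) \\
&= 1 - \sum_{k=0}^{85} \binom{100}{k} (0.09)^{k} (0.91)^{100-k} \times \frac{\binom{k}{0} \binom{100-k}{15}}{\binom{100}{15}} \\
&= 1 - (0.91)^{100} \sum_{k=0}^{85} \binom{85}{k} \left(\frac{9}{91}\right)^{k} \\
&= 1 - (0.91)^{100} \left(\frac{9}{91} + 1\right)^{85} && (\text{binomial theorem}) \\
&= \mathbf{0.75699}
\end{aligned}
$$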
Solution 2: Binomial
𝑃(𝑎 𝑙𝑜𝑡 𝑖𝑠 𝑟𝑒𝑗𝑒𝑐𝑡𝑒𝑑) = 1 − 𝑃(𝑎 𝑙𝑜𝑡 𝑖𝑠 𝑎𝑐𝑐𝑒𝑝𝑡𝑒𝑑)
$$
= 1 - \binom{15}{15} (1 - 0.09)^{15} (0.09)^{0} = \mathbf{0.75699}
$$
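That the two solutions agree can be confirmed numerically. The sketch below evaluates the conditional (binomial then hypergeometric) sum of Solution 1 and the direct binomial expression of Solution 2; the variable and function names are illustrative.

```python
from math import comb

n_lot, n_sample, p_faulty = 100, 15, 0.09

def binom_pmf(k, n, p):
    """P(X = k) for X ~ binom(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def p_clean_sample(k):
    """P(Y = 0 | X = k): no faulty pad among the 15 inspected,
    given k faulty pads in the lot (hypergeometric)."""
    return comb(n_lot - k, n_sample) / comb(n_lot, n_sample)

# Solution 1: condition on the number of faulty pads in the lot (k = 0, ..., 85)
accept_1 = sum(binom_pmf(k, n_lot, p_faulty) * p_clean_sample(k)
               for k in range(n_lot - n_sample + 1))
reject_1 = 1 - accept_1

# Solution 2: the 15 inspected pads are themselves independent trials
reject_2 = 1 - (1 - p_faulty)**n_sample

print(round(reject_1, 5), round(reject_2, 5))
```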
