Conditional
Conditional Probability
Multiplication rule
Theorem of total probability
Rejection sampling
Bayes' theorem
Use of Bayes' theorem
Urn Models
What are they?
Why care?
Fallacies regarding conditional probability
Mistaking P(A|B) for P(B|A)
Simpson's Paradox
Monty Hall problem
Problems for practice
Conditional Probability
The probability that a coin toss results in a head is a statement more
about our ignorance regarding the outcome than about an absolute property
of the coin. If our ignorance level changes (e.g., if we get some new
information), the probability may change. We deal with this
mathematically using the concept of conditional probability.
[Diagram omitted: a shape is drawn at random from a collection of coloured shapes; among the green shapes, the proportion of triangles is 7/12.]
We cannot use the same notation P(triangle) for this new quantity. We
need a new notation that reflects our extra information. The new
notation is P(triangle|green). We call it the conditional probability of
the selected shape being a triangle given that it is green. ■
In general, the notation is P(A|B), where A, B are any two events. The
mathematical definition is just as it should be. Instead of the entire
sample space Ω, you now narrow your focus down to only B. So A is now
narrowed down to A ∩ B. Thus P(A|B) measures P(A ∩ B)
relative to P(B). Hence the definition is:
Definition: Conditional probability
If A, B are any two events with P(B) > 0, then
P(A|B) = P(A ∩ B)/P(B).
Theorem
Consider a probability P on some sample space. Fix any event B
with P(B) > 0. For every event A define P′(A) = P(A|B). Then P′ is also a probability.
Proof: The first two axioms obviously hold! For the third axiom, let A1, A2, … be
pairwise disjoint events. Then A1 ∩ B, A2 ∩ B, … are pairwise disjoint as well, so
P′(A1 ∪ A2 ∪ ⋯) = P((A1 ∪ A2 ∪ ⋯) ∩ B)/P(B) = ∑ᵢ P(Ai ∩ B)/P(B) = ∑ᵢ P′(Ai). [QED]
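To see the definition in action, here is a small R simulation (the two-dice example is our own, not from the text): take A = {first die shows 6} and B = {the sum of the two dice is 9}, so that the exact value of P(A|B) is 1/4.

```r
# Estimate P(A|B) for two fair dice, where A = {first die shows 6}
# and B = {the sum is 9}. The exact value is 1/4.
set.seed(1)
n = 100000
d1 = sample(6, n, replace=TRUE)   # first die
d2 = sample(6, n, replace=TRUE)   # second die
B = (d1 + d2 == 9)
mean(d1[B] == 6)                  # relative frequency of A among the B-cases
mean(d1 == 6 & B) / mean(B)       # the same estimate via P(A n B)/P(B)
```

Both lines compute the same estimate, and both settle near 1/4 as n grows.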
Multiplication rule
The definition of conditional probability immediately gives P(A ∩ B) = P(B)P(A|B), and, interchanging the roles of A and B, P(A ∩ B) = P(A)P(B|A).
This form is particularly useful when A, B are events such that A indeed
occurs before B in the real world. Here is an example.
EXAMPLE 2: A box contains 5 red and 3 green balls. One ball is drawn
at random, its colour is noted, and it is replaced. Then one more ball
of the same colour is added. Then a second ball is drawn. What is the
probability that both the balls are green?
Let A be the event that the first ball drawn is green, and B the event
that the second ball drawn is green. What is the probability that the
first ball is green? The answer is P(A) = 3/8. Before drawing the second
ball, the composition of the box has changed depending on the outcome of
the first stage. This is where conditional probability helps. Given that
the first ball was green, we know the composition of the box before the
second drawing: 5 red and 3 + 1 = 4 green. So P(B|A) = 4/9. Hence, by
the multiplication rule,
P(A ∩ B) = P(A)P(B|A) = (3/8) × (4/9) = 1/6.
balls = c('r','r','r','r','r','g','g','g')
event = logical(5000)
for(i in 1:5000) {
  first.draw = sample(balls,1)                  # draw one ball at random
  newballs = c(balls,first.draw)                # replace it and add one more of its colour
  second.draw = sample(newballs,1)
  event[i] = (first.draw=='g' && second.draw=='g')
}
mean(event)                                     # should be close to 1/6
■
In a similar way, you can prove (by induction) the following theorem.
Multiplication rule
Let A1, …, An be events such that P(A1 ∩ ⋯ ∩ An) > 0. Then
P(A1 ∩ ⋯ ∩ An) = P(A1)P(A2|A1)P(A3|A1 ∩ A2) ⋯ P(An|A1 ∩ ⋯ ∩ An−1).
Theorem of total probability
Let A1, …, An be mutually exclusive and exhaustive events with P(Ai) > 0
for each i, and let B be any event. Then
P(B) = ∑ᵢ₌₁ⁿ P(Ai)P(B|Ai).
Proof: (A tree diagram, not reproduced here, illustrates the situation.)
Since A1 ∪ ⋯ ∪ An = Ω,
hence B = B ∩ Ω = (B ∩ A1) ∪ ⋯ ∪ (B ∩ An).
Also, since the Ai's are disjoint, hence the B ∩ Ai's are disjoint as well.
So P(B) = ∑ᵢ₌₁ⁿ P(B ∩ Ai) = ∑ᵢ₌₁ⁿ P(Ai)P(B|Ai), as required. [QED]
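As a numeric check of the theorem, consider again the box of EXAMPLE 2 (5 red, 3 green, with the drawn ball replaced together with one more of its colour). Conditioning on the colour of the first ball gives the probability that the second ball is green:

```r
# Theorem of total probability on EXAMPLE 2's box:
# P(second green) = P(G1) P(G2|G1) + P(R1) P(G2|R1)
p = (3/8) * (4/9) + (5/8) * (3/9)
p   # 0.375, i.e. 3/8 -- the same as the probability that the first ball is green
```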
Rejection sampling
Suppose that ∅ ≠ A ⊆ B are finite sets. You have a list of all elements
of B. But you do not have a list of all elements of A. However, given any
element of B you can check if it is in A or not. In this case how can you
draw one element randomly from A?
The answer is rejection sampling: keep drawing elements of B at random until you get one that lies in A, and output that draw. The accepted draw is uniformly distributed over A. For example, here is how to draw one element at random from A = {1, …, 5} using uniform draws from B = {1, …, 6}:
repeat {
  x = sample(6, 1)       # uniform draw from B = {1,...,6}
  if (x <= 5) break      # accept only if x lies in A = {1,...,5}
}
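To convince yourself that the accepted draws are indeed uniform over A, wrap the loop in a function and tabulate many draws (this check is our addition):

```r
# Rejection sampling from A = {1,...,5} via uniform draws from B = {1,...,6};
# the accepted values should be (roughly) equally frequent.
draw.from.A = function() {
  repeat {
    x = sample(6, 1)       # uniform draw from B
    if (x <= 5) return(x)  # accept only if the draw lies in A
  }
}
set.seed(3)
table(replicate(10000, draw.from.A()))  # five counts, each near 2000
```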
Bayes' theorem
Multi-stage random experiments are all around us. Many processes in
nature occur step by step, and each step involves some randomness.
Often the last layer of randomness is due to measurement error.
Bayes' theorem is a way to "remove" this last layer to look deeper.
The theorem of total probability lets us move forward along the arrows,
while Bayes' theorem lets us move backwards.
Bayes' theorem (version 1)
Let A, B be events with 0 < P(A) < 1 and P(B) > 0. Then
P(A|B) = P(A)P(B|A) / [P(A)P(B|A) + P(Aᶜ)P(B|Aᶜ)].
Proof: First think of the formula in terms of a tree diagram (not
reproduced here): the denominator is the probability of reaching B from
Start, while the numerator is the probability of only the red path, the
one through A. Formally,
P(A|B) = P(A ∩ B)/P(B) = P(A)P(B|A) / [P(A)P(B|A) + P(Aᶜ)P(B|Aᶜ)],
where the denominator comes from the theorem of total probability, as required. [QED]
Bayes' theorem (version 2)
Let A1, …, An be mutually exclusive and exhaustive events, and let B be
an event with P(B) > 0. Then for any k = 1, …, n,
P(Ak|B) = P(Ak)P(B|Ak) / ∑ᵢ₌₁ⁿ P(Ai)P(B|Ai).
EXERCISE 4: Look at the following diagram and write down the proof.
P(B) = P(A) ⋅ P(B|A) + P(Aᶜ) ⋅ P(B|Aᶜ) = 0.104.
Diagrammatically, you can think like this. To find P(B), we consider all
paths from Start to B. Multiply the probabilities along each path and add.
Thus P(B) = 0.1 × 0.95 + 0.9 × 0.01 = 0.104. Similarly, to find P(A ∩ B),
add the probabilities of all the paths from Start to B through A:
P(A ∩ B) = 0.1 × 0.95 = 0.095. Hence
P(A|B) = P(A ∩ B)/P(B) = 0.095/0.104 ≈ 0.913. ■
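The arithmetic above is easy to reproduce in R (the variable names below are ours):

```r
# Bayes' theorem with the numbers used above:
# P(A) = 0.1, P(B|A) = 0.95, P(B|A^c) = 0.01
pA = 0.1; pBgA = 0.95; pBgAc = 0.01
pB = pA * pBgA + (1 - pA) * pBgAc  # theorem of total probability: 0.104
pAgB = pA * pBgA / pB              # Bayes' theorem: about 0.913
c(pB, pAgB)
```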
Urn Models
What are they?
An urn model is a multistage random experiment. It consists of one or
more boxes (called urns), each containing coloured balls (balls are all
distinct, even balls having the same colour). Balls are drawn at random
(using SRSWR or SRSWOR, i.e., simple random sampling with or without
replacement) and depending on the outcome, some balls
are added/removed/transferred. Then again a few balls are drawn, and
so on. Here is one example.
EXAMPLE 4: An urn contains 3 red and 3 green balls. One ball is drawn
at random, its colour noted, and returned to the urn. Then another ball
of the same colour is added to the urn. Then the same process is
repeated again and again. The possibilities grow like a branching tree
(diagram not reproduced here). Some natural questions:
1. What is the probability that at the 10-th stage we shall have 12 red
and 4 green balls?
2. What is the probability that the ball drawn at stage n is red?
3. Given that we have exactly 6 red balls at the 9-th stage, what is the
(conditional) probability that we had exactly 4 red balls at the 6-th
stage?
The above urn model is an example of the Polya Urn Model, where in
general we start with a red and b green balls, and at each stage a random
ball is selected, replaced, and c more ball(s) of its colour added.
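Such urn models are easy to simulate. Here is a sketch for the Polya urn above (a = b = 3, c = 1; the function and its name are ours), which you can use to explore question 2: the probability of drawing red at stage n stays at 1/2 for every n.

```r
# Polya urn: start with a red and b green balls; at each stage draw a ball
# at random, replace it, and add c more of the same colour.
# Returns the colour drawn at stage n.
polya.draw = function(n, a = 3, b = 3, c = 1) {
  urn = c(rep('r', a), rep('g', b))
  for (i in 1:n) {
    ball = sample(urn, 1)
    urn = c(urn, rep(ball, c))
  }
  ball  # the colour drawn at the last (n-th) stage
}
set.seed(4)
mean(replicate(5000, polya.draw(5)) == 'r')  # near 1/2, whatever n you try
```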
Why care?
Some real-life scenarios can be mathematically treated as urn models.
We shall discuss two such examples.
An American lady comes to India. She has heard about the unhygienic
conditions prevailing here, and is apprehensive about flu. Well, as luck
would have it, on her way from the airport she meets a man suffering
from flu. "Oh my," she shudders, "so the rumour about flu is not
unfounded, it seems!". The very next day her city tour is cancelled,
because the guide is down with flu. "What a terrible country this is!", the
lady starts to worry, "It is full of flu!" So imagine her panic when on the
third day she learns that a waiter in the hotel has caught the disease.
Polya's urn model captures this idea. A red ball means fear of flu, a
green ball means the opposite. Initially they were equal in number. The
lady met a flu case on day 1 (i.e., randomly selected a red ball), and her
fear deepened (one more red ball added). The man did not meet any flu
case on day 1 (green ball selected), so his courage increased (one more
green ball added). Yet, what is the chance of selecting a red ball at stage
1? It is still the same as at stage 0, namely 1/2 (i.e., the true prevalence
rate of flu has not changed).
However, one must understand that the real situation is far too complex
to be captured adequately by Polya's urn model. ■
Fallacies regarding conditional probability
Mistaking P(A|B) for P(B|A)
A typical example: parents survey the students admitted to some
prestigious institute, find that most of them had attended a coaching
class, and conclude that attending the coaching class makes admission
likely. Are they justified? Let A = {admitted} and B = {attended coaching}.
No: what the parents learned from their survey was that P(B|A) is
large. This does not imply in any way that P(A|B) is large. They should
have surveyed the coaching-goers and figured out the proportion that got
admitted. This proportion could have been (and most often is)
microscopically low.
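Toy numbers (entirely hypothetical, chosen only to make the point) show how different the two conditional probabilities can be. With A = {admitted} and B = {attended coaching}:

```r
# Hypothetical counts: 100000 students, of whom 10000 attend coaching;
# 100 get admitted, and 90 of the admitted attended coaching.
n.AB = 90     # admitted AND attended coaching
n.A  = 100    # admitted
n.B  = 10000  # attended coaching
n.AB / n.A    # P(B|A) = 0.9  : most admitted students attended coaching
n.AB / n.B    # P(A|B) = 0.009: yet coaching-goers rarely get admitted
```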
Simpson's Paradox
Suppose that A1, A2 and B are three events such that
P(A1|B) < P(A2|B) and also P(A1|Bᶜ) < P(A2|Bᶜ).
Can you conclude from this that P(A1) < P(A2)? (Think before reading on.)
A famous data set on murder cases records three things for each case:
the race of the victim (i.e., the person murdered): white or black
the race of the defendant (i.e., the person accused): white or black
whether the death penalty was given: yes or no.
In the original coloured table, the red and green parts give the actual
data, and the remaining numbers are derived from them. In plain text:
Victim white (red part):
  white defendant: 53 death penalty, 414 no (11.3%)
  black defendant: 11 death penalty, 37 no (22.9%)
Victim black (green part):
  white defendant: 0 death penalty, 16 no (0.0%)
  black defendant: 4 death penalty, 139 no (2.8%)
Combined (blue part):
  white defendant: 53 death penalty, 430 no (11.0%)
  black defendant: 15 death penalty, 176 no (7.9%)
For example, the 11.3 is obtained as 53/(53 + 414). The blue part is
obtained by adding the red and green parts. For example, 414 + 16 = 430.
Now consider the cases where the victim is white (the red part in the
table). Notice that for white defendants 11.3% got a death penalty, while
for black defendants the percentage is 22.9%. Thus if A1 denotes the
event that a white defendant gets the death penalty, A2 the same event
for a black defendant, and B the event that the victim is white, then
P(A1|B) < P(A2|B).
Again, focusing on the green part we get a similar observation (0.0 <
2.8). So we infer P(A1|Bᶜ) < P(A2|Bᶜ). Either way,
the victim's race does not matter: a white defendant is always less likely
to get a death penalty.
So let's ignore the victim's race. This basically means adding the red and
green tables to get the blue table. A similar argument based on this
combined table, however, seems to indicate P(A1) > P(A2), since
11.0 > 7.9.
What went wrong? This is called Simpson's paradox and often crops up
in practice.
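Here is the reversal in R. The counts 53 and 414 appear in the text above; the remaining counts (11, 37, 0, 16, 4, 139) are taken from the published death-penalty study this example is based on, so treat them as our reconstruction of the table.

```r
# Death penalty (yes/no) counts, by defendant's and victim's race:
# ww = white defendant & white victim, bw = black defendant & white victim, etc.
yes = c(ww = 53,  bw = 11, wb = 0,  bb = 4)
no  = c(ww = 414, bw = 37, wb = 16, bb = 139)
round(100 * yes / (yes + no), 1)  # 11.3 22.9 0.0 2.8, as in the text
# Now ignore the victim's race (add the two subtables):
100 * (yes["ww"] + yes["wb"]) / sum(yes[c("ww","wb")], no[c("ww","wb")])  # white defendants: ~11.0
100 * (yes["bw"] + yes["bb"]) / sum(yes[c("bw","bb")], no[c("bw","bb")])  # black defendants: ~7.9
```

Within each victim's race, white defendants fare better; combined over the victim's race, the inequality reverses.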
Monty Hall problem
The host of a TV game show shows you three closed doors. You know
that a random one of these hides a car (considered a prize), the
remaining two doors hide goats (considered valueless). You are to
guess which door has the car. If you guess correctly, then you get
the car. Once you choose a door, the host opens some other door
and shows that there is a goat behind it. Now you are given an
option to switch to the other closed door. Should you switch?
Remember that the host knows the contents behind each door and
will always show you a door with a goat.
Here are two ways to think about this, both natural but leading to
opposite conclusions:
Argument 1: Your initial guess is correct with probability 1/3. Switching
wins exactly when your initial guess was wrong, which has probability
2/3. So you should switch.
Argument 2: Here the sample space is {1, 2, 3}, the numbers denoting the
possible positions of the car. The unconditional probabilities were 1/3
each. Once the host has opened a door, only two positions remain
possible, so each now has conditional probability 1/2. So
nothing is to be gained by switching.
How to resolve the paradox? You might like to simulate the situation
using R. Allegedly, the famous mathematician Paul Erdős was not
convinced about the correct answer until he was shown a computer
simulation!
car = sample(3, 1000, replace=TRUE)  # position of the car in each of 1000 games
host = c(3, 2, 3)                    # door kept closed by the host, indexed by car's position
other = host[car]                    # the closed door you may switch to
sum(car == 1)                        # number of wins if you stay with door 1
sum(car == other)                    # number of wins if you switch
Here we play the game 1000 times. Each time we freshly randomize the
position of the car. This is done in the first line of the code. We need
a strategy for the host. Remember that the door you selected first is
called door 1. So the host's strategy is like a function that maps the
car's true position to the door to be kept closed. If the car is not
behind door 1, then the host has only one choice. If the car is behind
door 1, then the host can open either 2 or 3. Here, w.l.o.g., we are
keeping 3 closed. So the function is host[1] = 3, host[2] = 2 and
host[3] = 3. In other words, the strategy is stored in the vector host.
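A slightly more general simulation, in which the host chooses at random between doors 2 and 3 whenever the car is behind door 1, reaches the same conclusion (this version is our sketch, not from the page above):

```r
# Monty Hall with a randomizing host; you always pick door 1 first.
set.seed(5)
n = 10000
car = sample(3, n, replace = TRUE)
# The host opens a goat door that is neither door 1 nor the car's door:
# if the car is behind door 1 he picks door 2 or 3 at random; otherwise
# he must open the one remaining goat door, which is door 5 - car.
opened = ifelse(car == 1, sample(2:3, n, replace = TRUE), 5 - car)
switch.to = 5 - opened        # the other closed door
mean(car == 1)                # staying wins: about 1/3
mean(car == switch.to)        # switching wins: about 2/3
```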
Problems for practice
"It is possible that P(A|B) > 0.99 but P(B|A) < 0.01." Disprove this
statement or provide an example where it holds.
from (0, 0) returns to 0 for the first time at time 2n. Then show, without
using the explicit forms of u₂ₙ and v₂ₙ, that …
EXERCISE 12: Two fair dice are rolled. What is the conditional
probability that at least one shows a 6 given that the dice show different
numbers?
EXERCISE 13: If two fair dice are rolled, what is the conditional
probability that the first one shows 6 given that the sum of the outcomes
of the dice is i? Compute for all possible values of i.
j − i ∈ {1, …, 6} and 0 else. Show that the probability that the counter …
The matrix governs the random motion of a counter jumping back and
forth over this board in the following way: if the counter is at i then it
moves to j with probability pᵢⱼ. (If i = j, then the counter stays put.) All
moves are independent. Show that the probability of the counter moving
from i to j in exactly k moves is the (i, j)-th entry of the matrix Aᵏ.
EXERCISE 17:
EXERCISE 18:
EXERCISE 19:
EXERCISE 20:
EXERCISE 21:
EXERCISE 22:
EXERCISE 23:
EXERCISE 24:
EXERCISE 25:
EXERCISE 26:
EXERCISE 27:
EXERCISE 28:
EXERCISE 29:
EXERCISE 30:
EXERCISE 31:
EXERCISE 32:
EXERCISE 33:
EXERCISE 34:
EXERCISE 35:
EXERCISE 36:
EXERCISE 37:
EXERCISE 38:
EXERCISE 39:
EXERCISE 40:
EXERCISE 41:
EXERCISE 42:
EXERCISE 43:
EXERCISE 44:
EXERCISE 45: … times. What is the chance that the last outcome is
different from all the earlier ones?
EXERCISE 46:
EXERCISE 47:
EXERCISE 48:
EXERCISE 49:
EXERCISE 50:
EXERCISE 52: Same set up as in the last problem. Fix two natural
numbers m < n. What is the probability that the ball drawn at stage m
is green and the ball drawn at stage n is red? Does the answer depend
on m and n?