Will Monroe, July 24, 2017
with materials by Mehran Sahami and Chris Piech
Independent random variables
Announcements: Midterm
Tomorrow!
Tuesday, July 25, 7:00-9:00pm
Building 320-105
(main quad, Geology Corner)
One page of notes (front & back)
No books/computers/calculators
Review: Joint distributions
A joint distribution combines
multiple random variables. Its PDF
or PMF gives the probability or
relative likelihood of both random
variables taking on specific values.
p_{X,Y}(a, b) = P(X = a, Y = b)
Review: Joint PMF
A joint probability mass function
gives the probability of more than
one discrete random variable each
taking on a specific value (an AND
of the 2+ values).
p_{X,Y}(a, b) = P(X = a, Y = b)

        Y=0    Y=1    Y=2
X=0     0.05   0.20   0.10
X=1     0.10   0.10   0.10
X=2     0.05   0.10   0.20
Review: Joint PDF
A joint probability density function
gives the relative likelihood of
more than one continuous
random variable each taking on a
specific value.
P(a₁ < X ≤ a₂, b₁ < Y ≤ b₂) = ∫_{a₁}^{a₂} dx ∫_{b₁}^{b₂} dy f_{X,Y}(x, y)
Review: Joint CDF
F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y)
The joint CDF goes to 1 as x → +∞ and y → +∞,
and to 0 as x → −∞ or y → −∞.
(plot by Academo)
Probabilities from joint CDFs
P(a₁ < X ≤ a₂, b₁ < Y ≤ b₂) = F_{X,Y}(a₂, b₂)
                              − F_{X,Y}(a₁, b₂)
                              − F_{X,Y}(a₂, b₁)
                              + F_{X,Y}(a₁, b₁)
Review: Marginalization
Marginal probabilities give the
distribution of a subset of the variables
(often, just one) of a joint distribution.
Sum/integrate over the variables you
don’t care about.
p_X(a) = ∑_y p_{X,Y}(a, y)

f_X(a) = ∫_{−∞}^{∞} dy f_{X,Y}(a, y)
Review: Non-negative RV
expectation lemma
You can integrate y times the
PDF, or you can integrate 1
minus the CDF!

E[Y] = ∫_0^∞ dy P(Y > y)
     = ∫_0^∞ dy (1 − F_Y(y))
Non-negative RV expectation lemma:
Rearranging terms
E[X] = 0·P(X=0) + 1·P(X=1) + 2·P(X=2) + 3·P(X=3) + ⋯
     =            P(X=1) +   P(X=2) +   P(X=3) + ⋯
                          +   P(X=2) +   P(X=3) + ⋯
                                      +   P(X=3) + ⋯
     = P(X ≥ 1) + P(X ≥ 2) + P(X ≥ 3) + ⋯
     = ∑_{i=1}^∞ P(X ≥ i)
[Addendum]
Non-negative RV expectation lemma:
Graphically
(bar chart of the PMF p_X(x) for x = 0, …, 4: stacking copies of the
bars shows E[X] = ∑_x x p_X(x) as the sum of the tail probabilities
P(X ≥ 1), P(X ≥ 2), P(X ≥ 3), P(X ≥ 4))
[Addendum]
Review: Multinomial random variable
A multinomial random variable
records the number of times each
outcome occurs, when an
experiment with multiple outcomes
(e.g. die roll) is run multiple times.

(X₁, …, X_m) ∼ MN(n, p₁, p₂, …, p_m)   (a vector of RVs!)

P(X₁ = c₁, X₂ = c₂, …, X_m = c_m)
  = (n choose c₁, c₂, …, c_m) p₁^{c₁} p₂^{c₂} ⋯ p_m^{c_m}
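The PMF above can be evaluated directly. A minimal sketch in Python; the helper name `multinomial_pmf` and the die-roll example values are illustrative, not from the course code:

```python
# A minimal sketch of the multinomial PMF; the helper name and the
# example values are illustrative.
from math import factorial

def multinomial_pmf(counts, probs):
    """P(X_1 = c_1, ..., X_m = c_m) for (X_1, ..., X_m) ~ MN(n, p_1, ..., p_m)."""
    n = sum(counts)                 # total number of trials
    coef = factorial(n)             # multinomial coefficient n! / (c_1! ... c_m!)
    for c in counts:
        coef //= factorial(c)
    prob = 1.0
    for c, p in zip(counts, probs):
        prob *= p ** c              # p_1^{c_1} ... p_m^{c_m}
    return coef * prob

# e.g., 12 rolls of a fair die, each face coming up exactly twice
p = multinomial_pmf([2] * 6, [1 / 6] * 6)
print(p)
```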
A question from last class
p_{X,Y}(a, b) = P(X = a, Y = b)

        Y=0    Y=1    Y=2
X=0     0.05   0.20   0.10
X=1     0.10   0.10   0.10
X=2     0.05   0.10   0.20

“Are X and Y independent?”
Check whether P(X = 0, Y = 0) = P(X = 0) P(Y = 0):
here 0.05 ≠ 0.35 · 0.20 = 0.07, so no.
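The check above can be sketched in Python for every cell at once (the dict encoding of the table is just for illustration):

```python
# Checking the joint PMF table for independence: X and Y are
# independent iff p(x, y) = p_X(x) * p_Y(y) for ALL (x, y).
joint = {
    (0, 0): 0.05, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.10, (1, 1): 0.10, (1, 2): 0.10,
    (2, 0): 0.05, (2, 1): 0.10, (2, 2): 0.20,
}

# Marginals: sum out the other variable
p_x = {x: sum(p for (a, _), p in joint.items() if a == x) for x in range(3)}
p_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in range(3)}

independent = all(
    abs(joint[(x, y)] - p_x[x] * p_y[y]) < 1e-9
    for x in range(3) for y in range(3)
)
print(independent)  # False: the (0, 0) cell alone fails, 0.05 != 0.35 * 0.20
```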
Independence of
discrete random variables
Two random variables are
independent if knowing the value
of one tells you nothing about the
value of the other (for all values!).
X ⊥ Y iff ∀x, y:
P(X = x, Y = y) = P(X = x) P(Y = y)
- or -
p_{X,Y}(x, y) = p_X(x) p_Y(y)
Coin flips
n flips, then m flips
X: number of heads in the first n flips
Y: number of heads in the next m flips

P(X = x, Y = y) = (n choose x) p^x (1 − p)^{n−x} · (m choose y) p^y (1 − p)^{m−y}
                = P(X = x) P(Y = y)
∴ X ⊥ Y
Coin flips
n flips, then m flips
X: number of heads in the first n flips
Z: total number of heads in all n + m flips

Z = 0 → X = 0, so knowing Z tells you about X:
X ⊥̸ Z (not independent)
Web server hits
Your web server gets N requests in
a day. N ~ Poi(λ).
Each request comes independently
from human (prob. p) or bot (1 – p).
X: # requests from humans in day
Y: # requests from bots in day
Knowing N:  X ∼ Bin(N, p),  Y ∼ Bin(N, 1 − p)

P(X = i, Y = j) = P(X = i, Y = j | N = i+j) P(N = i+j)
                + P(X = i, Y = j | N ≠ i+j) P(N ≠ i+j)
(the second term is 0, since X + Y must equal N)
                = (i+j choose i) p^i (1−p)^j · e^{−λ} λ^{i+j} / (i+j)!
Web server hits
P(X = i, Y = j) = (i+j choose i) p^i (1−p)^j e^{−λ} λ^{i+j} / (i+j)!
                = [(i+j)! / (i! j!)] p^i (1−p)^j e^{−λ} λ^{i+j} / (i+j)!
                = e^{−λ} (p λ)^i [(1−p) λ]^j / (i! j!)
                = e^{−λp} (λp)^i / i! · e^{−λ(1−p)} [λ(1−p)]^j / j!
                = P(X = i) P(Y = j)
where X ∼ Poi(λp) and Y ∼ Poi(λ(1−p)).
∴ X ⊥ Y
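A quick simulation sketch of this "splitting" result: thin N ~ Poi(λ) requests into humans with probability p, and the human count should behave like Poi(λp). The parameter values, seed, and sampler are arbitrary choices for illustration:

```python
# Simulation sketch of Poisson splitting: for N ~ Poi(lam) with each
# request human w.p. p, the human count should look like Poi(lam * p).
import math
import random

random.seed(109)
lam, p, trials = 10.0, 0.3, 100_000

def poisson(lam):
    """Sample Poi(lam) by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, random.random()
    while prod > threshold:
        k += 1
        prod *= random.random()
    return k

human_counts = []
for _ in range(trials):
    n = poisson(lam)  # total requests today
    human_counts.append(sum(random.random() < p for _ in range(n)))

mean = sum(human_counts) / trials
var = sum((h - mean) ** 2 for h in human_counts) / trials
# For Poi(lam * p) = Poi(3), both mean and variance should be near 3.
print(mean, var)
```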
Independence of
continuous random variables
Two random variables are
independent if knowing the value
of one tells you nothing about the
value of the other (for all values!).
X ⊥ Y iff ∀x, y:
f_{X,Y}(x, y) = f_X(x) f_Y(y)
- or -
f_{X,Y}(x, y) = g(x) h(y)
- or -
F_{X,Y}(x, y) = F_X(x) F_Y(y)
Density functions and independence
f_{X,Y}(x, y) = 6 e^{−3x} e^{−2y}  for 0 < x, y < ∞
  → factors as g(x) h(y): X ⊥ Y? yes!

f_{X,Y}(x, y) = 4xy  for 0 < x, y < 1
  → factors as g(x) h(y): X ⊥ Y? yes!

f_{X,Y}(x, y) = 4xy  for 0 < x < 1 − y < 1
  → the domain is not a product of ranges: X ⊥ Y? no!

https://bit.ly/1a2ki4G → https://b.socrative.com/login/student/
Room: CS109SUMMER17
The Joy of Meetings
2 people set up a meeting for 12pm.
Each arrives independently, uniformly
between 12:00 and 12:30.
X ~ Uni(0, 30): mins after 12 for person 1
Y ~ Uni(0, 30): mins after 12 for person 2
P(first to arrive waits > 10 minutes for the other) = ?
P(X + 10 < Y) + P(Y + 10 < X) = 2 P(X + 10 < Y)   (symmetry)
  = 2 ∬_{x+10<y} dx dy f_{X,Y}(x, y)
The Joy of Meetings
P(X + 10 < Y) + P(Y + 10 < X) = 2 P(X + 10 < Y)   (symmetry)
  = 2 ∬_{x+10<y} dx dy f_{X,Y}(x, y)
  = 2 ∬_{x+10<y} dx dy f_X(x) f_Y(y)   (independence)
  = 2 ∫_{y=10}^{30} dy ∫_{x=0}^{y−10} dx (1/30)²
  = (2/30²) ∫_{y=10}^{30} dy (y − 10)
  = (2/30²) [y²/2 − 10y]_{y=10}^{30}
  = (2/30²) [(30²/2 − 300) − (10²/2 − 100)]
  = 4/9
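A Monte Carlo sanity check of the 4/9 answer (trial count and seed are arbitrary choices for this sketch):

```python
# Monte Carlo check: the first arrival waits > 10 minutes iff the two
# Uni(0, 30) arrival times differ by more than 10.
import random

random.seed(109)
trials = 500_000
waits = sum(
    abs(random.uniform(0, 30) - random.uniform(0, 30)) > 10
    for _ in range(trials)
)
print(waits / trials)  # should land near 4/9 ≈ 0.444
```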
Independence of
continuous random variables
Two random variables are
independent if knowing the value
of one tells you nothing about the
value of the other (for all values!).
X ⊥ Y iff ∀x, y:
f_{X,Y}(x, y) = f_X(x) f_Y(y)
- or -
f_{X,Y}(x, y) = g(x) h(y)
- or -
F_{X,Y}(x, y) = F_X(x) F_Y(y)
Setting records
Let X₁, X₂, … be a sequence of
independent and identically distributed
(I.I.D.) continuous random variables.
“record value”: an Xₙ that beats all previous Xᵢ
⇒ Xₙ = max(X₁, …, Xₙ)
Aᵢ: event that Xᵢ is a “record value”
A_{n+1} ⊥ A_n?
Independence is symmetric
X ⊥Y ⇔ Y ⊥ X
E⊥ F ⇔ F⊥ E
Let X₁, X₂, … be a sequence of
independent and identically distributed
(I.I.D.) continuous random variables.
“record value”: an Xₙ that beats all previous Xᵢ
⇒ Xₙ = max(X₁, …, Xₙ)
Aᵢ: event that Xᵢ is a “record value”
A_{n+1} ⊥ A_n?  A_n ⊥ A_{n+1}?  Yes!
P(A_n A_{n+1}) = (1/n) · (1/(n+1)) = P(A_n) P(A_{n+1})
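A simulation sketch of the record-value claim, using i.i.d. Uni(0, 1) draws (any continuous distribution gives the same answer); n = 4 is an arbitrary choice:

```python
# Estimate P(A_n), P(A_{n+1}), and P(A_n and A_{n+1}) by simulation.
import random

random.seed(109)
n, trials = 4, 300_000
count_n = count_n1 = count_both = 0
for _ in range(trials):
    xs = [random.random() for _ in range(n + 1)]
    rec_n = xs[n - 1] == max(xs[:n])   # A_n: X_n beats X_1..X_{n-1}
    rec_n1 = xs[n] == max(xs)          # A_{n+1}: X_{n+1} beats all before it
    count_n += rec_n
    count_n1 += rec_n1
    count_both += rec_n and rec_n1

# Expect P(A_4) = 1/4, P(A_5) = 1/5, P(A_4 A_5) = 1/20 = P(A_4) P(A_5).
print(count_n / trials, count_n1 / trials, count_both / trials)
```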
Break time!
Sum of independent binomials
n flips, then m flips
X: number of heads in the first n flips
Y: number of heads in the next m flips

X ∼ Bin(n, p)   Y ∼ Bin(m, p)
X + Y ∼ Bin(n + m, p)

More generally:
X_i ∼ Bin(n_i, p), all X_i independent
  ⇒ ∑_{i=1}^N X_i ∼ Bin(∑_{i=1}^N n_i, p)
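A direct numeric check of this closure rule at one point: the convolution sum for P(X + Y = k) should match the Bin(n + m, p) PMF. The values of n, m, p, and k below are arbitrary:

```python
# Check that convolving Bin(n, p) with Bin(m, p) matches Bin(n + m, p)
# at a single point k.
from math import comb

def binom_pmf(n, p, k):
    """P(Bin(n, p) = k), zero outside 0..n."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, m, p, k = 3, 5, 0.4, 4
conv = sum(binom_pmf(n, p, i) * binom_pmf(m, p, k - i) for i in range(k + 1))
print(abs(conv - binom_pmf(n + m, p, k)) < 1e-12)  # True
```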
Sum of independent Poissons
λ₁ chips/cookie, λ₂ chips/cookie
X: number of chips in the first cookie
Y: number of chips in the second cookie

X ∼ Poi(λ₁)   Y ∼ Poi(λ₂)
X + Y ∼ Poi(λ₁ + λ₂)

More generally:
X_i ∼ Poi(λ_i), all X_i independent
  ⇒ ∑_{i=1}^N X_i ∼ Poi(∑_{i=1}^N λ_i)
Convolution
A convolution is the
distribution of the sum of two
independent random variables.
f_{X+Y}(a) = ∫_{−∞}^{∞} dy f_X(a − y) f_Y(y)
Dance Dance Convolution
X, Y: independent discrete random variables
X, Y: independent discrete random variables
P(X + Y = a) = ∑_y P(X + Y = a, Y = y)   (law of total probability)
             = ∑_y P(X = a − y) P(Y = y)

X, Y: independent continuous random variables
f_{X+Y}(a) = ∫_{−∞}^{∞} dy f_X(a − y) f_Y(y)
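The discrete convolution sum above, sketched for two fair six-sided dice (the dict-of-probabilities encoding is just for illustration):

```python
# P(X + Y = a) = sum over y of P(X = a - y) P(Y = y), X and Y independent.
def convolve_pmf(p_x, p_y):
    """Return the PMF of X + Y as a dict from value to probability."""
    p_sum = {}
    for x, px in p_x.items():
        for y, py in p_y.items():
            p_sum[x + y] = p_sum.get(x + y, 0.0) + px * py
    return p_sum

die = {v: 1 / 6 for v in range(1, 7)}
two_dice = convolve_pmf(die, die)
print(two_dice[7])  # 6/36: seven is the most likely total
```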
Blurring a photo
images: Stig Nygaard (left), Daniel Paxton (right)
Sum of independent uniforms
X ∼ Uni(0, 1)   Y ∼ Uni(0, 1)

f_{X+Y}(a) = ∫_{−∞}^{∞} dy f_X(a − y) f_Y(y)
           = ∫_0^1 dy f_X(a − y)

Case 1: if 0 ≤ a ≤ 1, then we need 0 ≤ y ≤ a (for a − y to be in [0, 1])
Case 2: if 1 ≤ a ≤ 2, then we need a − 1 ≤ y ≤ 1

f_{X+Y}(a) = { ∫_0^a dy · 1 = a           0 ≤ a ≤ 1
             { ∫_{a−1}^1 dy · 1 = 2 − a   1 ≤ a ≤ 2
             { 0                          otherwise
(a triangular density on [0, 2])
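A Monte Carlo check of the triangular density: integrating f_{X+Y} over (0.5, 1.5) gives 0.375 + 0.375 = 0.75, so about 75% of simulated sums should land there. Trial count and seed are arbitrary:

```python
# Check P(0.5 < X + Y < 1.5) = 0.75 for X, Y ~ Uni(0, 1) independent.
import random

random.seed(109)
trials = 400_000
hits = sum(
    0.5 < random.random() + random.random() < 1.5 for _ in range(trials)
)
print(hits / trials)  # should be near 0.75
```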
Sum of independent normals
X ∼ N(μ₁, σ₁²)   Y ∼ N(μ₂, σ₂²)
X + Y ∼ N(μ₁ + μ₂, σ₁² + σ₂²)

More generally:
X_i ∼ N(μ_i, σ_i²), all X_i independent
  ⇒ ∑_{i=1}^N X_i ∼ N(∑_{i=1}^N μ_i, ∑_{i=1}^N σ_i²)
Virus infections
150 computers in a dorm:
50 Macs (each independently
infected with probability 0.1)
100 PCs (each independently
infected with probability 0.4)
What is P(≥ 40 machines infected)?
M: # infected Macs   M ∼ Bin(50, 0.1) ≈ X ∼ N(5, 4.5)
P: # infected PCs    P ∼ Bin(100, 0.4) ≈ Y ∼ N(40, 24)

P(M + P ≥ 40) ≈ P(X + Y ≥ 39.5)   (continuity correction)
W = X + Y ∼ N(5 + 40, 4.5 + 24) = N(45, 28.5)

P(W ≥ 39.5) = P((W − 45)/√28.5 ≥ (39.5 − 45)/√28.5)
            ≈ 1 − Φ(−1.03) ≈ 0.8485
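The final computation can be reproduced with the standard library, using Φ(z) = (1 + erf(z/√2))/2:

```python
# Normal approximation to P(M + P >= 40) with continuity correction.
from math import erf, sqrt

mu = 5 + 40                   # mean of W = X + Y
var = 4.5 + 24                # variance of W
z = (39.5 - mu) / sqrt(var)   # standardized boundary, z ≈ -1.03
answer = 1 - 0.5 * (1 + erf(z / sqrt(2)))
print(answer)  # agrees with the slide's 1 - Phi(-1.03) ≈ 0.8485
```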