CL202: Introduction to Data Analysis
MB+SCP
Notes and slides made by: Mani Bhushan, Sachin Patwardhan
Department of Chemical Engineering,
Indian Institute of Technology Bombay
Mumbai, India- 400076
mbhushan,sachinp@iitb.ac.in
Instructor: Sharad Bhartiya (bhartiya@che.iitb.ac.in)
January 25, 2022
MB+SCP (IIT Bombay) CL202 January 25, 2022 1 / 57
This handout
Multiple Random Variables
Joint, marginal, conditional distribution and density
functions
Independence
Extension of Ideas: Multiple (Multivariate) Random Variables: Jointly Distributed Random Variables
An outcome ω occurs in the sample space S. Associate many random variables, X1, X2, ..., Xn, with ω.
Each random variable is a valid mapping from S to R.
Bivariate Random Variables
For simplicity of notation consider two random
variables: X , Y .
Special case of multiple random variables.
Examples:
- Average number of cigarettes smoked daily and the age at which an individual gets cancer.
- Height and weight of an individual.
- Height and IQ of an individual.
- Flow-rate and pressure drop of a liquid flowing through a pipe.
- Number of heads and number of tails in an experiment involving the toss of several coins.
Jointly distributed random variables
We are often interested in answering questions about X, Y taking
values in a specified region D in R2 (the xy-plane).
The distribution functions FX (x) and FY (y ) of X
and Y determine their individual probabilities but
not their joint probabilities. The probability of event
{ω : X (ω) ≤ x} ∩ {ω : Y (ω) ≤ y }
= {ω : X (ω) ≤ x, Y (ω) ≤ y }
cannot be expressed in terms of FX (x) and FY (y ).
Joint probabilities of X , Y completely determined if
probability of above event known for every x and y .
Joint Probability Distribution Function or
Joint Cumulative Distribution Function
For random variables (discrete or continuous) X , Y , the
joint (bivariate) probability distribution function is:
FX ,Y (x, y ) = P{X ≤ x, Y ≤ y }
where x, y are two arbitrary real numbers.
Often, the subscript X, Y is omitted.
Properties of Joint Probability Distribution
Function (Papoulis and Pillai, 2002)
1. F(−∞, y) = F(x, −∞) = 0; F(∞, ∞) = 1.
2. P(x1 < X ≤ x2, Y ≤ y) = F(x2, y) − F(x1, y)
   P(X ≤ x, y1 < Y ≤ y2) = F(x, y2) − F(x, y1)
3. P(x1 < X ≤ x2, y1 < Y ≤ y2) = F(x2, y2) − F(x1, y2) − F(x2, y1) + F(x1, y1)
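These properties can be verified numerically. A minimal sketch, using a hypothetical pair of independent exponential random variables X ~ Exp(1), Y ~ Exp(2), whose joint CDF F(x, y) = (1 − e^{−x})(1 − e^{−2y}) is available in closed form:

```python
import math

# Hypothetical example: independent X ~ Exp(1), Y ~ Exp(2), so that
# F(x, y) = (1 - e^{-x})(1 - e^{-2y}) for x, y > 0.
def F(x, y):
    return (1 - math.exp(-x)) * (1 - math.exp(-2 * y))

# Property 3: rectangle probability by inclusion-exclusion on F.
def rect_prob(x1, x2, y1, y2):
    return F(x2, y2) - F(x1, y2) - F(x2, y1) + F(x1, y1)

# For this independent pair, the rectangle probability also factorizes into
# P(x1 < X <= x2) * P(y1 < Y <= y2), giving an independent check.
lhs = rect_prob(0.5, 2.0, 0.1, 1.0)
rhs = (math.exp(-0.5) - math.exp(-2.0)) * (math.exp(-0.2) - math.exp(-2.0))
print(abs(lhs - rhs))  # ~0: the two expressions agree
```

The evaluation points (0.5, 2.0, 0.1, 1.0) are arbitrary choices.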
Joint Density Function: I
The joint density of X and Y is the function (defn.)
f(x, y) = ∂²F(x, y) / ∂x∂y
It follows that
F(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f(ξ, ρ) dρ dξ
P((X, Y) ∈ D) = ∬_D f(x, y) dx dy
Joint Density Function: II
In particular, as ∆x → 0 and ∆y → 0,
P(x < X ≤ x+∆x, y < Y ≤ y +∆y ) ≈ f (x, y )∆x∆y
∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1;  f(x, y) ≥ 0 for all x, y ∈ R.
Joint Density Example: Bivariate Gaussian
Random Variable
f(x, y) = α exp(−0.5 (ξ − µ)^T P^{−1} (ξ − µ))
with
ξ = [x, y]^T,  µ = [1, −1]^T,  P = [[0.9, 0.4], [0.4, 0.3]],  α = 1 / (2π √|P|)
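This density can be evaluated and sanity-checked in plain Python. A sketch (the grid limits and resolution are arbitrary choices, wide enough to capture essentially all of the probability mass):

```python
import math

mu = (1.0, -1.0)
# Covariance matrix P, its determinant and inverse (2x2, by hand).
p11, p12, p22 = 0.9, 0.4, 0.3
det = p11 * p22 - p12 * p12                         # |P| = 0.11
i11, i12, i22 = p22 / det, -p12 / det, p11 / det    # entries of P^{-1}
alpha = 1.0 / (2 * math.pi * math.sqrt(det))

def f(x, y):
    dx, dy = x - mu[0], y - mu[1]
    quad = i11 * dx * dx + 2 * i12 * dx * dy + i22 * dy * dy
    return alpha * math.exp(-0.5 * quad)

# Midpoint Riemann sum: the density should integrate to ~1.
n, x_lo, x_hi, y_lo, y_hi = 300, -6.0, 8.0, -7.0, 5.0
hx, hy = (x_hi - x_lo) / n, (y_hi - y_lo) / n
total = sum(f(x_lo + (i + 0.5) * hx, y_lo + (j + 0.5) * hy)
            for i in range(n) for j in range(n)) * hx * hy
print(round(total, 3))  # ~1.0
```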
Joint Density Visualization (figure omitted)
Joint Distribution Visualization (figure omitted)
Marginal Distribution or Density Functions
of Individual Random Variables: I
Marginal Probability Distribution Functions:
FX (x), FY (y ):
- Extract FX(x) from F(x, y) as:
  FX(x) = P(X ≤ x) = P(X ≤ x, Y < ∞) = F(x, ∞)
- Similarly, extract FY(y) as:
  FY(y) = P(Y ≤ y) = P(X < ∞, Y ≤ y) = F(∞, y)
Marginal Distribution or Density Functions
of Individual Random Variables: II
Marginal Probability Density Functions:
fX (x), fY (y ):
- Extract these from f(x, y) as:
  fX(x) = ∫_{−∞}^{∞} f(x, y) dy
  fY(y) = ∫_{−∞}^{∞} f(x, y) dx
Marginal Probability Density
fX(x) = ∫_{−∞}^{∞} f(x, y) dy
Makes sense, since
P(X ∈ A) = P(X ∈ A, Y ∈ (−∞, ∞)) = ∫_A ∫_{−∞}^{∞} f(x, y) dy dx = ∫_A fX(x) dx
where fX(x) is as defined above. Similarly,
fY(y) = ∫_{−∞}^{∞} f(x, y) dx
Example 4.3c from Ross: I
f(x, y) = 2e^{−x} e^{−2y} for 0 < x < ∞, 0 < y < ∞; 0 otherwise.
Compute: (a) P(X > 1, Y < 1), (b) P(X < Y), (c) P(X < a).
P(X > 1, Y < 1) = ∫_0^1 ∫_1^∞ 2e^{−x} e^{−2y} dx dy
= ∫_0^1 2e^{−2y} (−e^{−x}|_1^∞) dy
= e^{−1} ∫_0^1 2e^{−2y} dy = e^{−1}(1 − e^{−2})
Example 4.3c from Ross: II
P(X < Y) = ∫_0^∞ ∫_0^y 2e^{−x} e^{−2y} dx dy = 1/3
P(X < a) = ∫_0^a ∫_0^∞ 2e^{−x} e^{−2y} dy dx = 1 − e^{−a}
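Since the density in this example factorizes, X ~ Exp(1) and Y ~ Exp(2) can be sampled independently, and all three answers can be cross-checked by Monte Carlo simulation. A sketch (sample size, seed, and the choice a = 2 are arbitrary):

```python
import math
import random

# f(x, y) = (e^{-x}) * (2 e^{-2y}): independent X ~ Exp(1), Y ~ Exp(2).
random.seed(0)
N = 200_000
xy = [(random.expovariate(1.0), random.expovariate(2.0)) for _ in range(N)]

p_a = sum(1 for x, y in xy if x > 1 and y < 1) / N   # (a)
p_b = sum(1 for x, y in xy if x < y) / N             # (b)
a = 2.0
p_c = sum(1 for x, y in xy if x < a) / N             # (c), at a = 2

print(p_a, math.exp(-1) * (1 - math.exp(-2)))  # both ~0.318
print(p_b)                                     # ~1/3
print(p_c, 1 - math.exp(-a))                   # both ~0.865
```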
Joint Density Visualization: Exponential (figure omitted)
Joint Distribution Visualization: Exponential (figure omitted)
Joint Probability Mass Function (PMF)
Given two discrete random variables X and Y in the
same experiment, the joint PMF of X and Y is
p(xi , yj ) = P(X = xi , Y = yj )
for all pairs of (xi , yj ) values that X and Y can take.
p(xi , yj ) also denoted as pX ,Y (xi , yj ).
The marginal probability mass functions for X and
Y are
pX(x) = P(X = x) = Σ_y pX,Y(x, y)
pY(y) = P(Y = y) = Σ_x pX,Y(x, y)
Computation of Marginal PMF from Joint
PMF: I
Formally:
{X = xi} = ∪_j {X = xi, Y = yj}
All events on the RHS are mutually exclusive. Thus,
pX(xi) = P(X = xi) = Σ_j P(X = xi, Y = yj) = Σ_j p(xi, yj)
Computation of Marginal PMF from Joint
PMF: II
Similarly, pY(yj) = P(Y = yj) = Σ_i p(xi, yj).
Note: P(X = xi , Y = yj ) cannot be constructed from
knowledge of P(X = xi ) and P(Y = yj ).
Example: 4.3a, Ross
3 batteries are randomly chosen from a group of 3 new,
4 used but still working, and 5 defective batteries. Let
X , Y denote the number of new, and used but working
batteries that are chosen, respectively. Find
p(xi , yj ) = P(X = xi , Y = yj ).
Solution: Let T = C(12, 3) = 220. Then:
p(0, 0) = C(5, 3)/T
p(0, 1) = C(4, 1)C(5, 2)/T
p(0, 2) = C(4, 2)C(5, 1)/T
p(0, 3) = C(4, 3)/T
p(1, 0) = C(3, 1)C(5, 2)/T
p(1, 1) = C(3, 1)C(4, 1)C(5, 1)/T
p(1, 2) = ..., p(2, 0) = ..., p(2, 1) = ..., p(3, 0) = ...
Tabular Form (i = value of X in rows, j = value of Y in columns):

i\j        0        1        2       3     P(X = i)
0       10/220   40/220   30/220   4/220   84/220
1       30/220   60/220   18/220     0    108/220
2       15/220   12/220     0        0     27/220
3        1/220     0        0        0      1/220
P(Y=j)  56/220  112/220   48/220   4/220

Both the row and column sums add up to 1.
The marginal probabilities appear in the margins of the table.
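The table can be reproduced directly from the counting argument. A sketch using math.comb:

```python
from math import comb

# X = number of new, Y = number of used-but-working batteries among the
# 3 chosen from 3 new + 4 used + 5 defective.
T = comb(12, 3)  # 220 equally likely selections

def p(x, y):
    d = 3 - x - y                     # defective batteries in the selection
    if x < 0 or y < 0 or d < 0:
        return 0.0
    return comb(3, x) * comb(4, y) * comb(5, d) / T

# Marginals: sum the joint PMF over the other variable.
pX = [sum(p(i, j) for j in range(4)) for i in range(4)]
pY = [sum(p(i, j) for i in range(4)) for j in range(4)]
print([round(v * 220) for v in pX])  # [84, 108, 27, 1], the row sums
print([round(v * 220) for v in pY])  # [56, 112, 48, 4], the column sums
```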
n Random Variables: I
Joint cumulative probability distribution function
F (x1 , x2 , ..., xn ) of n random variables X1 , X2 , ..., Xn
is defined as:
F (x1 , x2 , ..., xn ) = P(X1 ≤ x1 , X2 ≤ x2 , ..., Xn ≤ xn )
If random vars. discrete: joint probability mass
function
p(x1 , x2 , ..., xn ) = P(X1 = x1 , X2 = x2 , ..., Xn = xn )
n Random Variables: II
If random vars. continuous: joint probability density
function f (x1 , x2 , ..., xn ) such that for any set C in
n-dimensional space
P((X1, X2, ..., Xn) ∈ C) = ∫ ... ∫_{(x1,...,xn)∈C} f(x1, x2, ..., xn) dx1 dx2 ... dxn
where
f(x1, x2, ..., xn) = ∂^n F(x1, x2, ..., xn) / (∂x1 ∂x2 ... ∂xn)
Obtaining Marginals
FX1(x1) = F(x1, ∞, ∞, ..., ∞)
fX1(x1) = ∫_{−∞}^{∞} ... ∫_{−∞}^{∞} f(x1, x2, ..., xn) dx2 dx3 ... dxn
pX1(x1) = Σ_{x2} Σ_{x3} ... Σ_{xn} p(x1, x2, ..., xn)
Independence of Random Variables: I
Random variables X and Y are independent if for
any two sets of real numbers A and B:
P(X ∈ A, Y ∈ B) = P(X ∈ A)P(Y ∈ B)
i.e. events EA = {X ∈ A} and EB = {Y ∈ B} are
independent.
Example: height and IQ of an individual.
In particular: P(X ≤ a, Y ≤ b) = P(X ≤ a)P(Y ≤ b), or, in terms of the joint cumulative distribution function F of X and Y:
F(a, b) = FX(a)FY(b) for all a, b ∈ R
Independence of Random Variables: II
Random variables that are not independent are called dependent.
Independence: Probability Mass and
Density Functions
Random variables X , Y independent if:
Discrete random variables: Probability mass function
p(xi , yj ) = pX (xi )pY (yj ) for all xi , yj
Continuous random variables: Probability density
function
f (x, y ) = fX (x)fY (y ) for all x, y
Independence: Equivalent Statements
1) P(X ∈ A, Y ∈ B) = P(X ∈ A)P(Y ∈ B) for all sets A, B in R
2) F(x, y) = FX(x)FY(y) for all x, y
3) f(x, y) = fX(x)fY(y) for all x, y (continuous RVs)
4) p(xi, yj) = pX(xi)pY(yj) for all xi, yj (discrete RVs)
Example 5.2 (Ogunnaike, 2009): I
The reliability of the temperature control system for a
commercial, highly exothermic polymer reactor is known
to depend on the lifetimes (in years) of the control
hardware electronics, X1 , and of the control valve on the
cooling water line, X2 . If one component fails, the entire
control system fails. The random phenomenon in
question is characterized by the two-dimensional random
variable (X1 , X2 ) whose joint probability distribution is
given as:
f(x1, x2) = (1/50) e^{−(0.2x1 + 0.1x2)}, 0 < x1 < ∞, 0 < x2 < ∞
f(x1, x2) = 0, otherwise
Example 5.2 (Ogunnaike, 2009): II
1. Establish that the above is a legitimate joint probability density function.
To show: ∫_0^∞ ∫_0^∞ f(x1, x2) dx1 dx2 = 1.
∫_0^∞ ∫_0^∞ (1/50) e^{−(0.2x1 + 0.1x2)} dx1 dx2 = (1/50)(−5e^{−0.2x1}|_0^∞)(−10e^{−0.1x2}|_0^∞) = (1/50)(5)(10) = 1
Example 5.2 (Ogunnaike, 2009): III
2. What is the probability of the system lasting more than 2 years?
To find: P(X1 > 2, X2 > 2) = ∫_2^∞ ∫_2^∞ (1/50) e^{−(0.2x1 + 0.1x2)} dx1 dx2 = e^{−0.4} e^{−0.2} ≈ 0.549.
3. Find the marginal density function of X1.
fX1(x1) = ∫_0^∞ (1/50) e^{−(0.2x1 + 0.1x2)} dx2 = (1/5) e^{−0.2x1}
4. Find the marginal density function of X2.
fX2(x2) = ∫_0^∞ (1/50) e^{−(0.2x1 + 0.1x2)} dx1 = (1/10) e^{−0.1x2}
Example 5.2 (Ogunnaike, 2009): IV
5. Are X1, X2 independent? Yes, since f(x1, x2) = fX1(x1)fX2(x2).
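The computations in this example can be checked numerically. A sketch (the evaluation point (1.3, 2.7) is an arbitrary choice):

```python
import math

# Example 5.2: f(x1, x2) = (1/50) exp(-(0.2 x1 + 0.1 x2)) on the positive
# quadrant, with the marginals derived above.
def f(x1, x2):
    return math.exp(-(0.2 * x1 + 0.1 * x2)) / 50

def fX1(x1):
    return math.exp(-0.2 * x1) / 5

def fX2(x2):
    return math.exp(-0.1 * x2) / 10

# Independence: the joint density is the product of the marginals.
print(abs(f(1.3, 2.7) - fX1(1.3) * fX2(2.7)))  # ~0

# The system-reliability probability then factorizes:
p = math.exp(-0.2 * 2) * math.exp(-0.1 * 2)    # P(X1 > 2) * P(X2 > 2)
print(round(p, 3))  # 0.549
```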
Independence of n Random Variables: I
Random variables X1 , X2 , ..., Xn are said to be
independent if
For all sets of real numbers A1, A2, ..., An:
P(X1 ∈ A1, X2 ∈ A2, ..., Xn ∈ An) = ∏_{i=1}^{n} P(Xi ∈ Ai)
In particular, for all a1, a2, ..., an ∈ R:
P(X1 ≤ a1, X2 ≤ a2, ..., Xn ≤ an) = ∏_{i=1}^{n} P(Xi ≤ ai),
or, F(a1, a2, ..., an) = ∏_{i=1}^{n} FXi(ai)
Independence of n Random Variables: II
For discrete random variables: probability mass
function factorizes:
p(x1, x2, ..., xn) = pX1(x1)pX2(x2)...pXn(xn)
For continuous random variables: probability density
function factorizes:
f (x1 , x2 , ..., xn ) = fX1 (x1 )fX2 (x2 )...fXn (xn )
Independent, Repeated Trials
In statistics, one usually does not consider just a
single experiment, but that the same experiment is
performed several times.
Associate a separate random variable with each of
those experimental outcomes.
If the experiments are independent of each other,
then we get a set of independent random variables.
Example: Tossing a coin n times. Random variable Xi is the outcome (0 or 1) of the i-th toss.
Independent and Identically Distributed
(IID) Variables: I
A collection of random variables is said to be IID if
The variables are independent
The variables have the same probability distribution
Example 1: Tossing a coin n times. The probability
of obtaining a head in a single toss does not vary
and all the tosses are independent.
- Each toss leads to a random variable with the same probability distribution function. The random variables are also independent. Thus, IID.
Independent and Identically Distributed
(IID) Variables: II
Example 2: Measuring temperature of a beaker at n
time instances in the day. The true water
temperature changes throughout the day. The
sensor is noisy.
- Each sensor reading leads to a random variable.
- The variables are independent but not identically distributed.
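The two examples can be contrasted in a short simulation sketch. All specific numbers here (drift rate, noise level) are illustrative assumptions, not from the source:

```python
import random

random.seed(1)

# Example 1: n coin tosses -> IID Bernoulli(1/2) random variables.
tosses = [random.randint(0, 1) for _ in range(10)]

# Example 2 (hypothetical numbers): the true temperature drifts during the
# day, so each noisy reading has a different mean -> independent but NOT
# identically distributed.
true_temp = [20.0 + 0.5 * t for t in range(10)]              # drifting mean
readings = [T + random.gauss(0.0, 0.3) for T in true_temp]   # noisy sensor

print(tosses)
print([round(r, 1) for r in readings])
```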
Conditional Distributions
Remember for two events A and B: conditional
probability of A given B is:
P(A | B) = P(A, B) / P(B)
for P(B) > 0.
Conditional Probability Mass Function
For X , Y discrete random variables, define the
conditional probability mass function of X given
Y = y by
pX|Y(x|y) = P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y) = p(x, y) / pY(y)
for pY(y) > 0.
Examples 4.3b,f from Ross: I
Question: In a community, 15% families have no
children, 20% have 1, 35% have 2 and 30% have 3
children. Each child is equally likely to be a boy or girl.
We choose a family at random. Given that the chosen
family has one girl, compute the probability mass
function of the number of boys in the family?
G: number of girls, B: number of boys, C: number of
children
To find: P(B = i | G = 1), i = 0, 1, 2, 3.
P(B = i | G = 1) = P(B = i, G = 1) / P(G = 1), i = 0, 1, 2, 3
Examples 4.3b,f from Ross: II
First find P(G = 1)
{G = 1} = {G = 1} ∩ ({C = 0} ∪ {C = 1} ∪ {C = 2}
∪ {C = 3})
P(G = 1) = P(G = 1, C = 0) + P(G = 1, C = 1)
+ P(G = 1, C = 2) + P(G = 1, C = 3)
since C = 0, C = 1, C = 2, C = 3 are mutually exclusive events whose union is S.
Then,
P(G = 1) = P(G = 1 | C = 0)P(C = 0) + P(G = 1 | C = 1)P(C = 1) + P(G = 1 | C = 2)P(C = 2) + P(G = 1 | C = 3)P(C = 3)
= 0 + (1/2)(0.2) + (1/2)(0.35) + (3/8)(0.3) = 0.3875
Examples 4.3b,f from Ross: III
Then,
P(B = 0 | G = 1) = P(B = 0, G = 1) / P(G = 1)
Numerator = P(G = 1, C = 1) = P(G = 1 | C = 1)P(C = 1) = (1/2)(0.2) = 0.1. Then,
P(B = 0 | G = 1) = 0.1/0.3875 = 8/31
Similarly: P(B = 1 | G = 1) = 14/31, P(B = 2 | G = 1) = 9/31, P(B = 3 | G = 1) = 0.
Check: the conditional probabilities sum to 1.
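The whole calculation can be reproduced with exact rational arithmetic. A sketch using fractions.Fraction:

```python
from fractions import Fraction as Fr
from math import comb

# Family-size distribution from the example.
pC = {0: Fr(15, 100), 1: Fr(20, 100), 2: Fr(35, 100), 3: Fr(30, 100)}

def pBG(b, g):
    # Joint pmf of (B, G): b boys and g girls means C = b + g children,
    # each independently a girl with probability 1/2.
    c = b + g
    if c not in pC:
        return Fr(0)
    return pC[c] * comb(c, g) * Fr(1, 2) ** c

pG1 = sum(pBG(b, 1) for b in range(4))        # P(G = 1)
cond = [pBG(b, 1) / pG1 for b in range(4)]    # P(B = b | G = 1)
print(pG1)   # 31/80 = 0.3875
print(cond)  # [8/31, 14/31, 9/31, 0]
```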
Conditional Probability Density Function
For random variables X, Y, the conditional probability density of X given Y = y is defined as:
fX|Y(x | y) = f(x, y) / fY(y)
for fY(y) > 0.
Hence, we can make statements about the probability of X taking values in some set A, given the value obtained by Y:
P(X ∈ A | Y = y) = ∫_A fX|Y(x | y) dx
Independence and Conditional
Probabilities
If X , Y are independent, then
pX |Y (x|y ) = pX (x)
fX |Y (x|y ) = fX (x)
Temperature Control Example (Continued), Example 5.2 (Ogunnaike, 2009)
1. Find the conditional density function fX1|X2(x1|x2):
fX1|X2(x1|x2) = f(x1, x2)/fX2(x2) = (1/5) e^{−0.2x1}
which is the same as fX1(x1) in this example.
2. Similarly, fX2|X1(x2|x1) = fX2(x2) in this example.
Generic Question: If fX1 |X2 (x1 |x2 ) = fX1 (x1 ), then is
fX2 |X1 (x2 |x1 ) = fX2 (x2 )?
Answer: Yes. If fX1|X2(x1|x2) = fX1(x1), then f(x1, x2) = fX1(x1)fX2(x2), so fX2|X1(x2|x1) = f(x1, x2)/fX1(x1) = fX2(x2).
Example 5.5 (Ogunnaike, 2009): I
fX1,X2(x1, x2) = x1 − x2 for 1 < x1 < 2, 0 < x2 < 1; 0 otherwise.
Find: the conditional probability densities.
Answer: First compute the marginals:
fX1(x1) = x1 − 0.5 for 1 < x1 < 2; 0 otherwise.
fX2(x2) = 1.5 − x2 for 0 < x2 < 1; 0 otherwise.
Example 5.5 (Ogunnaike, 2009): II
Then compute the conditionals:
fX1|X2(x1|x2) = (x1 − x2)/(1.5 − x2), 1 < x1 < 2
fX2|X1(x2|x1) = (x1 − x2)/(x1 − 0.5), 0 < x2 < 1
The random variables X1, X2 are not independent.
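The marginals in this example can be cross-checked by numerical integration of the joint density. A sketch using midpoint Riemann sums (the evaluation points 1.7 and 0.3 are arbitrary):

```python
# Example 5.5 sketch: f(x1, x2) = x1 - x2 on 1 < x1 < 2, 0 < x2 < 1.
N = 1_000

def fX1_numeric(x1):
    # integrate f(x1, x2) over x2 in (0, 1), midpoint rule
    return sum(x1 - (k + 0.5) / N for k in range(N)) / N

def fX2_numeric(x2):
    # integrate f(x1, x2) over x1 in (1, 2), midpoint rule
    return sum((1 + (k + 0.5) / N) - x2 for k in range(N)) / N

x1, x2 = 1.7, 0.3
print(fX1_numeric(x1), x1 - 0.5)    # both ~1.2, matching fX1
print(fX2_numeric(x2), 1.5 - x2)    # both ~1.2, matching fX2
# A conditional is the ratio of joint to marginal, e.g. fX1|X2(1.7 | 0.3):
print((x1 - x2) / (1.5 - x2))
```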
Plots (figures omitted)
Independence of Transformations
If random variables X , Y are independent, then the
random variables
Z = g (X ), U = h(Y )
are also independent.
Proof: Let Az denote the set of points on the x-axis such
that g (x) ≤ z and Bu denote the set of points on the
y-axis such that h(y ) ≤ u. Then,
{Z ≤ z} = {X ∈ Az }; {U ≤ u} = {Y ∈ Bu }
Thus, the events {Z ≤ z} and {U ≤ u} are independent
because events {X ∈ Az } and {Y ∈ Bu } are
independent.
Expected Value
By analogy with transformation of a single RV, expected
value of a transformation of multiple RVs can be defined
as:
E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy
For discrete RVs, the above becomes
E[g(X, Y)] = Σ_y Σ_x g(x, y) p(x, y)
Special Cases
g (X , Y ) = X + Y . Then,
E (g (X , Y )) = E [X ] + E [Y ]
g (X , Y ) = (X − E [X ])(Y − E [Y ]): covariance of
X,Y; labeled Cov(X,Y).
Correlation coefficient:
ρ = Cov(X, Y) / (σX σY)
Property: ρ is dimensionless, and −1 ≤ ρ ≤ 1.
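Sample versions of Cov(X, Y) and ρ can be estimated from data. A sketch for a made-up linear relation Y = 2X + noise (the coefficient, noise level, seed, and sample size are illustrative assumptions):

```python
import random

random.seed(0)
n = 100_000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [2 * x + random.gauss(0, 1) for x in xs]   # Y depends on X

# Sample means, covariance, standard deviations, correlation.
mx = sum(xs) / n
my = sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
rho = cov / (sx * sy)
print(round(cov, 1), round(rho, 2))  # cov ~ 2.0, rho ~ 2/sqrt(5) ~ 0.89
```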
Independence versus Covariance
If X , Y are independent, then
Cov(X , Y ) = 0
Independence ⇒ covariance = 0 (variables uncorrelated)
Covariance = 0 ⇏ independence
Example: (X, Y) takes the values (0, 1), (−1, 0), (0, −1), (1, 0) with equal probability (1/4).
Cov(X, Y) = 0, but X, Y are not independent.
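This counterexample can be verified directly, since the distribution is supported on only four points:

```python
# Counterexample sketch: zero covariance without independence.
pts = [(0, 1), (-1, 0), (0, -1), (1, 0)]   # each with probability 1/4

EX = sum(x for x, _ in pts) / 4            # 0
EY = sum(y for _, y in pts) / 4            # 0
cov = sum((x - EX) * (y - EY) for x, y in pts) / 4
print(cov)  # 0.0 -> uncorrelated

# Yet X and Y are dependent: P(X=0) = P(Y=0) = 1/2, but P(X=0, Y=0) = 0.
pX0 = sum(1 for x, _ in pts if x == 0) / 4
pY0 = sum(1 for _, y in pts if y == 0) / 4
pXY00 = sum(1 for x, y in pts if x == 0 and y == 0) / 4
print(pX0 * pY0, pXY00)  # 0.25 vs 0.0 -> not independent
```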
Independence Implications
g(X, Y) = XY
- If X, Y are independent, E[XY] = E[X]E[Y]
g(X, Y) = h(X)l(Y)
- If X, Y are independent, E[h(X)l(Y)] = E[h(X)]E[l(Y)]
THANK YOU