Module 2 Class

Class Notes by professor for Module 2 of BCS301 CSE

Module II: Joint Probability Distribution and Markov Chain
Topic 1: Joint Probability Distribution of Two Discrete
Random Variables, Expectation, Covariance, and
Correlation in Joint Probability Distribution

Dr. P. Rajendra, Professor, Department of Mathematics,
CMRIT, Bengaluru.
Introduction to Joint Probability Distribution
* A Joint Probability Distribution describes the probability of
two discrete random variables occurring simultaneously.
* If X and Y are two discrete random variables, the joint
probability distribution is denoted by P(X = x, Y = y ),
representing the probability that X = x and Y = y .
* The Joint Probability Mass Function (PMF) is a function
that gives the probability that each pair of outcomes occurs:

P(X = x, Y = y )

* The sum of all joint probabilities must equal 1:

\[ \sum_{x} \sum_{y} P(X = x, Y = y) = 1 \]

* The joint PMF provides a complete description of the joint
distribution of X and Y.
Joint Probability Distribution Table
The joint probability distribution of two discrete random variables
can be represented in a table, with one variable on the rows and
the other on the columns. Each cell in the table represents
P(X = x, Y = y ) for the corresponding values of X and Y .
For example, let X be the number of times a user clicks on
ads per day and Y the number of products purchased per day.
The joint PMF P(X = x, Y = y) then describes the
probability of a user clicking on ads x times and purchasing y
products on a given day. For this example of ad clicks (X) and
product purchases (Y), the table might look like this:

X \Y 0 1 2
0 0.1 0.05 0.02
1 0.15 0.2 0.05
2 0.05 0.1 0.08

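* The values in each cell represent the joint probabilities for
different combinations of X and Y. The entries shown here sum to
0.80, so the table is illustrative only; a complete joint PMF must
have entries that sum to 1.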
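A table like the one above can be checked mechanically. The sketch below sums the cells of the ad-click table and, as one possible repair, rescales them so they total 1. The helper names `total_probability` and `normalize` are illustrative, and rescaling is an assumption made here, not a step taken in these notes.

```python
# Joint table from the ad-click example: rows are X = 0, 1, 2; columns are Y = 0, 1, 2.
joint = [
    [0.10, 0.05, 0.02],
    [0.15, 0.20, 0.05],
    [0.05, 0.10, 0.08],
]

def total_probability(table):
    """Sum every cell of a joint probability table."""
    return sum(sum(row) for row in table)

def normalize(table):
    """Rescale non-negative weights so the cells sum to 1 (an assumed repair)."""
    total = total_probability(table)
    return [[cell / total for cell in row] for row in table]

total = total_probability(joint)
normalized = normalize(joint)

print(round(total, 10))                          # total mass of the table as printed
print(round(total_probability(normalized), 10))  # total mass after rescaling
```

A total different from 1 signals that the table is not yet a valid joint PMF.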
Marginal Probability Distribution
* The Marginal Probability Distribution of a random variable
is obtained by summing the joint probabilities over all possible
values of the other random variable.
* The marginal probability of X is given by:

\[ P(X = x) = \sum_{y} P(X = x, Y = y) \]

* Similarly, the marginal probability of Y is:

\[ P(Y = y) = \sum_{x} P(X = x, Y = y) \]
Independence of Two Random Variables

▶ Two random variables X and Y are independent if the
occurrence of one does not affect the probability of the
occurrence of the other, that is, if for every pair of values x and y:

P(X = x, Y = y ) = P(X = x) · P(Y = y )

▶ If this equality fails for even one pair (x, y), the variables are dependent.

Example:
If the number of ad clicks and the number of products purchased
by a user are independent, then the probability that a user clicks
on ads 3 times and purchases 2 products is simply the product of
the individual probabilities: P(X = 3) · P(Y = 2).
Expectation of Two Discrete Random Variables
▶ The Expectation (or Expected Value) of a discrete random
variable is a measure of its central tendency.
▶ For two discrete random variables X and Y, their individual
expectations are given by:

\[ E[X] = \sum_{x} x \cdot P(X = x) \]

\[ E[Y] = \sum_{y} y \cdot P(Y = y) \]

▶ The joint expectation (or expected value) of the product XY
is given by:

\[ E[XY] = \sum_{x} \sum_{y} x \cdot y \cdot P(X = x, Y = y) \]
Covariance

▶ Covariance measures the joint variability of two random
variables. It indicates whether the variables tend to increase or
decrease together.
▶ The covariance between two random variables X and Y is
defined as:

Cov(X , Y ) = E [(X − E [X ])(Y − E [Y ])]

▶ This can also be expressed as:

Cov(X , Y ) = E [XY ] − E [X ]E [Y ]

▶ A positive covariance indicates that X and Y tend to increase
together, while a negative covariance indicates that one tends
to increase when the other decreases.
Correlation

▶ Correlation is a standardized measure of the linear
relationship between two random variables. It is a
dimensionless quantity that ranges from -1 to 1.
▶ The Pearson correlation coefficient ρXY between two
random variables X and Y is defined as:

\[ \rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} \]

where \( \sigma_X = \sqrt{\mathrm{Var}(X)} \) and \( \sigma_Y = \sqrt{\mathrm{Var}(Y)} \) are the standard
deviations of X and Y, respectively.
▶ A correlation of ρXY = 1 indicates a perfect positive linear
relationship, ρXY = −1 indicates a perfect negative linear
relationship, and ρXY = 0 indicates no linear relationship.
Properties of Covariance and Correlation

▶ Covariance:
(i) Cov(X , X ) = Var(X ), the variance of X .
(ii) Cov(X , Y ) = Cov(Y , X ), symmetry property.
(iii) Cov(X , Y ) = 0 if X and Y are independent.

▶ Correlation:
(i) ρXY is dimensionless and provides a normalized measure of
linear dependence.
(ii) ρXY = 1 or ρXY = −1 indicates perfect linear dependence.
(iii) ρXY = 0 indicates no linear relationship, but X and Y may still
be dependent in a non-linear way.
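Correlation property (iii) can be illustrated with a small example. The sketch below assumes a hypothetical pair that does not appear in the notes: X uniform on {-1, 0, 1} and Y = X². It computes the covariance exactly and then compares one joint probability with the product of the marginals.

```python
from fractions import Fraction

# Hypothetical example: X is uniform on {-1, 0, 1} and Y = X**2.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expectation(joint, g):
    """E[g(X, Y)] for a joint PMF stored as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x = expectation(joint, lambda x, y: x)
e_y = expectation(joint, lambda x, y: y)
e_xy = expectation(joint, lambda x, y: x * y)
cov = e_xy - e_x * e_y

# Marginal probabilities of X = 0 and Y = 0, and the joint probability of both.
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))

print(cov)                    # covariance of X and Y
print(p_x0_y0, p_x0 * p_y0)   # joint probability versus product of marginals
```

The covariance is zero, yet the joint probability does not factor, so the variables are uncorrelated but dependent.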
Applications in AI and Data Science

▶ Naive Bayes Classifier: In AI, the Naive Bayes classifier
assumes that the features (random variables) are independent
given the class label, making use of joint probability
distributions to predict the class of new data points.
▶ Recommendation Systems: Joint probability distributions
can help in understanding the likelihood of a user interacting
with different items simultaneously, leading to better
recommendations.
▶ Market Basket Analysis: Joint probabilities can be used to
analyze the co-occurrence of products in transactions, which
is essential for association rule mining in data science.
▶ Network Traffic Analysis: Joint probability distributions of
different traffic parameters can help in detecting anomalies or
predicting network congestion.
Problem 1
The joint distribution of two random variables X and Y is as
follows:

X \ Y    −4     2     7
1       1/8   1/4   1/8
5       1/4   1/8   1/8

Compute the following:
(i) E(X) and E(Y)   (ii) E(XY)
(iii) σX and σY     (iv) ρ(X, Y)

Solution:
The marginal distribution of X is obtained by summing over the
values of Y:

\[ P(X = 1) = P(1, -4) + P(1, 2) + P(1, 7) = \frac{1}{8} + \frac{1}{4} + \frac{1}{8} = \frac{1}{2} \]

\[ P(X = 5) = P(5, -4) + P(5, 2) + P(5, 7) = \frac{1}{4} + \frac{1}{8} + \frac{1}{8} = \frac{1}{2} \]

The marginal distribution of Y is obtained by summing over the
values of X:

\[ P(Y = -4) = P(1, -4) + P(5, -4) = \frac{1}{8} + \frac{1}{4} = \frac{3}{8} \]

\[ P(Y = 2) = P(1, 2) + P(5, 2) = \frac{1}{4} + \frac{1}{8} = \frac{3}{8} \]

\[ P(Y = 7) = P(1, 7) + P(5, 7) = \frac{1}{8} + \frac{1}{8} = \frac{1}{4} \]
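\[ \therefore\ E(X) = \sum_{x} x \cdot P(X = x) = 1 \cdot \frac{1}{2} + 5 \cdot \frac{1}{2} = 3 \]

and,

\[ E(Y) = \sum_{y} y \cdot P(Y = y) = -4 \cdot \frac{3}{8} + 2 \cdot \frac{3}{8} + 7 \cdot \frac{1}{4} = 1 \]

Also,

\[ E(XY) = \sum_{x} \sum_{y} x \cdot y \cdot P(X = x, Y = y) \]

\[ = \left(1 \cdot (-4) \cdot \frac{1}{8}\right) + \left(1 \cdot 2 \cdot \frac{1}{4}\right) + \left(1 \cdot 7 \cdot \frac{1}{8}\right) + \left(5 \cdot (-4) \cdot \frac{1}{4}\right) + \left(5 \cdot 2 \cdot \frac{1}{8}\right) + \left(5 \cdot 7 \cdot \frac{1}{8}\right) = \frac{3}{2} \]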

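\[ E(X^2) = \sum_{x} x^2 \cdot P(X = x) = 1^2 \cdot \frac{1}{2} + 5^2 \cdot \frac{1}{2} = 13 \]

\[ \sigma_X^2 = E(X^2) - (E(X))^2 = 13 - 3^2 = 4, \qquad \sigma_X = \sqrt{4} = 2 \]

\[ E(Y^2) = \sum_{y} y^2 \cdot P(Y = y) = (-4)^2 \cdot \frac{3}{8} + 2^2 \cdot \frac{3}{8} + 7^2 \cdot \frac{1}{4} = 19.75 \]

\[ \sigma_Y^2 = E(Y^2) - (E(Y))^2 = 19.75 - 1^2 = 18.75, \qquad \sigma_Y = \sqrt{18.75} \approx 4.33 \]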
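Finally, we calculate the correlation coefficient ρ(X, Y):

\[ \mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) \]

\[ \therefore\ \mathrm{Cov}(X, Y) = \frac{3}{2} - (3 \cdot 1) = -\frac{3}{2} \]

\[ \rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} \]

\[ \therefore\ \rho(X, Y) = \frac{-\frac{3}{2}}{2 \times 4.33} \approx -0.173 \]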
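The arithmetic in Problem 1 can be double-checked with a short script. This is a minimal sketch using exact fractions; the function name `summary` and the dictionary keys are illustrative choices, not notation from the notes.

```python
from fractions import Fraction as F
from math import sqrt

x_values = [1, 5]
y_values = [-4, 2, 7]
# Rows follow x_values, columns follow y_values (table from Problem 1).
table = [
    [F(1, 8), F(1, 4), F(1, 8)],
    [F(1, 4), F(1, 8), F(1, 8)],
]

def summary(x_values, y_values, table):
    """Marginals, expectations, covariance and correlation of a joint table."""
    p_x = [sum(row) for row in table]            # sum over Y for each X
    p_y = [sum(col) for col in zip(*table)]      # sum over X for each Y
    e_x = sum(x * p for x, p in zip(x_values, p_x))
    e_y = sum(y * p for y, p in zip(y_values, p_y))
    e_xy = sum(x * y * table[i][j]
               for i, x in enumerate(x_values)
               for j, y in enumerate(y_values))
    var_x = sum(x * x * p for x, p in zip(x_values, p_x)) - e_x ** 2
    var_y = sum(y * y * p for y, p in zip(y_values, p_y)) - e_y ** 2
    cov = e_xy - e_x * e_y
    rho = float(cov) / sqrt(var_x * var_y)
    return {"p_x": p_x, "p_y": p_y, "e_x": e_x, "e_y": e_y,
            "e_xy": e_xy, "var_x": var_x, "var_y": var_y,
            "cov": cov, "rho": rho}

result = summary(x_values, y_values, table)
print(result["e_x"], result["e_y"], result["e_xy"], result["cov"])
print(round(result["rho"], 3))
```

Keeping the probabilities as `Fraction` values avoids rounding until the final square root.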
Problem 2
Determine,
(i). Marginal distributions of X and Y.
(ii). Covariance between the discrete random variables X and
Y.
The joint probability distribution is given as:

X \ Y     3      4      5
2        1/6    1/6    1/6
5        1/12   1/12   1/12
7        1/12   1/12   1/12
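(i) Marginal Distributions: The marginal distribution of X is
calculated by summing over the probabilities of all values of Y for
each X.

\[ P(X = 2) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{3}{6} = \frac{1}{2} \]

\[ P(X = 5) = \frac{1}{12} + \frac{1}{12} + \frac{1}{12} = \frac{3}{12} = \frac{1}{4} \]

\[ P(X = 7) = \frac{1}{12} + \frac{1}{12} + \frac{1}{12} = \frac{3}{12} = \frac{1}{4} \]

The marginal distribution of Y is calculated by summing over the
probabilities of all values of X for each Y.

\[ P(Y = 3) = \frac{1}{6} + \frac{1}{12} + \frac{1}{12} = \frac{4}{12} = \frac{1}{3} \]

\[ P(Y = 4) = \frac{1}{6} + \frac{1}{12} + \frac{1}{12} = \frac{4}{12} = \frac{1}{3} \]

\[ P(Y = 5) = \frac{1}{6} + \frac{1}{12} + \frac{1}{12} = \frac{4}{12} = \frac{1}{3} \]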
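(ii) The covariance Cov(X, Y) is calculated as:

\[ \mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) \]

where

\[ E(XY) = \sum_{i} \sum_{j} x_i y_j P(X = x_i, Y = y_j) \]

We compute:

\[ E(X) = 2 \times \frac{1}{2} + 5 \times \frac{1}{4} + 7 \times \frac{1}{4} = 2 \times 0.5 + 5 \times 0.25 + 7 \times 0.25 = 4 \]

\[ E(Y) = 3 \times \frac{1}{3} + 4 \times \frac{1}{3} + 5 \times \frac{1}{3} = 4 \]

Next, calculate E(XY):

\[ E(XY) = \left(2 \times 3 \times \frac{1}{6}\right) + \left(2 \times 4 \times \frac{1}{6}\right) + \left(2 \times 5 \times \frac{1}{6}\right) + \left(5 \times 3 \times \frac{1}{12}\right) + \left(5 \times 4 \times \frac{1}{12}\right) \]
\[ + \left(5 \times 5 \times \frac{1}{12}\right) + \left(7 \times 3 \times \frac{1}{12}\right) + \left(7 \times 4 \times \frac{1}{12}\right) + \left(7 \times 5 \times \frac{1}{12}\right) = 16 \]

Thus,

\[ \mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = 16 - 4 \times 4 = 0 \]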
Problem 3
X and Y are independent random variables. X takes the values 2,
5, and 7 with probabilities 1/2, 1/4, and 1/4, respectively. Y takes the
values 3, 4, and 5 with probabilities 1/3, 1/3, and 1/3, respectively.
(i). Find the Joint Probability Distribution of X and Y.
(ii). Show that COV(X, Y) = 0.

Solution (i): Since X and Y are independent random variables, the
joint probability distribution can be found as the product of their
marginal probabilities:

P(X = x, Y = y) = P(X = x) · P(Y = y)

Joint Probability Table:

X \ Y     3      4      5
2        1/6    1/6    1/6
5        1/12   1/12   1/12
7        1/12   1/12   1/12

Each entry in the table is computed by multiplying the marginal
probabilities, for example:

\[ P(X = 2, Y = 3) = P(X = 2) \times P(Y = 3) = \frac{1}{2} \times \frac{1}{3} = \frac{1}{6} \]
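Solution (ii): To show that COV(X, Y) = 0, we first need to
calculate E(X), E(Y), and E(XY).

\[ E(X) = 2 \times \frac{1}{2} + 5 \times \frac{1}{4} + 7 \times \frac{1}{4} = 1 + \frac{5}{4} + \frac{7}{4} = 4 \]

\[ E(Y) = 3 \times \frac{1}{3} + 4 \times \frac{1}{3} + 5 \times \frac{1}{3} = \frac{3}{3} + \frac{4}{3} + \frac{5}{3} = 4 \]

Using the joint probability distribution, we can compute E(XY):

\[ E(XY) = 2 \times 3 \times \frac{1}{6} + 2 \times 4 \times \frac{1}{6} + 2 \times 5 \times \frac{1}{6} + 5 \times 3 \times \frac{1}{12} + 5 \times 4 \times \frac{1}{12} \]
\[ + 5 \times 5 \times \frac{1}{12} + 7 \times 3 \times \frac{1}{12} + 7 \times 4 \times \frac{1}{12} + 7 \times 5 \times \frac{1}{12} = 16 \]

The covariance formula is:

\[ \mathrm{COV}(X, Y) = E(XY) - E(X)E(Y) \]

Substituting the values:

\[ \mathrm{COV}(X, Y) = 16 - 4 \times 4 = 16 - 16 = 0 \]

Hence, COV(X, Y) = 0.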
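The construction in Problem 3 can be reproduced in a few lines. The sketch below builds the joint table as the product of the two marginals and then recomputes the covariance; the dictionary layout is an illustrative choice.

```python
from fractions import Fraction as F

# Marginal distributions given in Problem 3.
p_x = {2: F(1, 2), 5: F(1, 4), 7: F(1, 4)}
p_y = {3: F(1, 3), 4: F(1, 3), 5: F(1, 3)}

# Independence: every joint probability is the product of the marginals.
joint = {(x, y): px * py for x, px in p_x.items() for y, py in p_y.items()}

e_x = sum(x * p for x, p in p_x.items())
e_y = sum(y * p for y, p in p_y.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov = e_xy - e_x * e_y

print(joint[(2, 3)], joint[(5, 4)])  # sample cells of the joint table
print(e_x, e_y, e_xy, cov)
```

Because the table factors by construction, E(XY) equals E(X)E(Y) and the covariance vanishes.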
Problem 4 Determine the value of k so that the function
f(x, y) = k|x − y|, for x = −2, 0, 2 and y = −2, 3, represents the
joint probability distribution of the random variables X and Y.
Also, determine Cov(X, Y).
Solution:
For f(x, y) to represent a valid joint probability distribution, the
sum of all probabilities must be 1, i.e.,

\[ \sum_{x, y} f(x, y) = 1 \]

Given f(x, y) = k|x − y|, we calculate for all pairs of x and y:

\[ f(-2, -2) = k|-2 - (-2)| = 0 \]
\[ f(-2, 3) = k|-2 - 3| = 5k \]
\[ f(0, -2) = k|0 - (-2)| = 2k \]
\[ f(0, 3) = k|0 - 3| = 3k \]
\[ f(2, -2) = k|2 - (-2)| = 4k \]
\[ f(2, 3) = k|2 - 3| = k \]

Summing these values:

\[ 0 + 5k + 2k + 3k + 4k + k = 15k \]

Thus, to satisfy \( \sum f(x, y) = 1 \), we have:

\[ 15k = 1 \implies k = \frac{1}{15} \]
The marginal distribution of X, f_X(x), is found by summing over y:

\[ f_X(x) = \sum_{y} f(x, y) \]

\[ f_X(-2) = f(-2, -2) + f(-2, 3) = 0 + \frac{5}{15} = \frac{1}{3}, \]
\[ f_X(0) = f(0, -2) + f(0, 3) = \frac{2}{15} + \frac{3}{15} = \frac{1}{3}, \]
\[ f_X(2) = f(2, -2) + f(2, 3) = \frac{4}{15} + \frac{1}{15} = \frac{1}{3}. \]

Similarly, the marginal distribution of Y, f_Y(y), is found by summing
over x:

\[ f_Y(y) = \sum_{x} f(x, y) \]

\[ f_Y(-2) = f(-2, -2) + f(0, -2) + f(2, -2) = 0 + \frac{2}{15} + \frac{4}{15} = \frac{2}{5}, \]
\[ f_Y(3) = f(-2, 3) + f(0, 3) + f(2, 3) = \frac{5}{15} + \frac{3}{15} + \frac{1}{15} = \frac{3}{5}. \]
Covariance is given by:

\[ \mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) \]

Expected value of X:

\[ E(X) = \sum_{x, y} x f(x, y) = (-2) \times \frac{1}{3} + 0 + 2 \times \frac{1}{3} = 0 \]

Expected value of Y:

\[ E(Y) = (-2) \times \frac{6}{15} + 3 \times \frac{9}{15} = 1 \]

Expected value of XY (terms with x = 0 or f(x, y) = 0 vanish):

\[ E(XY) = (-2)(3) \times \frac{5}{15} + 2(-2) \times \frac{4}{15} + 2(3) \times \frac{1}{15} = -\frac{8}{3} \]

Substituting these values:

\[ \mathrm{Cov}(X, Y) = -\frac{8}{3} - (0)(1) = -\frac{8}{3} \]
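The normalisation step and the covariance in Problem 4 can be verified as follows. This is a sketch under the problem's own definition f(x, y) = k|x − y|; the variable names are illustrative.

```python
from fractions import Fraction as F

xs = [-2, 0, 2]
ys = [-2, 3]

# Unnormalised weights |x - y|; k is chosen so the probabilities sum to 1.
weights = {(x, y): abs(x - y) for x in xs for y in ys}
k = F(1, sum(weights.values()))
joint = {pair: k * w for pair, w in weights.items()}

e_x = sum(x * p for (x, y), p in joint.items())
e_y = sum(y * p for (x, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov = e_xy - e_x * e_y

print(k)                   # normalising constant
print(e_x, e_y, e_xy, cov)
```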
Problem 5 Given the joint probability distribution f(x, y) = (x + y)/30,
where x = 0, 1, 2, 3 and y = 0, 1, 2, find the following:
1. P[X ≤ 2, Y = 1]
2. P[X > 2, Y ≤ 1]
3. P[X > Y]

Solution: The joint probability distribution f(x, y) = (x + y)/30 for
x = 0, 1, 2, 3 and y = 0, 1, 2 is shown below:

x \ y     0      1      2
0         0     1/30   2/30
1        1/30   2/30   3/30
2        2/30   3/30   4/30
3        3/30   4/30   5/30
1. P[X ≤ 2, Y = 1] = f(0, 1) + f(1, 1) + f(2, 1)
From the table:

\[ P[X \le 2, Y = 1] = \frac{1}{30} + \frac{2}{30} + \frac{3}{30} = \frac{6}{30} = \frac{1}{5} \]

2. P[X > 2, Y ≤ 1] = f(3, 0) + f(3, 1)
From the table:

\[ P[X > 2, Y \le 1] = \frac{3}{30} + \frac{4}{30} = \frac{7}{30} \]

3. P[X > Y] = f(1, 0) + f(2, 0) + f(2, 1) + f(3, 0) + f(3, 1) + f(3, 2)
From the table:

\[ P[X > Y] = \frac{1}{30} + \frac{2}{30} + \frac{3}{30} + \frac{3}{30} + \frac{4}{30} + \frac{5}{30} = \frac{18}{30} = \frac{3}{5} \]
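Event probabilities such as those in Problem 5 amount to summing f(x, y) over the pairs that satisfy a condition. The sketch below expresses each event as a predicate; the helper name `prob` is an illustrative choice.

```python
from fractions import Fraction as F

def f(x, y):
    """Joint PMF from Problem 5."""
    return F(x + y, 30)

def prob(event):
    """Sum f(x, y) over all pairs in the support for which event(x, y) holds."""
    return sum(f(x, y) for x in range(4) for y in range(3) if event(x, y))

total = prob(lambda x, y: True)
p1 = prob(lambda x, y: x <= 2 and y == 1)
p2 = prob(lambda x, y: x > 2 and y <= 1)
p3 = prob(lambda x, y: x > y)

print(total, p1, p2, p3)
```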
Assignment Problems
Problem 1:
The joint probability distribution of X and Y is as follows:
X \Y −3 2 4
1 0.1 0.2 0.2
2 0.3 0.1 0.1

i) Determine the marginal distributions of X and Y.
ii) Find the covariance between X and Y along with the
correlation.
Problem 2:
The joint probability distribution of X and Y is as follows:
X \ Y     1      3      6
1        1/9    1/6    1/18
3        1/6    1/4    1/12
6        1/18   1/12   1/36

i) Find the marginal distributions of X and Y.
ii) Determine whether X and Y are statistically independent.
Stochastic Processes, Probability Vectors, and
Stochastic Matrices

Dr. P. Rajendra, Professor, CMRIT

Stochastic Processes

A Stochastic Process consists of a sequence of experiments in which
each experiment has a finite number of outcomes with given
probabilities.
The process models the evolution of a system over time, where the
outcome at each step is uncertain.

Example
A recommendation system might treat a user’s interactions as a stochastic
process. The system has probabilities associated with which product a user
will click on next, based on past interactions.

Probability Vectors

A Probability Vector is a vector V = [v_1, v_2, ..., v_n], where each
component is non-negative and the sum of all components equals 1:

\[ \sum_{i=1}^{n} v_i = 1 \quad \text{with } 0 \le v_i \le 1. \]

Each v_i represents the probability of the system being in state i.

Examples:

\[ V_1 = [0.1,\ 0.6,\ 0.3], \qquad V_2 = \left[\frac{1}{3},\ \frac{1}{3},\ \frac{1}{3}\right] \]
Stochastic Matrices
A Stochastic Matrix is a square matrix P in which all entries are
non-negative and the sum of the entries in each row equals 1. Each
row of the matrix represents a probability vector:

\[ P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{bmatrix} \]

The matrix models transitions between different states in a stochastic
process. The entry p_{ij} represents the probability of transitioning from
state i to state j.

Example:

\[ P = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{2}{3} & \frac{1}{3} \end{bmatrix} \quad \text{or} \quad P = \begin{bmatrix} 0 & 1 \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix} \]
Regular Stochastic Matrices

A Regular Stochastic Matrix is a stochastic matrix P where some
power of P, say P^n, has only positive entries. This means that it is
possible to transition from any state to any other state in the system
in exactly n steps.
Regular stochastic matrices guarantee the existence of a unique steady-state
probability vector Q, such that QP = Q.

Example:
If P is:

\[ P = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{4} & \frac{3}{4} \end{bmatrix} \]

then P itself (the case n = 1) already has all positive entries, and so does P^2,
indicating that it is a regular stochastic matrix.
Transition Matrices

A Transition Matrix, also known as a Stochastic Matrix, is used to
represent the transition probabilities of a system moving from one
state to another.
For an n-state system, the transition matrix is an n × n matrix where
the entry pij represents the probability of transitioning from state i to
state j.
Transition matrices are heavily used in Hidden Markov Models
(HMMs), which are popular in sequence prediction tasks in AI, such
as speech recognition or part-of-speech tagging.

Example:

\[ P = \begin{bmatrix} 0 & 1 & 0 \\ \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \end{bmatrix} \]
Problem 1 Given the matrices

\[ P_1 = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix}, \qquad P_2 = \begin{bmatrix} 1-b & b \\ a & 1-a \end{bmatrix} \]

with 0 ≤ a ≤ 1 and 0 ≤ b ≤ 1, show that: (i). P1 is a stochastic matrix. (ii). P2 is a stochastic matrix.
(iii). P1 P2 is a stochastic matrix.
Solution:
For P1:
Row 1 sum = (1 − a) + a = 1
Row 2 sum = b + (1 − b) = 1
Since all entries are non-negative (because 0 ≤ a, b ≤ 1) and each row sums to 1, P1 is a
stochastic matrix.
For P2:
Row 1 sum = (1 − b) + b = 1
Row 2 sum = a + (1 − a) = 1
Since all entries are non-negative and each row sums to 1, P2 is a
stochastic matrix.
Now, calculate the product P1 P2:

\[ P_1 P_2 = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix} \begin{bmatrix} 1-b & b \\ a & 1-a \end{bmatrix} \]

Perform the matrix multiplication:

\[ P_1 P_2 = \begin{bmatrix} (1-a)(1-b) + a \cdot a & (1-a)b + a(1-a) \\ b(1-b) + (1-b)a & b \cdot b + (1-b)(1-a) \end{bmatrix} \]

Simplifying the terms:

\[ P_1 P_2 = \begin{bmatrix} 1 - a - b + ab + a^2 & b - ab + a - a^2 \\ b - b^2 + a - ab & b^2 + 1 - b - a + ab \end{bmatrix} \]

Every entry is a sum of products of non-negative numbers, so all entries are non-negative.
To show P1 P2 is stochastic, we check the row sums:
Row 1: (1 − a − b + ab + a^2) + (b − ab + a − a^2) = 1
Row 2: (b − b^2 + a − ab) + (b^2 + 1 − b − a + ab) = 1
Thus, P1 P2 is a stochastic matrix.
Problem 2
Verify that the matrix

\[ P = \begin{bmatrix} 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \\ 0 & 1 & 0 \end{bmatrix} \]

is a regular stochastic matrix.

Solution:
A matrix is stochastic if all of its entries are non-negative and the
sum of the entries in each row equals 1.
A stochastic matrix is regular if some power of the matrix has all
positive entries.
Row 1: 0 + 0 + 1 = 1,
Row 2: 1/2 + 1/4 + 1/4 = 1,
Row 3: 0 + 1 + 0 = 1.
Since all entries are non-negative and all the row sums are 1, P is a stochastic matrix.
     
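\[ P^2 = P \times P = \begin{bmatrix} 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \\ 0 & 1 & 0 \end{bmatrix} \times \begin{bmatrix} 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \\ 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ \frac{1}{8} & \frac{5}{16} & \frac{9}{16} \\ \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \end{bmatrix} \]

Since P^2 contains some zero entries, we check P^3:

\[ P^3 = P^2 \times P = \begin{bmatrix} 0 & 1 & 0 \\ \frac{1}{8} & \frac{5}{16} & \frac{9}{16} \\ \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \end{bmatrix} \times \begin{bmatrix} 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \\ 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \\ \frac{5}{32} & \frac{41}{64} & \frac{13}{64} \\ \frac{1}{8} & \frac{5}{16} & \frac{9}{16} \end{bmatrix} \]

Since P^3 contains all positive entries, P is a regular stochastic matrix.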
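The regularity check above (raise P to successive powers until every entry is positive) can be automated. The following is a minimal sketch; the function names and the cutoff `max_power=20` are assumptions made for illustration, not part of the notes.

```python
from fractions import Fraction as F

def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def first_positive_power(p, max_power=20):
    """Smallest m <= max_power such that every entry of P^m is positive, else None.

    max_power is an arbitrary search cutoff chosen for this sketch.
    """
    power = p
    for m in range(1, max_power + 1):
        if all(entry > 0 for row in power for entry in row):
            return m
        power = mat_mul(power, p)
    return None

# Matrix from Problem 2.
P = [
    [F(0), F(0), F(1)],
    [F(1, 2), F(1, 4), F(1, 4)],
    [F(0), F(1), F(0)],
]

P3 = mat_mul(mat_mul(P, P), P)
print(first_positive_power(P))
print(P3[1])  # second row of P^3
```

A return value of `None` only means no positive power was found up to the cutoff; it does not by itself prove that the matrix is not regular.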

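Problem 3: Find the fixed probability vector for the regular stochastic
matrix

\[ A = \begin{bmatrix} \frac{1}{3} & \frac{2}{3} \\ \frac{1}{4} & \frac{3}{4} \end{bmatrix} \]

Solution:
Since the matrix A is of second order, let the fixed probability vector be
Q = [x  y], where x ≥ 0, y ≥ 0, and x + y = 1.
Now,

\[ QA = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} \frac{1}{3} & \frac{2}{3} \\ \frac{1}{4} & \frac{3}{4} \end{bmatrix} = \begin{bmatrix} \frac{1}{3}x + \frac{1}{4}y & \frac{2}{3}x + \frac{3}{4}y \end{bmatrix} \]

Since QA = Q, we have:

\[ \begin{bmatrix} \frac{1}{3}x + \frac{1}{4}y & \frac{2}{3}x + \frac{3}{4}y \end{bmatrix} = \begin{bmatrix} x & y \end{bmatrix} \]

This gives us the following equations:

\[ \frac{1}{3}x + \frac{1}{4}y = x \quad \text{and} \quad \frac{2}{3}x + \frac{3}{4}y = y \]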
From the first equation:

\[ \frac{2}{3}x = \frac{1}{4}y \tag{1} \]

We also know x + y = 1, so y = 1 − x.
Substitute y = 1 − x into equation (1):

\[ \frac{2}{3}x = \frac{1}{4}(1 - x) \]

Simplifying:

\[ \frac{2}{3}x + \frac{1}{4}x = \frac{1}{4} \]

\[ \frac{8x + 3x}{12} = \frac{1}{4} \]

\[ 11x = 3 \implies x = \frac{3}{11} \]

Thus,

\[ y = 1 - x = 1 - \frac{3}{11} = \frac{8}{11} \]

Therefore, the required fixed probability vector is:

\[ Q = \begin{bmatrix} \frac{3}{11} & \frac{8}{11} \end{bmatrix} \]
Problem 4:
Find the fixed probability vector for the regular stochastic matrix

\[ P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{2} & 0 \end{bmatrix} \]

Solution:
Since the given matrix P is of order 3 × 3, let the required fixed
probability vector be Q = [x  y  z], where x ≥ 0, y ≥ 0, z ≥ 0, and
x + y + z = 1. It must satisfy:

QP = Q

That is:

\[ \begin{bmatrix} x & y & z \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{2} & 0 \end{bmatrix} = \begin{bmatrix} \frac{z}{2} & x + \frac{z}{2} & y \end{bmatrix} = \begin{bmatrix} x & y & z \end{bmatrix} \]
This gives the system of equations:

\[ \frac{z}{2} = x \tag{1} \]
\[ x + \frac{z}{2} = y \tag{2} \]
\[ y = z \tag{3} \]

From (1), z = 2x, and from (3), y = z = 2x. We know that x + y + z = 1, therefore:

\[ x + 2x + 2x = 1 \implies x = \frac{1}{5} \]

so that:

\[ x = \frac{1}{5}, \quad y = \frac{2}{5}, \quad z = \frac{2}{5} \]

Thus, the required fixed probability vector is:

\[ Q = \begin{bmatrix} \frac{1}{5} & \frac{2}{5} & \frac{2}{5} \end{bmatrix} \]
Problem 5: Find the fixed probability vector of the regular stochastic
matrix:

\[ P = \begin{bmatrix} 0 & \frac{2}{3} & \frac{1}{3} \\ \frac{1}{2} & 0 & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & 0 \end{bmatrix} \]

Solution:
Since the given matrix P is of order 3 × 3, the required fixed
probability vector Q must be of order 1 × 3. Let Q = [x  y  z],
where x ≥ 0, y ≥ 0, z ≥ 0, and x + y + z = 1. Also, QP = Q.

\[ QP = \begin{bmatrix} x & y & z \end{bmatrix} \begin{bmatrix} 0 & \frac{2}{3} & \frac{1}{3} \\ \frac{1}{2} & 0 & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & 0 \end{bmatrix} \]

\[ QP = \begin{bmatrix} \frac{1}{2}y + \frac{1}{2}z & \frac{2}{3}x + \frac{1}{2}z & \frac{1}{3}x + \frac{1}{2}y \end{bmatrix} \]
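We know that QP = Q, so:

\[ \begin{bmatrix} \frac{1}{2}y + \frac{1}{2}z & \frac{2}{3}x + \frac{1}{2}z & \frac{1}{3}x + \frac{1}{2}y \end{bmatrix} = \begin{bmatrix} x & y & z \end{bmatrix} \]

From the equation QP = Q, we obtain the following system of equations:

\[ x = \frac{1}{2}y + \frac{1}{2}z, \quad y = \frac{2}{3}x + \frac{1}{2}z, \quad z = \frac{1}{3}x + \frac{1}{2}y \]

Additionally, we have the constraint:

\[ x + y + z = 1 \]

Eliminating z = 1 − x − y from this system, we get the following relationships:

\[ 3x - 1 = 0, \quad x - 9y = -3, \quad 8x + 9y = 6 \]

Solving these, we get:

\[ x = \frac{1}{3}, \quad y = \frac{10}{27}, \quad z = \frac{8}{27} \]

Thus, the fixed probability vector Q is:

\[ Q = \begin{bmatrix} \frac{1}{3} & \frac{10}{27} & \frac{8}{27} \end{bmatrix} \]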
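The fixed-vector problems above all reduce to a linear system: the balance equations from QP = Q together with the normalisation x + y + z = 1. The sketch below solves that system exactly by Gauss-Jordan elimination. It assumes the matrix is regular, so the solution is unique; `fixed_vector` is an illustrative name, and the notes themselves solve these systems by hand.

```python
from fractions import Fraction as F

def fixed_vector(p):
    """Solve QP = Q with sum(Q) = 1 for a regular stochastic matrix p.

    Assumes the solution is unique (true for regular matrices); a singular
    system would make the pivot search below fail.
    """
    n = len(p)
    # Balance equations for columns 0..n-2: sum_i q_i * (p[i][j] - delta_ij) = 0.
    a = [[F(p[i][j]) - (1 if i == j else 0) for i in range(n)] + [F(0)]
         for j in range(n - 1)]
    # The last balance equation is redundant; replace it with sum_i q_i = 1.
    a.append([F(1)] * n + [F(1)])
    # Gauss-Jordan elimination on the augmented matrix.
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [v / a[col][col] for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][n] for i in range(n)]

A = [[F(1, 3), F(2, 3)], [F(1, 4), F(3, 4)]]                            # Problem 3
P4 = [[F(0), F(1), F(0)], [F(0), F(0), F(1)], [F(1, 2), F(1, 2), F(0)]]  # Problem 4
P5 = [[F(0), F(2, 3), F(1, 3)], [F(1, 2), F(0), F(1, 2)],
      [F(1, 2), F(1, 2), F(0)]]                                          # Problem 5

print(fixed_vector(A))
print(fixed_vector(P4))
print(fixed_vector(P5))
```

Dropping one balance equation is safe because the columns of P − I sum to the zero vector, so any one of the equations follows from the others.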

Assignment Problems

1. Find the fixed probability vector of the regular stochastic matrix:

\[ P = \begin{bmatrix} \frac{1}{2} & \frac{1}{4} & \frac{1}{4} \\ \frac{1}{2} & 0 & \frac{1}{2} \\ 0 & 1 & 0 \end{bmatrix} \]

2. Find the fixed probability vector of the regular stochastic matrix:

\[ P = \begin{bmatrix} 0 & 1 & 0 \\ \frac{1}{6} & \frac{1}{2} & \frac{1}{3} \\ 0 & \frac{2}{3} & \frac{1}{3} \end{bmatrix} \]
Markov Chains and Transition Probabilities

Dr.P.Rajendra, Professor, CMRIT



Markov Chains
A Markov Chain is a stochastic model that describes a sequence of
events where the probability of each event depends only on the state
of the previous event.
The system is modeled as transitioning between a finite number of
states.



Higher Transition Probabilities
Let P be the m × m transition probability matrix of a Markov chain with m states,
where p_{ij} represents the probability of moving from state a_i to state
a_j. This is called a 1-step transition. The probability of
moving from state a_i to state a_j in exactly n steps is the n-step
transition probability, denoted p_{ij}^{(n)}. The matrix formed by these
probabilities is called the n-step transition matrix, denoted by P^n:

\[ P^n = P \times P \times \cdots \times P \quad (n \text{ times}) \]

The initial probability distribution of the system is represented by the
vector \( p^{(0)} = [p_1^{(0)}, p_2^{(0)}, \ldots, p_m^{(0)}] \), and the probability distribution
after n steps is given by:

\[ p^{(n)} = p^{(0)} P^n \]

If P is the transition matrix and p^{(0)} is the initial distribution:

\[ p^{(1)} = p^{(0)} P, \quad p^{(2)} = p^{(1)} P = p^{(0)} P^2, \quad \ldots, \quad p^{(n)} = p^{(0)} P^n \]
Stationary Distribution of Regular Markov Chains

A Regular Markov Chain is one where, after a certain number of
steps, it is possible to transition from any state to any other state
with a positive probability.
A Stationary Distribution is a probability distribution
π = [π1 , π2 , . . . , πn ] that remains unchanged as the system evolves
over time:
πP = π
The stationary distribution represents the long-term behavior of the
Markov Chain. In AI, this is useful for determining the steady-state
probabilities of a system. For example, in Google’s PageRank
algorithm, the stationary distribution of the Markov chain is used to
rank web pages based on their importance.



Absorbing States

An Absorbing State is a state in a Markov Chain where, once
entered, there is no possibility of leaving. In other words, if the system
transitions into an absorbing state, it will remain there indefinitely.
A state ai is an absorbing state if pii = 1 and pij = 0 for all j ̸= i.
Absorbing Markov chains are used in modeling systems where certain
states represent a final outcome or irreversible decision.

Example:
Consider a Markov chain with the following transition matrix:

\[ P = \begin{bmatrix} 1 & 0 & 0 \\ 0.4 & 0.6 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

Here, states 1 and 3 are absorbing states, since p_{11} = 1 and p_{33} = 1.
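The condition p_ii = 1 is easy to test programmatically. The sketch below scans the diagonal of a transition matrix and reports absorbing states using 1-based labels. The ordering of the second row of the example matrix is an assumption (the source layout is ambiguous), but it does not affect which states are absorbing; `absorbing_states` and the tolerance `tol` are illustrative.

```python
def absorbing_states(p, tol=1e-12):
    """Return 1-based indices i with p_ii = 1 (within a small tolerance).

    For a stochastic matrix, p_ii = 1 forces every other entry of row i to be 0.
    """
    return [i + 1 for i, row in enumerate(p) if abs(row[i] - 1) < tol]

# Example matrix; the order of the entries in row 2 is assumed here.
P = [
    [1.0, 0.0, 0.0],
    [0.4, 0.6, 0.0],
    [0.0, 0.0, 1.0],
]

print(absorbing_states(P))
```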



Problem 1

Prove that the Markov Chain with transition matrix

\[ A = \begin{bmatrix} 0 & \frac{2}{3} & \frac{1}{3} \\ \frac{1}{2} & 0 & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & 0 \end{bmatrix} \]

is irreducible.
Solution:
The given matrix A is a stochastic matrix, being a transition matrix.
This means that:
All the elements of the matrix are non-negative.
The sum of the elements in each row is equal to 1.
We now calculate A^2 to check for the irreducibility of the Markov
Chain.


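\[ A^2 = A \times A = \begin{bmatrix} 0 & \frac{2}{3} & \frac{1}{3} \\ \frac{1}{2} & 0 & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & 0 \end{bmatrix} \times \begin{bmatrix} 0 & \frac{2}{3} & \frac{1}{3} \\ \frac{1}{2} & 0 & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & \frac{1}{6} & \frac{1}{3} \\ \frac{1}{4} & \frac{7}{12} & \frac{1}{6} \\ \frac{1}{4} & \frac{1}{3} & \frac{5}{12} \end{bmatrix} \]

All the entries in A^2 are strictly positive, and the sum of each row
equals 1.
Hence, the matrix A is a regular stochastic matrix.
Since the matrix A is regular, every state can be reached from every other
state, so the given Markov Chain is irreducible.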


Topic 4 : Problems on Markov Chains

Dr. P. Rajendra, Professor, CMRIT



Problem 1

The transition matrix P of a Markov chain is given by:

\[ P = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{3}{4} & \frac{1}{4} \end{bmatrix} \]

with initial probability distribution \( p^{(0)} = \begin{bmatrix} \frac{1}{4} & \frac{3}{4} \end{bmatrix} \).
Define and find the following:
(i) \( p_{21}^{(2)} \)
(ii) \( p_{12}^{(2)} \)
(iii) \( p^{(2)} \)
(iv) \( p_1^{(2)} \)
(v) The vector \( p^{(0)} P^n \) as n → ∞
(vi) The matrix \( P^n \) as n → ∞


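Solution:
The given transition matrix is:

\[ P = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{3}{4} & \frac{1}{4} \end{bmatrix} \]

We first compute the square of the matrix, P^2:

\[ P^2 = P \cdot P = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{3}{4} & \frac{1}{4} \end{bmatrix} \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{3}{4} & \frac{1}{4} \end{bmatrix} = \begin{bmatrix} \frac{5}{8} & \frac{3}{8} \\ \frac{9}{16} & \frac{7}{16} \end{bmatrix} = \begin{bmatrix} p_{11}^{(2)} & p_{12}^{(2)} \\ p_{21}^{(2)} & p_{22}^{(2)} \end{bmatrix} \]

From P^2, we have:

\[ p_{21}^{(2)} = \frac{9}{16}, \qquad p_{12}^{(2)} = \frac{3}{8} \]

The initial probability distribution is:

\[ p^{(0)} = \begin{bmatrix} \frac{1}{4} & \frac{3}{4} \end{bmatrix} \]

Now, compute \( p^{(2)} = p^{(0)} P^2 \):

\[ p^{(2)} = \begin{bmatrix} \frac{1}{4} & \frac{3}{4} \end{bmatrix} \begin{bmatrix} \frac{5}{8} & \frac{3}{8} \\ \frac{9}{16} & \frac{7}{16} \end{bmatrix} = \begin{bmatrix} \frac{37}{64} & \frac{27}{64} \end{bmatrix} \]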

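From the result of \( p^{(2)} \), we have:

\[ p_1^{(2)} = \frac{37}{64} \]

We now find the steady-state (long-run) probability vector Q = [x  y],
with x + y = 1, such that:

\[ QP = Q \]

\[ \implies \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{3}{4} & \frac{1}{4} \end{bmatrix} = \begin{bmatrix} x & y \end{bmatrix} \]

\[ \implies \begin{bmatrix} \frac{x}{2} + \frac{3y}{4} & \frac{x}{2} + \frac{y}{4} \end{bmatrix} = \begin{bmatrix} x & y \end{bmatrix} \]

\[ \therefore\ \frac{x}{2} + \frac{3y}{4} = x, \quad \frac{x}{2} + \frac{y}{4} = y \quad \text{gives} \quad x = \frac{3}{5}, \quad y = \frac{2}{5} \]

\[ Q = \begin{bmatrix} \frac{3}{5} & \frac{2}{5} \end{bmatrix} \]

Hence, as n → ∞, \( p^{(0)} P^n \to Q = \begin{bmatrix} \frac{3}{5} & \frac{2}{5} \end{bmatrix} \), and every row of P^n approaches Q:

\[ P^n \to \begin{bmatrix} \frac{3}{5} & \frac{2}{5} \\ \frac{3}{5} & \frac{2}{5} \end{bmatrix} \]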
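The limiting behaviour in parts (v) and (vi) can be observed numerically by raising P to a large power. The sketch below uses plain floating-point lists; the exponent 50 is an arbitrary choice for illustration, and the helper names are not from the notes.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def mat_power(p, n):
    """Compute P^n by repeated multiplication, starting from the identity."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result

P = [[0.5, 0.5], [0.75, 0.25]]   # transition matrix from Problem 1
p0 = [[0.25, 0.75]]              # initial distribution as a 1 x 2 matrix

P2 = mat_power(P, 2)
P50 = mat_power(P, 50)           # 50 is an arbitrary "large" exponent
limit_dist = mat_mul(p0, P50)[0]

print(P2)
print([[round(v, 6) for v in row] for row in P50])
print([round(v, 6) for v in limit_dist])
```

Both rows of the high power agree with the fixed vector Q, and the initial distribution no longer matters.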
Problem 2

The transition probability matrix (t.p.m.) of a Markov chain is given by:

\[ P = \begin{bmatrix} \frac{1}{2} & 0 & \frac{1}{2} \\ 1 & 0 & 0 \\ \frac{1}{4} & \frac{1}{2} & \frac{1}{4} \end{bmatrix} \]

and the initial probability distribution is \( p^{(0)} = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} & 0 \end{bmatrix} \).
Find:
(i) \( p_{13}^{(2)} \)
(ii) \( p_{23}^{(2)} \)
(iii) \( p^{(2)} \)
(iv) \( p_1^{(2)} \)


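Solution:

The transition matrix of the Markov chain is:

\[ P = \begin{bmatrix} \frac{1}{2} & 0 & \frac{1}{2} \\ 1 & 0 & 0 \\ \frac{1}{4} & \frac{1}{2} & \frac{1}{4} \end{bmatrix} \]

The initial probability distribution is:

\[ p^{(0)} = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} & 0 \end{bmatrix} \]

To find the two-step transition matrix P^2, we compute:

\[ P^2 = \begin{bmatrix} \frac{1}{2} & 0 & \frac{1}{2} \\ 1 & 0 & 0 \\ \frac{1}{4} & \frac{1}{2} & \frac{1}{4} \end{bmatrix} \begin{bmatrix} \frac{1}{2} & 0 & \frac{1}{2} \\ 1 & 0 & 0 \\ \frac{1}{4} & \frac{1}{2} & \frac{1}{4} \end{bmatrix} = \begin{bmatrix} \frac{3}{8} & \frac{1}{4} & \frac{3}{8} \\ \frac{1}{2} & 0 & \frac{1}{2} \\ \frac{11}{16} & \frac{1}{8} & \frac{3}{16} \end{bmatrix} = \begin{bmatrix} p_{11}^{(2)} & p_{12}^{(2)} & p_{13}^{(2)} \\ p_{21}^{(2)} & p_{22}^{(2)} & p_{23}^{(2)} \\ p_{31}^{(2)} & p_{32}^{(2)} & p_{33}^{(2)} \end{bmatrix} \]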


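From the two-step transition matrix P^2, we have:

\[ p_{13}^{(2)} = \frac{3}{8}, \qquad p_{23}^{(2)} = \frac{1}{2} \]

Now, we compute the probability distribution after two steps,
\( p^{(2)} = p^{(0)} P^2 \):

\[ p^{(2)} = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} & 0 \end{bmatrix} \begin{bmatrix} \frac{3}{8} & \frac{1}{4} & \frac{3}{8} \\ \frac{1}{2} & 0 & \frac{1}{2} \\ \frac{11}{16} & \frac{1}{8} & \frac{3}{16} \end{bmatrix} = \begin{bmatrix} \frac{7}{16} & \frac{1}{8} & \frac{7}{16} \end{bmatrix} \]

Thus, \( p^{(2)} = \begin{bmatrix} \frac{7}{16} & \frac{1}{8} & \frac{7}{16} \end{bmatrix} \). From the result of \( p^{(2)} \), we have:

\[ p_1^{(2)} = \frac{7}{16} \]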
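Instead of forming P^2 first, the distribution after n steps can be obtained by applying p ← pP repeatedly, which follows directly from p^{(n)} = p^{(n−1)} P. The sketch below does this with exact fractions for Problem 2 and for Problem 1 of this topic; the function names are illustrative.

```python
from fractions import Fraction as F

def step(dist, p):
    """One transition: return the row vector dist * P."""
    n = len(p)
    return [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]

def distribution_after(dist, p, steps):
    """Apply the transition matrix `steps` times to an initial distribution."""
    for _ in range(steps):
        dist = step(dist, p)
    return dist

# Problem 2: three-state chain.
P_three = [
    [F(1, 2), F(0), F(1, 2)],
    [F(1), F(0), F(0)],
    [F(1, 4), F(1, 2), F(1, 4)],
]
p2_three = distribution_after([F(1, 2), F(1, 2), F(0)], P_three, 2)

# Problem 1: two-state chain.
P_two = [[F(1, 2), F(1, 2)], [F(3, 4), F(1, 4)]]
p2_two = distribution_after([F(1, 4), F(3, 4)], P_two, 2)

print(p2_three)
print(p2_two)
```

Stepping the vector costs one vector-matrix product per step, which is cheaper than building the full matrix power when only one initial distribution is of interest.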



Problem 3
A student’s study habits are as follows:
If he studies one night, he is 30% sure to study the next night.
If he does not study one night, he is 40% sure to study the next night.
Find the transition matrix for the chain of his study habits.
Solution: We define two possible states:
a1 = Studying, a2 = Not studying
From the problem, we have the following transition probabilities:
p11 = Probability of studying the next night, given that he studied
the previous night = 30% = 0.3
p12 = Probability of not studying the next night, given that he
studied the previous night = 70% = 0.7
p21 = Probability of studying the next night, given that he did not
study the previous night = 40% = 0.4
p22 = Probability of not studying the next night, given that he did
not study the previous night = 60% = 0.6
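Therefore, the transition matrix of the Markov chain is:

\[ P = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix} = \begin{bmatrix} 0.3 & 0.7 \\ 0.4 & 0.6 \end{bmatrix} \]

Let the unique fixed probability vector be Q = [x  y], with x + y = 1, such that QP = Q:

\[ \implies \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 0.3 & 0.7 \\ 0.4 & 0.6 \end{bmatrix} = \begin{bmatrix} x & y \end{bmatrix} \]

\[ \implies \begin{bmatrix} 0.3x + 0.4y & 0.7x + 0.6y \end{bmatrix} = \begin{bmatrix} x & y \end{bmatrix} \quad \therefore\ 0.3x + 0.4y = x, \quad 0.7x + 0.6y = y \]

The first equation becomes:

\[ 0.3x + 0.4y = x \implies 0.7x = 0.4y \implies \frac{7}{10}x = \frac{4}{10}y \implies 7x = 4y \]

Substitute y = 1 − x into the equation: 7x = 4(1 − x) ⇒ x = 4/11.
Thus, y = 1 − x = 1 − 4/11 = 7/11, and

\[ Q = \begin{bmatrix} \frac{4}{11} & \frac{7}{11} \end{bmatrix} \]

This means the student will study 4/11, or approximately 36.36%, of the time
in the long run.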
Problem 4
A software engineer goes to work every day by either motorbike or car. He
never uses a bike on two consecutive days. If he goes by car on a day, he is
equally likely to go by car or bike the next day. Find the
transition matrix for the chain of the mode of transport. Given that the
engineer uses a car on the first day of the week, find the probability
that: (i) He uses a bike on the fifth day. (ii) He uses a car on the fifth day.
Solution: We define two possible
states: a1 = Using bike, a2 = Using car. From the problem, we have the
following transition probabilities:
p11 = Probability of using bike on a day, given that bike was used on
the previous day = 0
p12 = Probability of using car on a day, given that bike was used on
the previous day = 1
p21 = Probability of using bike on a day, given that car was used on
the previous day = 1/2
p22 = Probability of using car on a day, given that car was used on
the previous day = 1/2
 
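Therefore, the transition matrix P is given by:

\[ P = \begin{bmatrix} 0 & 1 \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix} \]

Since the car is used on the first day, the initial probability distribution is \( p^{(0)} = \begin{bmatrix} 0 & 1 \end{bmatrix} \).
We calculate the square of the transition matrix:

\[ P^2 = P \cdot P = \begin{bmatrix} 0 & 1 \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix} \cdot \begin{bmatrix} 0 & 1 \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{4} & \frac{3}{4} \end{bmatrix} \]

Next, we calculate P^4 by squaring P^2:

\[ P^4 = P^2 \cdot P^2 = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{4} & \frac{3}{4} \end{bmatrix} \cdot \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{4} & \frac{3}{4} \end{bmatrix} = \begin{bmatrix} \frac{3}{8} & \frac{5}{8} \\ \frac{5}{16} & \frac{11}{16} \end{bmatrix} \]

The fifth day is four transitions after the first day, so we calculate the probability distribution after four steps:

\[ p^{(4)} = p^{(0)} \cdot P^4 = \begin{bmatrix} 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} \frac{3}{8} & \frac{5}{8} \\ \frac{5}{16} & \frac{11}{16} \end{bmatrix} = \begin{bmatrix} \frac{5}{16} & \frac{11}{16} \end{bmatrix} \]

Therefore, on the fifth day:
The probability that the engineer uses a bike is \( p_1^{(4)} = \frac{5}{16} \).
The probability that the engineer uses a car is \( p_2^{(4)} = \frac{11}{16} \).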
Problem 5

Every year, a man trades his car for a new car. If he has a Maruti, he
trades it for an Ambassador. If he has an Ambassador, he trades it for a
Santro. However, if he has a Santro, he is just as likely to trade it for a
new Santro as to trade it for a Maruti or an Ambassador. In 2020, he
bought his first car, which was a Santro.
(i) Find the probability that he has:
(a) A Santro in 2022,
(b) A Maruti in 2022,
(c) An Ambassador in 2023,
(d) A Santro in 2023.
(ii) In the long run, how often will he have a Santro?



Solution:
Define the states as:
a1 : The state of having a Maruti,
a2 : The state of having an Ambassador,
a3 : The state of having a Santro.
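The transition matrix $P$ is given by:
\[
P = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \end{pmatrix}
\]
Initially (in 2020), he has a Santro, so the initial state vector is:
\[
p^{(0)} = (0, 0, 1)
\]
To reach 2022 (2 steps later), we compute the 2-step transition matrix $P^2$:
\[
P^2 = P \times P
= \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \end{pmatrix}
\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \end{pmatrix}
= \begin{pmatrix} 0 & 0 & 1 \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{9} & \frac{4}{9} & \frac{4}{9} \end{pmatrix}
\]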



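Now, the probability vector in 2022 is:
\[
p^{(2)} = p^{(0)} \times P^2
= (0, 0, 1) \begin{pmatrix} 0 & 0 & 1 \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{1}{9} & \frac{4}{9} & \frac{4}{9} \end{pmatrix}
= \left( \frac{1}{9}, \frac{4}{9}, \frac{4}{9} \right)
\]
From the result of $p^{(2)}$:
(a) The probability of having a Santro in 2022 is $\frac{4}{9}$,
(b) The probability of having a Maruti in 2022 is $\frac{1}{9}$.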
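To reach 2023 (3 steps later), we compute:
\[
p^{(3)} = p^{(2)} \times P
= \left( \frac{1}{9}, \frac{4}{9}, \frac{4}{9} \right)
\times \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \end{pmatrix}
= \left( \frac{4}{27}, \frac{7}{27}, \frac{16}{27} \right)
\]
From the result of $p^{(3)}$:
(c) The probability of having an Ambassador in 2023 is $\frac{7}{27}$,
(d) The probability of having a Santro in 2023 is $\frac{16}{27}$.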
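Instead of forming powers of $P$, the distribution can also be carried forward one trade at a time. The sketch below assumes the state ordering (Maruti, Ambassador, Santro); the names `step` and `history` are illustrative only. Entry `history[k]` is the distribution $k$ years after the first purchase.

```python
from fractions import Fraction as F

def step(p, P):
    """Advance one year: return the row vector p multiplied by the matrix P."""
    return [sum(p[i] * P[i][j] for i in range(len(p)))
            for j in range(len(P[0]))]

# Assumed state ordering: Maruti, Ambassador, Santro.
P = [[F(0), F(1), F(0)],
     [F(0), F(0), F(1)],
     [F(1, 3), F(1, 3), F(1, 3)]]

p = [F(0), F(0), F(1)]    # the first car is a Santro
history = [p]
for _ in range(3):        # three yearly trades
    p = step(p, P)
    history.append(p)

for k, dist in enumerate(history):
    print(k, [str(v) for v in dist])
```

Stepping the vector costs one vector-matrix product per year, which is cheaper than a full matrix product when only one starting state is of interest.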
To find the long-run (steady-state) probabilities, we solve for the stationary distribution $t = (x, y, z)$ such that $tP = t$:
\[
(x, y, z) \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \end{pmatrix} = (x, y, z)
\]
This gives the system of equations:
\[
\frac{z}{3} = x, \quad x + \frac{z}{3} = y, \quad y + \frac{z}{3} = z
\]
Since $x + y + z = 1$, we substitute $z = 1 - x - y$ and multiply each equation by 3:
\[
1 - x - y = 3x, \quad 3x + (1 - x - y) = 3y, \quad 3y + (1 - x - y) = 3(1 - x - y)
\]
Solving, we get:
\[
x = \frac{1}{6}, \quad y = \frac{1}{3}, \quad z = \frac{1}{2}
\]
Thus, in the long run, the man will own:
A Maruti $\frac{1}{6}$ of the time,
An Ambassador $\frac{1}{3}$ of the time,
A Santro $\frac{1}{2}$ of the time.
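The same linear system can be solved mechanically. The following is a minimal sketch, not a general-purpose solver: the function name `stationary` is hypothetical, the matrix entries are assumed to be `Fraction` objects, and the chain is assumed to have a unique stationary distribution (otherwise no pivot is found and the code fails). One balance equation is redundant, so it is replaced by the condition $x + y + z = 1$.

```python
from fractions import Fraction as F

def stationary(P):
    """Solve t P = t with sum(t) = 1 by exact Gauss-Jordan elimination."""
    n = len(P)
    # Row i encodes the balance equation sum_j t_j * P[j][i] - t_i = 0.
    A = [[P[j][i] - (1 if i == j else 0) for j in range(n)] + [F(0)]
         for i in range(n)]
    # Replace the last (redundant) equation by the normalisation sum(t) = 1.
    A[-1] = [F(1)] * n + [F(1)]
    for col in range(n):
        # Move a row with a nonzero entry in this column into pivot position.
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

# Assumed state ordering: Maruti, Ambassador, Santro.
P = [[F(0), F(1), F(0)],
     [F(0), F(0), F(1)],
     [F(1, 3), F(1, 3), F(1, 3)]]

t = stationary(P)
print([str(v) for v in t])
```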
Problem 6
A salesman’s territory consists of three cities A, B, and C. He never sells
in the same city on successive days. If he sells in city A, then the next day
he sells in city B. If he sells in city B or C, then the next day he is twice as
likely to sell in city A as in the other city. Determine how often, in the
long run, he sells in each of the cities.
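Solution: From the problem, we have the following transition probabilities:
\[
P = \begin{pmatrix} 0 & 1 & 0 \\ \frac{2}{3} & 0 & \frac{1}{3} \\ \frac{2}{3} & \frac{1}{3} & 0 \end{pmatrix}
\]
Let $Q = (x, y, z)$ represent the steady-state probability vector, where $x + y + z = 1$.
In steady state, we have $Q \cdot P = Q$, which gives:
\[
(x, y, z) \begin{pmatrix} 0 & 1 & 0 \\ \frac{2}{3} & 0 & \frac{1}{3} \\ \frac{2}{3} & \frac{1}{3} & 0 \end{pmatrix} = (x, y, z)
\]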
This yields the following system of equations:
\[
\frac{2}{3} y + \frac{2}{3} z = x, \quad x + \frac{1}{3} z = y, \quad \frac{1}{3} y = z.
\]
We simplify the first two equations:
\[
3x - 2y - 2z = 0, \quad 3x - 3y + z = 0.
\]
Substituting $z = 1 - x - y$ into these equations:
\[
3x - 2y - 2(1 - x - y) = 0, \quad 3x - 3y + (1 - x - y) = 0.
\]
Simplifying, we get:
\[
5x = 2, \quad 2x - 4y = -1.
\]
Thus, we find:
\[
x = \frac{2}{5}, \quad y = \frac{9}{20}, \quad z = 1 - x - y = \frac{3}{20}.
\]
These values also satisfy the third equation, $z = \frac{1}{3} y$.
Therefore, the steady-state probability vector is:
\[
Q = \left( \frac{2}{5}, \frac{9}{20}, \frac{3}{20} \right)
\]
Thus, in the long run, the salesman sells: 40% of the time in city A, 45%
of the time in city B, 15% of the time in city C.
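A steady-state vector can be checked in two ways: it must be unchanged by one more step, and the distribution from any starting city should approach it over many days. The sketch below illustrates both, under the assumed city ordering (A, B, C); the names `step`, `is_stationary` and `P_float`, and the choice of 200 iterations, are illustrative assumptions.

```python
from fractions import Fraction as F

def step(p, P):
    """Advance one day: return the row vector p multiplied by the matrix P."""
    return [sum(p[i] * P[i][j] for i in range(len(p)))
            for j in range(len(P[0]))]

# Assumed city ordering: A, B, C.
P = [[F(0), F(1), F(0)],
     [F(2, 3), F(0), F(1, 3)],
     [F(2, 3), F(1, 3), F(0)]]

Q = [F(2, 5), F(9, 20), F(3, 20)]

# Exact check: Q sums to 1 and one more day leaves it unchanged.
is_stationary = sum(Q) == 1 and step(Q, P) == Q

# Numerical check: start in city A and iterate for many days.
P_float = [[float(v) for v in row] for row in P]
p = [1.0, 0.0, 0.0]
for _ in range(200):
    p = step(p, P_float)

print(is_stationary)
print(p)
```

The iteration settles because this chain can return to city A in either 2 or 3 days, so it does not oscillate with a fixed period.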
Problem 7
Three boys A, B, and C are throwing a ball to each other:
A always throws the ball to B.
B always throws the ball to C.
C is equally likely to throw the ball to A or B.
If C was the first person to throw the ball, find the probabilities that
after three throws:
1. A has the ball.
2. B has the ball.
3. C has the ball.
Solution: The transition probability matrix $P$ of the Markov chain is
given by:
\[
P = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{2} & 0 \end{pmatrix}
\]


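We compute $P^2$, the transition matrix after two throws:
\[
P^2 = P \cdot P
= \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{2} & 0 \end{pmatrix}
\cdot \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{2} & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{2} & 0 \\ 0 & \frac{1}{2} & \frac{1}{2} \end{pmatrix}
\]
Next, we compute $P^3$, the transition matrix after three throws:
\[
P^3 = P^2 \cdot P
= \begin{pmatrix} 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{2} & 0 \\ 0 & \frac{1}{2} & \frac{1}{2} \end{pmatrix}
\cdot \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \frac{1}{2} & \frac{1}{2} & 0 \end{pmatrix}
= \begin{pmatrix} \frac{1}{2} & \frac{1}{2} & 0 \\ 0 & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{4} & \frac{1}{2} \end{pmatrix}
\]
The initial probability vector is given by $p^{(0)} = (0, 0, 1)$,
since C initially has the ball. The probability distribution after three
throws is given by:
\[
p^{(3)} = p^{(0)} \cdot P^3
= (0, 0, 1) \cdot \begin{pmatrix} \frac{1}{2} & \frac{1}{2} & 0 \\ 0 & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{4} & \frac{1}{2} \end{pmatrix}
= \left( \frac{1}{4}, \frac{1}{4}, \frac{1}{2} \right)
\]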
Therefore, after three throws, the probabilities are:
\[
p_A^{(3)} = \frac{1}{4}, \quad p_B^{(3)} = \frac{1}{4}, \quad p_C^{(3)} = \frac{1}{2}
\]
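The entries of $P^3$ are sums over all three-throw paths, which can be seen by listing the paths directly. The sketch below does this for the start state C; the dictionary `throws` and the generator `paths` are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction as F

# For each boy: who he may throw to, and with what probability.
throws = {
    "A": {"B": F(1)},
    "B": {"C": F(1)},
    "C": {"A": F(1, 2), "B": F(1, 2)},
}

def paths(holder, n):
    """Yield (path, probability) for every possible sequence of n throws."""
    if n == 0:
        yield holder, F(1)
        return
    for nxt, pr in throws[holder].items():
        for rest, pr_rest in paths(nxt, n - 1):
            yield holder + rest, pr * pr_rest

# Add up the path probabilities by who holds the ball at the end.
final = {"A": F(0), "B": F(0), "C": F(0)}
for path, pr in paths("C", 3):
    print(path, pr)
    final[path[-1]] += pr

print({k: str(v) for k, v in final.items()})
```

Only three paths are possible from C (CABC, CBCA and CBCB), and their probabilities add up to the last row of $P^3$.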
Assignment Problems

(1). A man’s smoking habits are as follows: If he smokes filter cigarettes
one week, he switches to non-filter cigarettes the next week with
probability 0.2. If he smokes non-filter cigarettes one week, there is a
probability of 0.7 that he will smoke non-filter cigarettes the next week as
well. In the long run, how often does he smoke filter cigarettes?

(2). A gambler’s luck follows this pattern: If he wins a game, the
probability of winning the next game is 0.6. If he loses a game, the
probability of losing the next game is 0.7. It is given that there is an even
chance of the gambler winning the first game.
1. What is the probability of winning the second game?
2. What is the probability of winning the third game?
3. In the long run, how often will he win?
