Module 2

BMAT202L: Probability and Statistics

Mohit Kumar

VIT Chennai
Basics of Probability Theory
Probability

What is Probability?
▶ Probability is a branch of mathematics that deals with the study
of random events or experiments.
▶ It is the measure of the chance that a particular event will occur.
▶ The probability of an event can be calculated by dividing the
number of favourable outcomes by the total number of possible
outcomes.

How can we define it?
▶ In order to define probability mathematically, we need to set an
appropriate context.
▶ We will do so by defining the elements of probability.
Elements of Probability

Experiment
An experiment is a systematic and controlled process or activity carried out
to gather data or information about a particular phenomenon or system.

Deterministic Experiment
▶ An experiment or a process in which the outcome can be predicted
with certainty before it is actually observed or performed.
▶ For instance, if we add 5 and 3, we know the outcome will always be 8,
and there is no randomness involved.

Random Experiment
▶ An experiment or a process in which the outcome cannot be predicted
with certainty.
▶ For example, tossing a coin is a random experiment because we cannot
predict the outcome of the coin toss with certainty.
Elements of Probability

Sample Space
▶ The set of all possible outcomes of a random experiment is called the
sample space and is represented by the symbol S.
▶ For example, the sample space of a coin toss is S = {H, T}, where H
represents the outcome of getting heads and T represents the outcome
of getting tails.
▶ Each outcome in a sample space is called an element or a member of
the sample space, or simply a sample point.

Event
▶ An event is a subset of a sample space.
▶ For example, the event of getting heads in a coin toss is {H}.

Equally Likely Events
▶ Equally likely events are events that have the same theoretical
probability (or likelihood) of occurring.
Elements of Probability

Intuitive Definition of Probability
▶ Intuitively, the probability of an event A is the likelihood of the event A in
S, so it can be defined as

P(A) = #A / #S,

where #A denotes the number of elements in A and P(A) denotes the
probability of A.
▶ For example, the probability of getting heads in a fair coin toss is 1/2.

Is this definition okay?
▶ The elements mentioned above form the basis of probability, but in
order to define it rigorously we need to develop a strong foundation
using set theory and counting principles.
Set Theory

Intersection of Two Events
▶ The intersection of two events A and B, denoted by the symbol
A ∩ B, is the event containing all elements that are common to A
and B.
▶ For example, if A = {1, 2, 3} and B = {3, 4, 5}, then
A ∩ B = {3}.

Union of Two Events
▶ The union of the two events A and B, denoted by the symbol
A ∪ B, is the event containing all the elements that belong to A or
B or both.
▶ For example, if A = {1, 2, 3} and B = {3, 4, 5}, then
A ∪ B = {1, 2, 3, 4, 5}.
Set Theory

Complement of an Event
▶ The complement of an event A with respect to S is the subset of
all elements of S that are not in A.
▶ We denote the complement of A by the symbol Aᶜ.
▶ For example, if A = {1, 2, 3} and S = {1, 2, 3, 4, 5, 6}, then
Aᶜ = {4, 5, 6}.

Mutually Exclusive or Disjoint Events
Two events A and B are mutually exclusive, or disjoint, if

A ∩ B = ∅,

that is, if A and B have no elements in common.
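These set operations can be checked directly with Python's built-in set type; a minimal sketch using the example sets from the slides above:

```python
# Sketch illustrating the set operations above with Python's built-in sets.
A = {1, 2, 3}
B = {3, 4, 5}
S = {1, 2, 3, 4, 5, 6}

print(A & B)            # intersection A ∩ B -> {3}
print(A | B)            # union A ∪ B -> {1, 2, 3, 4, 5}
print(S - A)            # complement of A with respect to S -> {4, 5, 6}
print(A.isdisjoint(B))  # mutually exclusive? -> False (they share the element 3)
```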


Counting Principles

Rule 1: Multiplication Rule
▶ If an operation can be performed in n1 ways, and if for each of
these ways a second operation can be performed in n2 ways, then
the two operations can be performed together in n1 × n2 ways.
▶ Example: Suppose you want to choose a fruit and a drink for
breakfast, and you have 2 fruit options (apple and orange) and 3
drink options (coffee, milk, and tea). How many different
breakfast combinations can you create?
▶ The total number of breakfast combinations is 2 × 3 = 6.
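As a quick illustration of the multiplication rule, the breakfast combinations can be enumerated with itertools.product; a minimal sketch:

```python
from itertools import product

fruits = ["apple", "orange"]
drinks = ["coffee", "milk", "tea"]

# Every (fruit, drink) pair corresponds to one breakfast combination.
combos = list(product(fruits, drinks))
print(len(combos))  # 2 x 3 = 6
print(combos)
```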
Counting Principles

Rule 2: Generalized Multiplication Rule
▶ If an operation can be performed in n1 ways, and if for each of
these a second operation can be performed in n2 ways, and for
each of the first two a third operation can be performed in n3
ways, and so forth, then the sequence of k operations can be
performed in n1 × n2 × . . . × nk ways.
▶ Example: A lock has 4 digits, each of which can be any number
from 0 to 9 (inclusive). If you are trying to guess the
combination, how many possible combinations are there?
▶ The total number of possible combinations is
10 × 10 × 10 × 10 = 10,000.
Counting Principles

Permutation
▶ A permutation is an arrangement of a set of objects in a specific
order.
▶ In other words, it is a way of selecting and arranging a subset of
elements from a larger set, where the order of selection matters.
▶ The total number of permutations of a set of n distinct objects
can be calculated using the formula n! (pronounced as “n
factorial”), where n! = n × (n − 1) × (n − 2) × · · · × 2 × 1, with the
special case 0! = 1.
▶ The number of permutations of n objects arranged in a circle is
(n − 1)!.
Counting Principles
▶ The number of permutations of n distinct objects taken r at a
time is

nPr = n! / (n − r)!.

▶ For example, the number of ways to fill 3 distinct positions (say,
president, secretary, and treasurer) from a group of 6 people is

6P3 = 6! / (6 − 3)! = 6 × 5 × 4 = 120.

▶ The number of distinct permutations of n things of which n1 are
of one kind, n2 of a second kind, . . . , nk of a kth kind is

n! / (n1! n2! · · · nk!).
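These counts can be evaluated directly with Python's math module (math.perm requires Python 3.8+); a minimal sketch:

```python
from math import factorial, perm

# Permutations of n distinct objects taken r at a time: nPr = n!/(n - r)!
print(perm(6, 3))                    # 6P3 = 120
print(factorial(6) // factorial(3))  # same value, computed from the formula

# Circular permutations of n objects: (n - 1)!
print(factorial(5 - 1))              # 5 objects in a circle -> 24

# Distinct permutations with repeated objects: n!/(n1! n2! ... nk!)
# e.g. arrangements of the letters of "BALL" (n = 4, with 'L' repeated twice)
print(factorial(4) // (factorial(1) * factorial(1) * factorial(2)))  # 12
```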
Counting Principles

Combination
▶ A combination is a way of selecting a subset of objects from a
larger set, where the order of the selection does not matter.
▶ The number of combinations of k objects chosen from a set of n
distinct objects is

nCk = (n choose k) = n! / (k! (n − k)!).

▶ For example, if we have a set of 5 objects, the number of ways
to choose 3 objects without regard to order is

5C3 = 5! / (3! (5 − 3)!) = (5 × 4) / (2 × 1) = 10.
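math.comb evaluates the binomial coefficient directly; a small check of the example above:

```python
from math import comb, factorial

print(comb(5, 3))                                     # 5C3 = 10
print(factorial(5) // (factorial(3) * factorial(2)))  # same value from n!/(k!(n-k)!)

# Order does not matter: picking {a, b, c} counts as one combination,
# no matter in which order the three objects are selected.
```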
Counting Principles

Partitioning
▶ Partitioning refers to the process of dividing a larger set into
smaller subsets or partitions.
▶ Each partition is a subset of the original set, and the union of all
partitions is equal to the original set.
▶ The number of ways of partitioning a set of n objects into k cells
with n1 elements in the first cell, n2 elements in the second, and
so forth is

(n choose n1, n2, . . . , nk) = n! / (n1! n2! · · · nk!),

where n1 + n2 + . . . + nk = n.
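The partition count is the multinomial coefficient; a small sketch, assuming for illustration that 7 objects are split into cells of sizes 3, 2, and 2 (the numbers are made up, not from the slides):

```python
from math import factorial

def multinomial(*cells):
    """Number of ways to partition sum(cells) objects into cells of the given sizes."""
    n = sum(cells)
    result = factorial(n)
    for size in cells:
        result //= factorial(size)
    return result

print(multinomial(3, 2, 2))  # 7!/(3! 2! 2!) = 210
```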
Probability of an Event

Classical Definition
▶ If a random experiment can result in any one of N different
equally likely outcomes, and if exactly n of these outcomes
correspond to event A, then the probability of event A is

P(A) = n / N.

Relative Frequency Definition
▶ If there is no basis to assume the outcomes are equally likely,
then we can repeat the experiment n times, record the number of
outcomes favourable to event A, say na, and take the likelihood of
the event A in the sample space S to be

P(A) = na / n.
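The two definitions can be compared numerically: for the event "at least one head in two tosses of a fair coin", the classical probability is 3/4, and a simulated relative frequency should come out close to it. A minimal sketch (the number of trials is an arbitrary choice):

```python
import random

random.seed(0)
n_trials = 100_000

# Event A: at least one head in two tosses of a fair coin.
favourable = sum(
    1 for _ in range(n_trials)
    if "H" in (random.choice("HT"), random.choice("HT"))
)

print(favourable / n_trials)  # relative frequency, close to the classical value
print(3 / 4)                  # classical probability: 3 favourable outcomes out of 4
```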
Probability of an Event

Axiomatic Definition
The axiomatic definition of probability includes three axioms:
1. For any event A,

P(A) ≥ 0.

2. The probability of the entire sample space is

P(S) = 1.

3. For any collection of mutually exclusive events A1, A2, A3, . . .,

P(A1 ∪ A2 ∪ . . .) = P(A1) + P(A2) + . . .


Properties of Probability
▶ If A and B are two events, then

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

▶ Example: Let the sample space be S = A ∪ B, where P(A) = 0.8
and P(B) = 0.5. Find P(A ∩ B).
▶ Ans: Since P(A ∪ B) = P(S) = 1, we get
P(A ∩ B) = 0.8 + 0.5 − 1 = 0.3.
▶ For any three events A, B, C,

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C).

▶ If A1, A2, . . . , An are mutually exclusive, then

P(A1 ∪ A2 ∪ · · · ∪ An) = P(A1) + P(A2) + · · · + P(An).


Properties of Probability

▶ A collection of events {A1, A2, . . . , An} of a sample space S is
called a partition of S if A1, A2, . . . , An are mutually exclusive
and A1 ∪ A2 ∪ · · · ∪ An = S.
▶ If A1, A2, . . . , An partition the sample space S, then

P(A1 ∪ A2 ∪ · · · ∪ An) = P(A1) + P(A2) + · · · + P(An) = P(S) = 1.

▶ For any event A, P(Aᶜ) = 1 − P(A).
▶ In particular, the probability of the null set is P(∅) = 1 − P(S) = 0.
▶ For any events A and B, if A ⊆ B, then P(A) ≤ P(B).
▶ For any event A ⊆ S, 0 ≤ P(A) ≤ 1.
Conditional Probability and Product Rule

Conditional Probability
▶ The conditional probability of event B given that A has occurred,
denoted by P(B|A), is defined as

P(B|A) = P(B ∩ A) / P(A),

provided that P(A) > 0.

Product Rule
▶ For any two events A and B,

P(A ∩ B) = P(A) P(B|A),

provided P(A) > 0.


Independence of Events

▶ Two events A and B are independent if and only if

P(B|A) = P(B) or P(A|B) = P(A),

assuming the existence of the conditional probabilities.
▶ Equivalently, two events A and B are independent if and only if

P(A ∩ B) = P(A)P(B).

▶ Example: The probability that A hits a target is 1/4 and the
probability that B hits it is 2/5. What is the probability that the
target will be hit if A and B each shoot at the target?
▶ Ans: Assuming the two shots are independent,
P(hit) = 1 − P(both miss) = 1 − (3/4)(3/5) = 11/20.
Generalized Product Rule

▶ If, in an experiment, the events A1, A2, . . . , Ak can occur, then

P(A1 ∩ A2 ∩ . . . ∩ Ak) = P(A1) P(A2|A1) P(A3|A1 ∩ A2) · · · P(Ak|A1 ∩ A2 ∩ · · · ∩ Ak−1).

▶ Example: A box contains 12 items, of which 4 are defective.
Three items are drawn at random from the box one after the other
without replacement. Find the probability that all three are
non-defective.
▶ Ans: 14/55.
▶ Example: A box contains 20 balls, of which 5 are red and 15 are
white. Three balls are selected at random and drawn in succession
without replacement. Find the probability that all three balls
selected are red.
▶ Ans: 1/114.
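Both answers follow from multiplying the conditional probabilities of the successive draws; a small check with exact fractions:

```python
from fractions import Fraction as F

# 12 items, 4 defective: P(all three drawn are non-defective)
p_non_defective = F(8, 12) * F(7, 11) * F(6, 10)
print(p_non_defective)  # 14/55

# 20 balls, 5 red: P(all three drawn are red)
p_all_red = F(5, 20) * F(4, 19) * F(3, 18)
print(p_all_red)        # 1/114
```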
Independence of More Than Two Events

▶ A collection of events A = {A1, . . . , An} is (mutually)
independent if, for every sub-collection {Ai1, . . . , Aik} with k ≤ n,

P(Ai1 ∩ · · · ∩ Aik) = P(Ai1) · · · P(Aik).

▶ If A1, A2, . . . , Ak are independent, then

P(A1 ∩ A2 ∩ . . . ∩ Ak) = P(A1) P(A2) · · · P(Ak).

Law of Total Probability

▶ If the events B1, B2, . . . , Bk constitute a partition of the sample
space S such that P(Bi) ≠ 0 for i = 1, 2, . . . , k, then for any
event A of S,

P(A) = Σ_{i=1}^{k} P(A ∩ Bi) = Σ_{i=1}^{k} P(Bi) P(A|Bi).
Bayes’ Rule
▶ If the events B1, B2, . . . , Bk constitute a partition of the sample
space S such that P(Bi) ≠ 0 for i = 1, 2, . . . , k, then for any
event A in S such that P(A) ≠ 0,

P(Br|A) = P(Br ∩ A) / Σ_{i=1}^{k} P(Bi ∩ A)
        = P(Br) P(A|Br) / Σ_{i=1}^{k} P(Bi) P(A|Bi),

for r = 1, 2, . . . , k.
▶ Example: A box contains 3 blue and 2 red marbles, while another
box contains 2 blue and 5 red. A marble drawn at random from one
of the boxes turns out to be blue. What is the probability that it
came from the first box?
▶ Ans: 21/31.
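The marble example can be checked numerically with Bayes' rule, taking each box as equally likely to be chosen; a minimal sketch with exact fractions:

```python
from fractions import Fraction as F

p_box = [F(1, 2), F(1, 2)]             # each box is chosen with probability 1/2
p_blue_given_box = [F(3, 5), F(2, 7)]  # box 1: 3 blue of 5, box 2: 2 blue of 7

# Law of total probability for the evidence P(blue)
p_blue = sum(p * q for p, q in zip(p_box, p_blue_given_box))

# Bayes' rule: P(box 1 | blue)
print(p_box[0] * p_blue_given_box[0] / p_blue)  # 21/31
```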
Random Variables
Random Variable

What is a Random Variable?
▶ A random variable is a function that associates a real number
with each element in the sample space.

Example:
▶ Suppose a coin is tossed twice, so the sample space is

S = {HH, HT, TH, TT}.

▶ Let X represent the number of heads obtained in two tosses of
the coin. Then

X(HH) = 2, X(HT) = 1, X(TH) = 1, X(TT) = 0,

so X takes the values 0, 1, and 2.
Types of Random Variables

Discrete Random Variable
▶ A discrete random variable is a random variable that takes on a
countable number of possible values.
▶ The values that a discrete random variable can take on are
typically integers or a finite set of values.
▶ Example: The number of defective items in a production run, or
the number of heads in a series of coin flips, is a discrete
random variable.

Continuous Random Variable
▶ A random variable is called a continuous random variable if it
takes values on a continuous scale.
▶ Example: The height or weight of a person, or the amount of
rainfall on a given day, is a continuous random variable.
Discrete Probability Distributions

Probability Mass Function (PMF)
▶ The set of ordered pairs (x, f(x)) is a probability function,
probability mass function or probability distribution of a discrete
random variable X if, for every x,
▶ f(x) ≥ 0,
▶ Σ_x f(x) = 1,
▶ P_X(x) = P(X = x) = f(x).

Cumulative Distribution Function (CDF)
▶ Let X be a discrete random variable with PMF f(x). The
cumulative distribution function F(x) of X is defined as

F(x) = P(X ≤ x) = Σ_{t ≤ x} f(t), for −∞ < x < ∞.
Continuous Probability Distributions

Probability Density Function (PDF)
▶ The function f(x) is a probability density function of the continuous
random variable X defined over the set of real numbers if
▶ f(x) ≥ 0, for x ∈ R,
▶ ∫_{−∞}^{∞} f(x) dx = 1,
▶ P(a < X < b) = ∫_a^b f(x) dx.

Cumulative Distribution Function (CDF)
▶ The cumulative distribution function F(x) of a continuous random
variable X with density function f(x) is defined as

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt, for −∞ < x < ∞.

▶ The PDF is the derivative of the CDF, i.e., f(x) = d/dx F(x) = F′(x).
Properties of CDF

▶ The CDF is a non-decreasing function, i.e.,
if x1 < x2, then F(x1) ≤ F(x2).
▶ The CDF is a right-continuous function.
▶ lim_{x→−∞} F(x) = 0.
▶ lim_{x→∞} F(x) = 1.
▶ The range of the CDF is [0, 1], i.e., the CDF takes on values
between 0 and 1, inclusive.
▶ P(a < X ≤ b) = F(b) − F(a); for a continuous random variable
this also equals P(a ≤ X ≤ b).
▶ P(X > a) = 1 − P(X ≤ a) = 1 − F(a).
Examples
▶ Example 1: Let X be a discrete random variable with PMF

f(x) = 1/2,  x = 1
       1/3,  x = 2
       1/6,  x = 3.

Find the CDF of X.
▶ Ans: The CDF of X is

F(x) = 0,    x < 1
       1/2,  1 ≤ x < 2
       5/6,  2 ≤ x < 3
       1,    x ≥ 3.
Examples
▶ Example 2: A shipment of 8 similar computers to a retail outlet
contains 3 that are defective. A school makes a random purchase
of 2 computers. Find the probability distribution of the number of
defectives.
▶ Ans: Let X be the number of defective computers purchased by
the school. Then X can take values in the set {0, 1, 2}, and the
PMF of X is

f(x) = 5/14,   x = 0
       15/28,  x = 1
       3/28,   x = 2.
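The three probabilities come from counting pairs: each is a ratio of combinations (a hypergeometric count); a small check with math.comb:

```python
from fractions import Fraction as F
from math import comb

# 8 computers, 3 defective; 2 purchased at random.
total = comb(8, 2)  # 28 equally likely pairs

for x in range(3):  # x = number of defective computers purchased
    favourable = comb(3, x) * comb(5, 2 - x)
    print(x, F(favourable, total))  # 0: 5/14, 1: 15/28, 2: 3/28
```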

Examples
▶ Example 3: A continuous random variable X has density
function

f(x) = C e^(−3x),  x > 0
       0,          x ≤ 0.

(i) Find the value of C.
(ii) Find P(1 < X < 2).
(iii) Find P(X ≥ 3).
(iv) Find P(X ≤ 1).
▶ Ans:
(i) C = 3.
(ii) P(1 < X < 2) = e^(−3) − e^(−6).
(iii) P(X ≥ 3) = e^(−9).
(iv) P(X ≤ 1) = 1 − e^(−3).
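The four answers can be verified symbolically; a minimal sketch using sympy (an assumed third-party dependency, not part of the course material):

```python
import sympy as sp

x, C = sp.symbols("x C", positive=True)
f = C * sp.exp(-3 * x)

# (i) C from the normalization condition: the integral over (0, oo) must equal 1
C_val = sp.solve(sp.integrate(f, (x, 0, sp.oo)) - 1, C)[0]
print(C_val)  # 3

f = f.subs(C, C_val)
print(sp.integrate(f, (x, 1, 2)))      # (ii) exp(-3) - exp(-6)
print(sp.integrate(f, (x, 3, sp.oo)))  # (iii) exp(-9)
print(sp.integrate(f, (x, 0, 1)))      # (iv) 1 - exp(-3)
```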
Examples
▶ Example 4: Let X be a discrete random variable and

f(x) = (4! / (x!(4 − x)!)) (1/2)^4,  x = 0, 1, 2, 3, 4.

Is f a PMF? If so, find P({0, 1}).
▶ Ans: Since f(x) ≥ 0 for each x = 0, 1, 2, 3, 4 and

f(0) + f(1) + f(2) + f(3) + f(4) = 1,

f is a PMF. Also, P({0, 1}) = f(0) + f(1) = 5/16.
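Since f here is the binomial PMF with n = 4 and p = 1/2, the check can be done with math.comb and exact fractions; a small sketch:

```python
from fractions import Fraction as F
from math import comb

# f(x) = C(4, x) * (1/2)**4 for x = 0, 1, 2, 3, 4
f = {x: comb(4, x) * F(1, 2) ** 4 for x in range(5)}

print(sum(f.values()))  # 1, so f is a valid PMF
print(f[0] + f[1])      # P({0, 1}) = 5/16
```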


Examples
▶ Example 5: Let X be a continuous random variable with PDF

f(x) = e^(−x),  0 < x < ∞
       0,       otherwise.

Find P(0 < X < 1).
▶ Ans: P(0 < X < 1) = 1 − 1/e.
Examples
▶ Example 6: Determine the value of k and the CDF of the continuous
random variable X, whose PDF is

f(x) = 0,         x < 0
       kx,        0 ≤ x ≤ 1
       k,         1 ≤ x ≤ 2
       3k − kx,   2 ≤ x ≤ 3
       0,         x > 3.

▶ Ans: k = 1/2 and the CDF is

F(x) = 0,                    x < 0
       x²/4,                 0 ≤ x ≤ 1
       x/2 − 1/4,            1 ≤ x ≤ 2
       −x²/4 + 3x/2 − 5/4,   2 ≤ x ≤ 3
       1,                    x > 3.

Joint Distributions
Joint Probability Distributions

Joint Probability Mass Function
▶ The function f(x, y) is a joint probability distribution or probability mass
function of the discrete random variables X and Y if
▶ f(x, y) ≥ 0 for all (x, y),
▶ Σ_x Σ_y f(x, y) = 1,
▶ P(X = x, Y = y) = f(x, y).
▶ For any region A in the xy plane, P[(X, Y) ∈ A] = Σ Σ_{(x,y) ∈ A} f(x, y).

Joint Density Function
▶ The function f(x, y) is a joint density function of the continuous random
variables X and Y if
▶ f(x, y) ≥ 0, for all (x, y),
▶ ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1,
▶ P[(X, Y) ∈ A] = ∫∫_A f(x, y) dx dy, for any region A in the xy plane.
Marginal Distributions

Discrete Case
▶ For discrete random variables X and Y,
▶ the marginal distribution of X is g(x) = f_X(x) = Σ_y f(x, y),
▶ the marginal distribution of Y is h(y) = f_Y(y) = Σ_x f(x, y).

Continuous Case
▶ For continuous random variables X and Y,
▶ the marginal distribution of X is g(x) = f_X(x) = ∫_{−∞}^{∞} f(x, y) dy,
▶ the marginal distribution of Y is h(y) = f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx.
Conditional Distributions

▶ Let X and Y be two random variables, discrete or continuous.
The conditional distribution of the random variable Y given that
X = x is

f(y|x) = f_{Y|X}(y|x) = f(x, y) / g(x),

provided g(x) > 0.
▶ Similarly, the conditional distribution of X given that Y = y is

f(x|y) = f_{X|Y}(x|y) = f(x, y) / h(y),

provided h(y) > 0.


Independence or Statistical Independence

▶ Let X and Y be two random variables, discrete or continuous,
with joint probability distribution f(x, y) and marginal
distributions g(x) and h(y), respectively.
▶ The random variables X and Y are said to be statistically
independent if and only if

f(x, y) = g(x) h(y) = f_X(x) f_Y(y)

for all (x, y) within their range.


Example 1:

Two ballpoint pens are selected at random from a box that contains 3
blue pens, 2 red pens, and 3 green pens. Let X be the number of blue
pens selected and Y the number of red pens selected.
(a) Find the joint probability function f (x, y).
(b) Find P[(X, Y) ∈ A], where A is the region {(x, y)|x + y ≤ 1}.
(c) Find the marginal distributions of X and Y.
(d) Find the conditional distribution of X, given that Y = 1, and use
it to determine P(X = 0|Y = 1).
(e) Show that the random variables X and Y are not statistically
independent.
Example 2:

Let the joint density function of the continuous random variables X
and Y be

f(x, y) = (2/5)(2x + 3y),  0 ≤ x ≤ 1, 0 ≤ y ≤ 1,
          0,               otherwise.

(a) Show that ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1.
(b) Find P[(X, Y) ∈ A], where A = {(x, y) | 0 < x < 1/2, 1/4 < y < 1/2}.
(c) Find the marginal distributions of X and Y.
(d) Find the conditional densities f(y|x), f(x|y), and then evaluate
P(1/4 < X < 1/2 | Y = 1/3).
(e) Show that X and Y are not statistically independent.
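Parts (a) to (c) can be verified symbolically; a minimal sketch with sympy (an assumed third-party dependency) covering the normalization, the rectangle probability in (b), and the marginals in (c):

```python
import sympy as sp

x, y = sp.symbols("x y", nonnegative=True)
f = sp.Rational(2, 5) * (2 * x + 3 * y)

# (a) total probability over the unit square
print(sp.integrate(f, (x, 0, 1), (y, 0, 1)))  # 1

# (b) P(0 < X < 1/2, 1/4 < Y < 1/2)
half, quarter = sp.Rational(1, 2), sp.Rational(1, 4)
print(sp.integrate(f, (x, 0, half), (y, quarter, half)))  # 13/160

# (c) marginal densities g(x) and h(y) on the unit square
print(sp.integrate(f, (y, 0, 1)))  # g(x) = (4x + 3)/5
print(sp.integrate(f, (x, 0, 1)))  # h(y) = (2 + 6y)/5
```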
Example 3:

The joint density of the random variables X and Y is given as

f(x, y) = 4xy e^(−(x² + y²)),  x, y ≥ 0
          0,                   elsewhere.

Test whether X and Y are statistically independent.


Mathematical Expectation
Mean of a Random Variable

▶ For a discrete random variable X with probability mass function
f(x), the expected value or mean is

μ = E(X) = Σ_x x f(x).

▶ For a continuous random variable X with probability density
function f(x), the expected value or mean is

μ = E(X) = ∫_{−∞}^{∞} x f(x) dx.
Mean of a Function of a Random Variable

Law of the Unconscious Statistician (LOTUS)
Let X be a random variable with probability distribution f(x).
▶ If X is discrete, the expected value of the random variable g(X) is

μ_{g(X)} = E[g(X)] = Σ_x g(x) f(x).

▶ If g(X) = X², then μ_{X²} = E(X²) = Σ_x x² f(x).
▶ If X is continuous, the expected value of the random variable
g(X) is

μ_{g(X)} = E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx.

▶ If g(X) = X², then μ_{X²} = E(X²) = ∫_{−∞}^{∞} x² f(x) dx.
Mean of a Function of Two Random Variables

Let X and Y be random variables with joint probability
distribution f(x, y).
▶ If X and Y are discrete, the mean, or expected value, of the
random variable g(X, Y) is

μ_{g(X,Y)} = E[g(X, Y)] = Σ_x Σ_y g(x, y) f(x, y).

▶ If X and Y are continuous, the mean, or expected value, of the
random variable g(X, Y) is

μ_{g(X,Y)} = E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy.
Variance and Standard Deviation of a Random Variable
▶ Let X be a random variable with probability distribution f(x) and
mean μ.
▶ If X is discrete, the variance of X is

Var(X) = σ² = E[(X − μ)²] = Σ_x (x − μ)² f(x).

▶ If X is continuous, the variance of X is

Var(X) = σ² = E[(X − μ)²] = ∫_{−∞}^{∞} (x − μ)² f(x) dx.

▶ The variance of the random variable X is also given by

σ² = E(X²) − μ² = E(X²) − [E(X)]².

▶ The positive square root of the variance, σ, is called the standard
deviation of X.
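As a concrete check, the PMF from Example 1 above (f(1) = 1/2, f(2) = 1/3, f(3) = 1/6) gives μ = 5/3 and σ² = 5/9; a minimal sketch with exact fractions:

```python
from fractions import Fraction as F

pmf = {1: F(1, 2), 2: F(1, 3), 3: F(1, 6)}

mu = sum(x * p for x, p in pmf.items())      # E(X)
ex2 = sum(x**2 * p for x, p in pmf.items())  # E(X^2)
var = ex2 - mu**2                            # Var(X) = E(X^2) - [E(X)]^2

print(mu)   # 5/3
print(var)  # 5/9
```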
Variance of a Function of a Random Variable

Let X be a random variable with probability distribution f(x).
▶ If X is discrete, the variance of the random variable g(X) is

Var[g(X)] = σ_{g(X)}² = E[(g(X) − μ_{g(X)})²] = Σ_x (g(x) − μ_{g(X)})² f(x).

▶ If X is continuous, the variance of the random variable g(X) is

Var[g(X)] = σ_{g(X)}² = E[(g(X) − μ_{g(X)})²] = ∫_{−∞}^{∞} (g(x) − μ_{g(X)})² f(x) dx.
Covariance of Random Variables

▶ Let X and Y be random variables with joint probability
distribution f(x, y), and with means μ_X and μ_Y, respectively.
▶ If X and Y are discrete, the covariance of X and Y is

Cov(X, Y) = σ_XY = E[(X − μ_X)(Y − μ_Y)] = Σ_x Σ_y (x − μ_X)(y − μ_Y) f(x, y).

▶ If X and Y are continuous, the covariance of X and Y is

Cov(X, Y) = σ_XY = E[(X − μ_X)(Y − μ_Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − μ_X)(y − μ_Y) f(x, y) dx dy.
Covariance of Random Variables

▶ The covariance of two random variables X and Y with means µX


and µY , respectively, is given by

σXY = E(XY) − µX µY .
Covariance of Random Variables

▶ The covariance of two random variables X and Y with means µX


and µY , respectively, is given by

σXY = E(XY) − µX µY .

▶ We can also rewrite it as

Cov(X, Y) = E(XY) − E(X)E(Y).


Covariance of Random Variables

▶ The covariance of two random variables X and Y with means µX
  and µY , respectively, is given by

  σXY = E(XY) − µX µY .

▶ We can also rewrite it as

  Cov(X, Y) = E(XY) − E(X)E(Y).

▶ Moreover, if a, b, c, d are constants, then

  Cov(aX + b, cY + d) = ac Cov(X, Y).
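The sketch below (an illustration, not from the original slides) reuses the assumed joint pmf to confirm the shortcut σXY = E(XY) − µX µY and the scaling rule Cov(aX + b, cY + d) = ac Cov(X, Y) for the arbitrary choice a = 2, b = 1, c = −3, d = 5.

  import numpy as np

  # Assumed joint pmf from the previous sketch
  x_vals = np.array([0, 1])
  y_vals = np.array([0, 1, 2])
  f = np.array([[0.10, 0.20, 0.10],
                [0.20, 0.25, 0.15]])

  mu_x = np.sum(x_vals[:, None] * f)
  mu_y = np.sum(y_vals[None, :] * f)
  e_xy = np.sum(x_vals[:, None] * y_vals[None, :] * f)
  cov = e_xy - mu_x * mu_y                          # shortcut formula

  # Transformed variables U = 2X + 1, V = -3Y + 5 (constants chosen arbitrarily)
  a, c = 2, -3
  u, v = a * x_vals + 1, c * y_vals + 5
  mu_u = np.sum(u[:, None] * f)
  mu_v = np.sum(v[None, :] * f)
  cov_uv = np.sum(u[:, None] * v[None, :] * f) - mu_u * mu_v

  print(cov, cov_uv, a * c * cov)                   # cov_uv equals a*c*cov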
The Correlation Coefficient

▶ Let X and Y be random variables with covariance σXY and
  standard deviations σX and σY , respectively. The correlation
  coefficient of X and Y is

  ρXY = σXY / (σX σY).

▶ If X and Y are independent, then σXY = 0 and hence ρXY = 0.
  (The converse does not hold in general: ρXY = 0 does not imply
  independence.)
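As a worked illustration (not from the original slides), the sketch below computes ρXY for the assumed joint pmf used above, dividing the covariance by the product of the marginal standard deviations.

  import numpy as np

  # Assumed joint pmf as before
  x_vals = np.array([0, 1])
  y_vals = np.array([0, 1, 2])
  f = np.array([[0.10, 0.20, 0.10],
                [0.20, 0.25, 0.15]])

  fx = f.sum(axis=1)                                 # marginal distribution of X
  fy = f.sum(axis=0)                                 # marginal distribution of Y
  mu_x, mu_y = np.sum(x_vals * fx), np.sum(y_vals * fy)
  sigma_x = np.sqrt(np.sum((x_vals - mu_x)**2 * fx))
  sigma_y = np.sqrt(np.sum((y_vals - mu_y)**2 * fy))
  cov = np.sum((x_vals[:, None] - mu_x) * (y_vals[None, :] - mu_y) * f)

  rho = cov / (sigma_x * sigma_y)                    # always lies in [-1, 1]
  print(rho)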
Properties of Mean and Variance

▶ For any constants a and b, E(aX + b) = aE(X) + b.
  ▶ If a = 0, then E(b) = b.
  ▶ If b = 0, then E(aX) = aE(X).
  ▶ If a = 1, then E(X + b) = E(X) + b.
▶ E[g(X) ± h(X)] = E[g(X)] ± E[h(X)].
▶ E[g(X, Y) ± h(X, Y)] = E[g(X, Y)] ± E[h(X, Y)].
  ▶ E[X ± Y] = E[X] ± E[Y].
▶ If X and Y are independent random variables, then
  E(XY) = E(X)E(Y). Moreover, σXY = Cov(X, Y) = 0.
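The simulation below (an illustrative sanity check, not from the original slides) approximates these expectation identities with a large random sample; the distributions chosen for X and Y are arbitrary assumptions, and the two sides of each identity agree up to sampling error.

  import numpy as np

  rng = np.random.default_rng(0)
  n = 1_000_000

  # Assumed independent random variables, chosen only for illustration
  X = rng.exponential(scale=2.0, size=n)     # E(X) = 2
  Y = rng.uniform(0.0, 1.0, size=n)          # E(Y) = 0.5

  a, b = 3.0, 1.0
  print(np.mean(a * X + b), a * np.mean(X) + b)    # E(aX + b) vs aE(X) + b
  print(np.mean(X + Y), np.mean(X) + np.mean(Y))   # E(X + Y) vs E(X) + E(Y)
  print(np.mean(X * Y), np.mean(X) * np.mean(Y))   # E(XY) vs E(X)E(Y) (independence)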
Properties of Mean and Variance

▶ If X and Y are random variables with joint probability
  distribution f (x, y) and a, b, and c are constants, then

  σ²_{aX+bY+c} = a²σX² + b²σY² + 2abσXY .

  ▶ If b = 0, then σ²_{aX+c} = a²σX².
  ▶ If a = 1 and b = 0, then σ²_{X+c} = σX².
  ▶ If b = c = 0, then σ²_{aX} = a²σX².
  ▶ If a = b = 0, then σ²_c = 0.
▶ For independent random variables X and Y,
  ▶ σ²_{aX+bY} = a²σX² + b²σY².
  ▶ σ²_{aX−bY} = a²σX² + b²σY².
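The check below (illustrative, not from the original slides) verifies σ²_{aX+bY+c} = a²σX² + b²σY² + 2abσXY for the assumed joint pmf used earlier, computing the left-hand side directly from the distribution of Z = aX + bY + c.

  import numpy as np

  # Assumed joint pmf as in the covariance sketches
  x_vals = np.array([0, 1])
  y_vals = np.array([0, 1, 2])
  f = np.array([[0.10, 0.20, 0.10],
                [0.20, 0.25, 0.15]])

  fx, fy = f.sum(axis=1), f.sum(axis=0)
  mu_x, mu_y = np.sum(x_vals * fx), np.sum(y_vals * fy)
  var_x = np.sum(x_vals**2 * fx) - mu_x**2
  var_y = np.sum(y_vals**2 * fy) - mu_y**2
  cov = np.sum(x_vals[:, None] * y_vals[None, :] * f) - mu_x * mu_y

  a, b, c = 2.0, -1.0, 4.0
  z = a * x_vals[:, None] + b * y_vals[None, :] + c     # values of Z = aX + bY + c
  var_z = np.sum(z**2 * f) - np.sum(z * f)**2           # Var(Z) from the joint pmf

  print(var_z, a**2 * var_x + b**2 * var_y + 2 * a * b * cov)   # the two sides agree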
Moment Generating Function
Moments

▶ The rth moment about the origin of a random variable X is
  denoted by µ′r and defined by

  µ′r = E(X^r) = Σ_x x^r f(x),             if X is discrete,
              = ∫_{−∞}^{∞} x^r f(x) dx,    if X is continuous.

▶ If r = 0, then µ′0 = E(X⁰) = E(1) = 1.
▶ If r = 1, then µ′1 = E(X) = µ.
▶ If r = 2, then µ′2 = E(X²) = Var(X) + [E(X)]² = σ² + µ².
  Therefore, σ² = µ′2 − (µ′1)².
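As a short worked example (assumed, not from the original slides), the sympy sketch below computes the first two raw moments of the continuous density f(x) = 3x² on [0, 1] and recovers σ² = µ′2 − (µ′1)².

  import sympy as sp

  x = sp.symbols('x')

  # Assumed density on [0, 1]; it integrates to 1
  f = 3 * x**2

  mu1 = sp.integrate(x * f, (x, 0, 1))       # mu'_1 = E(X)   = 3/4
  mu2 = sp.integrate(x**2 * f, (x, 0, 1))    # mu'_2 = E(X^2) = 3/5
  var = sp.simplify(mu2 - mu1**2)            # sigma^2 = 3/80

  print(mu1, mu2, var)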
Moment Generating Function (MGF)

▶ The MGF of a random variable, when it exists, completely
  determines its probability distribution.
▶ The moment-generating function of the random variable X is
  given by E(e^{tX}) and is denoted by MX(t). Hence,

  MX(t) = E(e^{tX}) = Σ_x e^{tx} f(x),            if X is discrete,
                    = ∫_{−∞}^{∞} e^{tx} f(x) dx,  if X is continuous.

▶ Moment Formula: Let X be a random variable with moment
  generating function MX(t). Then

  d^r MX(t) / dt^r |_{t=0} = µ′r .
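The sympy sketch below (an assumed example, not from the original slides) applies the moment formula to the known MGF of an exponential random variable with rate λ, MX(t) = λ/(λ − t) for t < λ, recovering µ′1 = 1/λ, µ′2 = 2/λ², and hence σ² = 1/λ².

  import sympy as sp

  t = sp.symbols('t')
  lam = sp.symbols('lambda', positive=True)

  # Known MGF of an exponential(lambda) random variable, valid for t < lambda
  M = lam / (lam - t)

  mu1 = sp.diff(M, t, 1).subs(t, 0)          # first raw moment  mu'_1 = 1/lambda
  mu2 = sp.diff(M, t, 2).subs(t, 0)          # second raw moment mu'_2 = 2/lambda^2
  var = sp.simplify(mu2 - mu1**2)            # sigma^2 = 1/lambda^2

  print(sp.simplify(mu1), sp.simplify(mu2), var)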
Properties of MGF

▶ Let X be a random variable and a be a constant. Then
  ▶ MX+a(t) = e^{at} MX(t).
  ▶ MaX(t) = MX(at).
▶ Uniqueness Theorem: Let X and Y be two random variables
  with moment generating functions MX(t) and MY(t),
  respectively. If MX(t) = MY(t) for all values of t, then X and Y
  have the same probability distribution.
▶ If X and Y are two independent random variables with moment
  generating functions MX(t) and MY(t), respectively, then

  MX+Y(t) = MX(t)MY(t).
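As an illustration of the last two properties (an assumed example, not from the original slides), the sketch below uses the known MGF of a Poisson random variable, M(t) = exp(µ(e^t − 1)): the product of the MGFs of independent Poisson(µ1) and Poisson(µ2) variables is exactly the Poisson(µ1 + µ2) MGF, so by the uniqueness theorem their sum is Poisson(µ1 + µ2).

  import sympy as sp

  t = sp.symbols('t')
  m1, m2 = sp.symbols('mu1 mu2', positive=True)

  # Known MGF of a Poisson(mu) random variable
  def poisson_mgf(mu, s):
      return sp.exp(mu * (sp.exp(s) - 1))

  product = poisson_mgf(m1, t) * poisson_mgf(m2, t)   # M_X(t) * M_Y(t), X and Y independent
  target = poisson_mgf(m1 + m2, t)                    # MGF of a Poisson(mu1 + mu2) variable

  print(product.equals(target))                       # True: the two MGFs coincide,
                                                      # so X + Y ~ Poisson(mu1 + mu2)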