Probability & probability
distributions
Appendix A in G&P
Outcomes
• To show that you have mastered this topic you should be able to:
• Define/describe key terms and concepts such as random variable,
conditional probability, statistical independence
• Determine the probability of an r.v. having specific values
• Find marginal distributions
• Determine conditional PDFs
Experiment vs Population
• In order to estimate a regression model, you need to measure the
variables
• Any process of observation or measurement:
• Which has more than one outcome,
• And where there is uncertainty about which outcome will be realised,
is called a random experiment.
• Example, toss a coin – head or tail.
• Or throw a die – 1, 2, 3, 4, 5, or 6.
• The set of all possible outcomes of such an experiment is called the
population or sample space.
Experiment vs Population
• If two coins are flipped, the possible outcomes are: head-head; head-
tail; tail-tail; tail-head.
• The population is all the outcomes {HH, HT, TT, TH}.
• Sample point is one outcome in the sample space, e.g. TH.
• An event is a certain collection of outcomes (or subset of the sample
space), e.g. HH and HT.
• Venn diagrams can be used to illustrate these concepts – see figure A-
1.
• Rectangle: Sample space; Circles: Events
• (a) A vs A’
• (b) A union B (A ∪ B)
• (c) intersection (A ∩ B)
• (d) mutually exclusive (A ∩ B = ∅)
Random variables
• Writing the outcome of an experiment as HH, TT, TH or HT takes time and
space; we would rather work with numbers.
• A variable is the outcome of an experiment, described numerically.
• Suppose the variable is the number of “heads” in a two-coin toss:
Coin 1 Coin 2 # of heads
T T 0
T H 1
H T 1
H H 2
• A variable is random or stochastic, when its numerical value is determined by the
outcome of an experiment.
• A discrete random variable: takes on a finite (or countably infinite) number of values, e.g. 0, 1, 2
• A continuous random variable: takes on values in some interval e.g. a height range of
60-72 inches.
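The mapping in the table above can be sketched directly – a minimal Python illustration of a random variable as a function from outcomes to numbers:

```python
from itertools import product

# Sample space of a two-coin toss: each outcome is a pair like ('H', 'T')
sample_space = list(product('HT', repeat=2))

# The random variable X maps each outcome to a number: the count of heads
X = {outcome: outcome.count('H') for outcome in sample_space}

# e.g. ('T', 'T') maps to 0, ('H', 'T') and ('T', 'H') map to 1, ('H', 'H') to 2
print(X)
```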
Recap
• You now know:
• The outcome / result of an experiment is a sample point.
• When we express this result of the experiment as a numerical value, it is
called a random variable.
• But in any experiment the outcome is not known in advance – there is
uncertainty.
• Because of uncertainty, you must ask yourself, what is the chance that
this outcome will realise?
• The next step is to determine the probability that a certain outcome
will realise.
Probability
• The probability of an event is given by the relative frequency with
which the event occurs.
• The probability that A will occur is:
• P(A) = (number of outcomes favourable to A) / (total number of outcomes)
• E.g. if a coin is tossed, the probability that it lands on tail is ½ (1
out of 2), or 50%.
• Probability is written as a number between 0 and 1 or as a
percentage.
Relative frequency & probability
• What happens if the outcome of an experiment is not finite or equally
likely?
• Absolute frequency – number of occurrences of a given event
• Relative frequency – absolute number divided by total number of
occurrences = probability
• In n observations, if m is favourable to event A, the probability of A
(P(A)) = m/n
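The relative-frequency idea can be sketched with a simulation (the event "tails", the seed and the sample size are illustrative choices, not from the text):

```python
import random

random.seed(42)  # reproducible illustration

n = 100_000  # n observations
# m: number of observations favourable to event A ("tails"), simulated fair coin
m = sum(random.random() < 0.5 for _ in range(n))

p_hat = m / n  # relative frequency = estimate of P(A)
print(p_hat)   # close to the theoretical 0.5
```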
Probability
• The closer the probability is to 0, the less likely the event; a
probability close to 1 means it is highly likely that the event will
realise.
• Characteristics of probabilities:
• Probability lies between 0 and 1.
• If events A, B, C are mutually exclusive, the probability that any one of them
will occur = the sum of their individual probabilities
• The sum of all the probabilities in a sample space must equal 1.
Rules of probability
• Events are statistically independent if the prob of them occurring
together = product of their individual (or marginal) probs
• P(ABC) = P(A)P(B)P(C) [joint prob]
• What if events are not mutually exclusive?
• P(A+B) = P(A) + P(B) – P(AB)
• For every event (A), there is a complement (A')
• P(A+A') = 1 and P(AA') = 0
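The addition rule for events that are not mutually exclusive can be checked by enumeration on the two-coin sample space (the events A and B below are illustrative):

```python
from fractions import Fraction
from itertools import product

space = list(product('HT', repeat=2))  # four equally likely outcomes

def P(event):
    """Probability of an event: favourable outcomes / total outcomes."""
    return Fraction(sum(1 for s in space if event(s)), len(space))

def A(s):  # at least one head
    return 'H' in s

def B(s):  # at least one tail
    return 'T' in s

# Addition rule: P(A+B) = P(A) + P(B) - P(AB)
lhs = P(lambda s: A(s) or B(s))
rhs = P(A) + P(B) - P(lambda s: A(s) and B(s))
print(lhs, rhs)   # both sides agree
```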
• Conditional probability – what is the prob. of event A occurring, if
event B has already occurred?
Random variables and their probability
distributions
• The possible values taken by a random variable and the probabilities
of occurrence of those values
• The probability mass function (PMF) of a discrete random variable X is
f(xi) = P(X = xi)
with 0 ≤ f(xi) ≤ 1 and Σx f(xi) = 1.
• P(X=xi): the probability that the discrete random variable X takes the
numerical value xi.
Random variables and their probability
distributions
• Example: r.v X is the number of “heads” obtained in tossing 2 coins –
see Example A.5
Number of “heads” PMF
X f(X)
0 ¼
1 ½
2 ¼
Sum 1
• It can be illustrated graphically as the probability mass function
PMF of a discrete random variable
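The PMF in the table can be recomputed by enumerating the sample space – a minimal sketch:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

space = list(product('HT', repeat=2))          # 4 equally likely outcomes
counts = Counter(s.count('H') for s in space)  # frequency of each value of X

# PMF: f(x) = (outcomes with X = x) / (total outcomes)
pmf = {x: Fraction(c, len(space)) for x, c in sorted(counts.items())}
print(pmf)   # f(0) = 1/4, f(1) = 1/2, f(2) = 1/4, summing to 1
```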
Random variables and their probability
distributions
• For a continuous random variable, probability is measured over an
interval. The analogue of the PMF is called the probability density function (PDF).
Cumulative Distribution Functions
• Summing the PMF (or integrating the PDF) of X up to a value x gives the
cumulative distribution function (CDF):
F(x) = P(X ≤ x)
• The CDF is the probability that X takes a value less than or equal to a
given x.
Number of “heads” PMF CDF
X f(X) F(X)
0 ¼ ¼
1 ½ ¾
2 ¼ 1
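The CDF column is just a running sum of the PMF column, which can be sketched as:

```python
from fractions import Fraction
from itertools import accumulate

# PMF of the number of heads in a two-coin toss (from the table)
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# CDF: F(x) = P(X <= x), the running sum of the PMF over ordered values of X
xs = sorted(pmf)
cdf = dict(zip(xs, accumulate(pmf[x] for x in xs)))
print(cdf)   # F(0) = 1/4, F(1) = 3/4, F(2) = 1
```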
Discrete and continuous CDFs
Multivariate probability density functions
• An event may be described by more than one r.v.
• The case of a bivariate probability distribution – the number of PCs and
printers sold (Tables A-2 and A-3)
Multivariate probability density functions
• What is measured is the number of days that different combinations
of PCs and printers are sold.
• It is a joint frequency distribution: The number of times that 4 PCs
and 4 printers are sold, is 30 (the absolute frequency).
• By dividing it by the total number of days, the relative frequency can
be determined.
• And this can be interpreted as a joint probability.
• The probability that 4 PCs and 4 printers are sold, is 30/200 = 0.15, or
15%
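The arithmetic behind that joint probability, as a short sketch:

```python
# From the example: 4 PCs and 4 printers were sold together on 30 of 200 days
absolute_frequency = 30
total_days = 200

# Relative frequency, interpreted as the joint probability
joint_probability = absolute_frequency / total_days
print(joint_probability)   # 0.15
```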
Multivariate probability density functions
• The joint PMF is f(X, Y) = P(X = x, Y = y)
• with properties:
f(X, Y) ≥ 0 and Σx Σy f(X, Y) = 1
Marginal probabilities
• In relation to the joint probability f(X,Y), f(X) and f(Y) are individual or
marginal probabilities.
• To determine the marginal probability of X:
• Add the joint probabilities that coincide with given X-values, regardless of the
values taken by Y.
• Thus, sum down the column.
• To find the marginal probability of Y, sum across the row.
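Summing down columns and across rows can be sketched on a small joint PMF (the table below is a hypothetical 2×2 example, not the textbook's Table A-3):

```python
from fractions import Fraction

# Hypothetical joint PMF f(X, Y), chosen so the probabilities sum to 1
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 10),
}

# Marginal of X: sum joint probabilities over all Y (down the column);
# marginal of Y: sum joint probabilities over all X (across the row)
f_X, f_Y = {}, {}
for (x, y), p in joint.items():
    f_X[x] = f_X.get(x, 0) + p
    f_Y[y] = f_Y.get(y, 0) + p

print(f_X, f_Y)   # each marginal sums to 1
```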
Conditional Probability Functions
• The probability that A will take place should be adapted once more
information is available regarding a related event B.
• Thus, a conditional probability has to be determined.
• E.g. What is the probability that 4 printers will be sold, given that 4
PCs are sold?
f(Y = 4 | X = 4) = f(Y = 4 and X = 4) / f(X = 4)
• = 0.15/0.32 ≈ 0.47
Statistical Independence
• Two events are statistically independent if the value taken by one
variable has no effect on the value taken by the other.
• E.g. smoking and lung cancer are not independent events, but ice cream
sales and shark attacks at the beach may be independent events.
• Formally, X and Y are statistically independent when their joint PDF
can be expressed as the product of their marginal PDFs.
• Later, in regression analysis, we will assume that the variables are
independent stochastic variables
Statistical Independence
• Events A and B are statistically independent when the following
conditions hold:
P(A | B) = P(A)
P(B | A) = P(B)
P(A ∩ B) = P(A)P(B)
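The factorisation condition can be verified for two fair coins, which are independent by design:

```python
from fractions import Fraction

# Two independent fair coins: X = result of coin 1, Y = result of coin 2.
# The joint PMF gives 1/4 to each pair; each marginal gives 1/2 to H and T.
joint = {(x, y): Fraction(1, 4) for x in 'HT' for y in 'HT'}
f_X = dict.fromkeys('HT', Fraction(1, 2))
f_Y = dict.fromkeys('HT', Fraction(1, 2))

# Independence: the joint PMF factors into the product of the marginals
independent = all(joint[x, y] == f_X[x] * f_Y[y] for x in 'HT' for y in 'HT')
print(independent)   # True
```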
For Exercise/ activity
• Study Guide: Activities at the end of Unit 2
• Gujarati and Porter: Questions and Problems at the end of Appendix A
• The equation P(A+B) = P(A) + P(B) – P(AB) appears on Slide 11 – why do
we subtract P(AB)? [Tip: look at the Venn diagrams on Slide 5]