
ECON 120B Econometrics

Lecture 1 Part 1: Review of Probability

Xinwei Ma

Department of Economics
UC San Diego

Spring 2021



Outline

Random Variables, Expectation, and Variance

Joint Distribution, Covariance, and Correlation

Conditional Distribution and Conditional Expectation

Some Important Properties



Random Variables, Expectation, and Variance

 Probability theory starts with random variables, which provide a numerical summary of random outcomes.

• EXAMPLE. Gambling outcome, crop yield, GDP next year, test score, etc.

• We use uppercase letters, such as X, Y and Z, to denote random variables, and lowercase letters, x, y and z, for their realizations.

 A random variable is fully characterized by its cumulative distribution function (CDF)

FX (x) = P[X ≤ x]

 Depending on whether a random variable is discrete or continuous, we also use the probability mass function (PMF) or the probability density function (PDF) to summarize its distribution.



Random Variables, Expectation, and Variance

 A discrete random variable takes values in a discrete set.

• It can take finitely or infinitely many values.

• We generally use the probability mass function (PMF) to characterize its distribution

fX(x) = P[X = x]

[Figure: an example PMF of a discrete random variable]


Random Variables, Expectation, and Variance

 The mean (expectation) of a discrete random variable X is


E[X] = Σ_x x · P[X = x] = Σ_x x · fX(x).

• It is the weighted average of all possible values of X .

• It is the long-run average value of X over repeated realizations.

 EXAMPLE.

x -1 2 3
fX (x) 0.3 0.5 0.2
x · fX (x) -0.3 1.0 0.6

which gives E[X ] = 1.3.



Random Variables, Expectation, and Variance

 The variance of a discrete random variable X is


V[X] = Σ_x (x − E[X])² · P[X = x] = Σ_x (x − E[X])² · fX(x).

• It is a measure of the squared spread of the distribution.

 A related concept is the standard deviation, which is


sd[X] = √V[X].

• It is a measure of the spread of the distribution.

 EXAMPLE. (Recall E[X ] = 1.3)

x -1 2 3
fX (x) 0.3 0.5 0.2
x − E[X ] -2.3 0.7 1.7
(x − E[X])² 5.29 0.49 2.89
(x − E[X])² fX(x) 1.587 0.245 0.578

which gives V[X ] = 2.41 and sd[X ] = 1.55.
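As a quick check, here is a minimal Python sketch of the same calculation (not part of the original slides; it assumes numpy is installed):

import numpy as np

x = np.array([-1.0, 2.0, 3.0])       # support of X
f = np.array([0.3, 0.5, 0.2])        # PMF values fX(x)

mean = np.sum(x * f)                 # E[X] = sum of x * fX(x) = 1.3
var = np.sum((x - mean) ** 2 * f)    # V[X] = sum of (x - E[X])^2 * fX(x) = 2.41
sd = np.sqrt(var)                    # sd[X] = sqrt(V[X]) ≈ 1.55

print(mean, var, sd)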


Random Variables, Expectation, and Variance

 Bernoulli distribution

• Domain (what values does X take)


0 and 1

• Probability mass function (PMF)


P[X = 0] = 1 − p, P[X = 1] = p

• Parameters of this distribution


0 < p < 1

• Mean
E[X ] = p

• Variance
V[X ] = p(1 − p)



Random Variables, Expectation, and Variance

 Binomial distribution

• Domain (what values does X take)


0, 1, 2, · · · , n

• Probability mass function (PMF)

P[X = x] = (n choose x) · p^x · (1 − p)^(n−x),   where (n choose x) = n! / (x! (n − x)!)

• Parameters of this distribution


n > 0, 0 < p < 1

• Mean
E[X ] = np

• Variance
V[X ] = np(1 − p)



Random Variables, Expectation, and Variance

 EXAMPLE. Let Y denote the number of “heads” that occur when three fair coins are
tossed.

1. Derive the probability distribution of Y .

2. Derive the mean and variance of Y .

 Solution. The random variable, Y , follows the Binomial(3, 1/2) distribution. Then we can
apply the formula to find its distribution (PMF). For example, the probability of
observing two “heads” is

P[Y = 2] = (3 choose 2) · (1/2)² · (1 − 1/2)^(3−2) = 3/8.

The mean and variance can also be found by employing the corresponding formulas:

E[Y] = 3 · (1/2) = 3/2

V[Y] = 3 · (1/2) · (1 − 1/2) = 3/4.

As a by-product, the standard deviation is √V[Y] = √3/2 ≈ 0.866.
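The same numbers can be read off from software; here is an illustrative Python sketch (not part of the original slides; it assumes scipy is installed):

from scipy.stats import binom

n, p = 3, 0.5
print(binom.pmf(2, n, p))   # P[Y = 2] = 0.375 = 3/8
print(binom.mean(n, p))     # E[Y] = 1.5
print(binom.var(n, p))      # V[Y] = 0.75
print(binom.std(n, p))      # sd[Y] ≈ 0.866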



Random Variables, Expectation, and Variance
 A continuous random variable takes values in a continuum/interval.

• We generally use the probability density function (PDF) to characterize its distribution

fX(x) · δ ≈ P[x − δ/2 ≤ X ≤ x + δ/2].

More precisely, for an interval [a, b],

P[a ≤ X ≤ b] = ∫_a^b fX(x) dx.

[Figure: an example PDF of a continuous random variable]

• The PDF does not represent probabilities. The integral of the PDF does.
Random Variables, Expectation, and Variance

 The mean (expectation) of a continuous random variable X is


E[X] = ∫ x · fX(x) dx.

• It is the weighted average of all possible values of X .

• It is the long-run average value of X over repeated realizations.

 EXAMPLE.

Given fX(x) = 1 − |x|, for |x| ≤ 1

⇒ E[X] = ∫_{−1}^{1} x(1 − |x|) dx = ∫_{−1}^{0} x(1 + x) dx + ∫_{0}^{1} x(1 − x) dx = 0.



Random Variables, Expectation, and Variance

 The variance of a continuous random variable X is


V[X] = ∫ (x − E[X])² · fX(x) dx.

• It is a measure of the squared spread of the distribution.

 A related concept is the standard deviation, which is


sd[X] = √V[X].

• It is a measure of the spread of the distribution.

 EXAMPLE. (Recall E[X ] = 0)

Given fX(x) = 1 − |x|, for |x| ≤ 1

⇒ V[X] = ∫_{−1}^{1} (x − E[X])² (1 − |x|) dx = ∫_{−1}^{1} x² (1 − |x|) dx

⇒ V[X] = 2 ∫_{0}^{1} x² (1 − x) dx = 1/6,

which also gives sd[X] = 1/√6.
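These integrals are easy to verify numerically; a minimal Python sketch (not part of the original slides; it assumes scipy is installed):

from scipy.integrate import quad

f = lambda x: 1 - abs(x)                                  # PDF on [-1, 1]

mean, _ = quad(lambda x: x * f(x), -1, 1)                 # E[X] = 0
var, _ = quad(lambda x: (x - mean) ** 2 * f(x), -1, 1)    # V[X] = 1/6

print(mean, var)   # approximately 0.0 and 0.1667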
Random Variables, Expectation, and Variance

 Uniform distribution

• Domain (what values does X take)


[a, b]

• Probability density function (PDF)


fX(x) = 1/(b − a), for x ∈ [a, b]

• Parameters of this distribution


a and b

• Mean
E[X] = (a + b)/2

• Variance
V[X] = (b − a)²/12
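For a concrete case, here is an illustrative Python sketch for the Uniform[2, 5] distribution (not part of the original slides; the endpoints 2 and 5 are arbitrary choices, and scipy parameterizes the distribution by loc = a and scale = b − a):

from scipy.stats import uniform

a, b = 2.0, 5.0
U = uniform(loc=a, scale=b - a)   # Uniform on [a, b]

print(U.mean())   # (a + b) / 2 = 3.5
print(U.var())    # (b - a)^2 / 12 = 0.75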



Normal (Gaussian) Distribution

 The normal distribution has a unique role in probability and statistics, due to the central
limit theorem (to be discussed later).

 A normal distribution has two parameters: the mean µ and the variance σ²

 Its PDF and CDF are given by


φµ,σ²(x) = (1/√(2πσ²)) e^(−(x−µ)²/(2σ²)),   Φµ,σ²(x) = ∫_{−∞}^{x} (1/√(2πσ²)) e^(−(t−µ)²/(2σ²)) dt
• We reserve the notations φµ,σ²(x) and Φµ,σ²(x) for the PDF and the CDF of the normal distribution with mean µ and variance σ².

 The standard normal distribution is obtained by setting µ = 0 and σ² = 1 (zero mean and unit variance)

φ(x) = (1/√(2π)) e^(−x²/2),   Φ(x) = ∫_{−∞}^{x} (1/√(2π)) e^(−t²/2) dt

• We reserve the notations φ(x) and Φ(x) for the PDF and the CDF of the standard normal distribution.

 In practice, we use software (such as Stata, R, Matlab) or tabulated values to find values
of φµ,σ²(x) and Φµ,σ²(x).
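For instance, a minimal Python sketch using scipy (not part of the original slides; it assumes scipy is installed):

from scipy.stats import norm

print(norm.pdf(0.0))                      # φ(0) ≈ 0.3989
print(norm.cdf(-2.36))                    # Φ(−2.36) ≈ 0.0091
print(norm.cdf(0.38))                     # Φ(0.38) ≈ 0.6480
print(norm.cdf(3.0, loc=1.0, scale=2.0))  # CDF for mean 1, variance 4 (scale is the sd)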
Random Variables, Expectation, and Variance

 PDF of the standard normal distribution

[Figure: plot of the standard normal PDF over the range −4 to 4]



Random Variables, Expectation, and Variance

 EXAMPLE. Let X follow the standard normal distribution. What is the probability
P[X ≤ x] = Φ(x)?

 Let x be, say, −2.36. First go to the row labeled “−2.3,” and then go to the column
labeled “6.” The probability is P[X ≤ −2.36] = Φ(−2.36) = 0.0091.

Table. Cumulative distribution function of the standard normal distribution, P[X ≤ x] = Φ(x).
Note. This table can be used to calculate P[X ≤ x] where X is a standard normal random variable.
For example, for x = 1.17, this probability is 0.8790, which is the table entry for the row
labeled 1.1 and the column labeled 7.

Second Decimal Value of x

   x      0      1      2      3      4      5      6      7      8      9
−2.9   0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
−2.8   0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
−2.7   0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
−2.6   0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
−2.5   0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
−2.4   0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
−2.3   0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
−2.2   0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
−2.1   0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
−2.0   0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
−1.9   0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
−1.8   0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
Random Variables, Expectation, and Variance

 EXAMPLE. Let X follow the standard normal distribution. What is the probability
P[X ≤ x] = Φ(x)?

 Let x be, say, 0.38. First go to the row labeled “0.3,” and then go to the column labeled
“8.” The probability is P[X ≤ 0.38] = Φ(0.38) = 0.6480.

Second Decimal Value of x

   x      0      1      2      3      4      5      6      7      8      9
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
Random Variables, Expectation, and Variance

 EXAMPLE. If Y is distributed N(1, 4), find P[Y ≤ 3].

 Solution. First of all, we cannot use the tabulated values directly, as they are for the
standard normal distribution. To proceed, consider

P[Y ≤ 3] = P[Y − 1 ≤ 3 − 1] = P[(Y − 1)/√4 ≤ (3 − 1)/√4],

where we subtract and divide by the same quantities on the two sides of the inequality.
Define X = (Y − 1)/√4; then X follows the standard normal distribution. That is,

P [Y ≤ 3] = P [X ≤ 1] = Φ(1).

Finally, we can use the table to find the above probability, which is 0.8413.

(See the section, “Some Important Properties,” if you are unfamiliar with the above reasoning.)
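The same answer comes directly out of software; a minimal Python sketch (not part of the original slides; it assumes scipy is installed, and note that scipy's scale argument is the standard deviation):

from scipy.stats import norm

direct = norm.cdf(3, loc=1, scale=2)          # P[Y ≤ 3] for Y distributed N(1, 4)
standardized = norm.cdf((3 - 1) / 4 ** 0.5)   # Φ(1) after standardizing
print(direct, standardized)                   # both ≈ 0.8413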



Outline

Random Variables, Expectation, and Variance

Joint Distribution, Covariance, and Correlation

Conditional Distribution and Conditional Expectation

Some Important Properties



Joint Distribution, Covariance, and Correlation

 With two or more random variables, we rely on their joint distribution to characterize
their properties.

• If both X and Y are discrete, the joint probability mass function (PMF) is
fX ,Y (x, y ) = P[X = x, Y = y ].

• If both X and Y are continuous, the joint probability density function (PDF) is

fX,Y(x, y) · δ² ≈ P[x − δ/2 ≤ X ≤ x + δ/2, y − δ/2 ≤ Y ≤ y + δ/2].

More precisely,

P[a ≤ X ≤ b, c ≤ Y ≤ d] = ∫_c^d ∫_a^b fX,Y(x, y) dx dy.



Joint Distribution, Covariance, and Correlation

 The covariance between X and Y is


Cov[X, Y] = E[(X − E[X])(Y − E[Y])].

• It is a measure of the linear association between X and Y .

• EXAMPLE. Assume X follows the standard normal distribution. Clearly X and X² are related, but Cov[X, X²] = 0.

 A related concept is the correlation between X and Y

Corr[X, Y] = Cov[X, Y] / (√V[X] · √V[Y]).

• Correlation is always between -1 and 1.

• Correlation is unit-free (i.e., it does not depend on how X and Y are measured).

• EXAMPLE. Changing the unit of X from meters to centimeters will make the covariance 100
times larger, but leaves the correlation unchanged.
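Both claims are easy to see in a simulation; an illustrative Python sketch (not part of the original slides; the particular way Y is generated below is an arbitrary choice for illustration):

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
y = 0.5 * x + rng.standard_normal(1_000_000)     # some Y that is correlated with X

print(np.cov(x, x ** 2)[0, 1])                                 # close to 0: X and X² are uncorrelated
print(np.cov(x, y)[0, 1], np.cov(100 * x, y)[0, 1])            # second is about 100 times the first
print(np.corrcoef(x, y)[0, 1], np.corrcoef(100 * x, y)[0, 1])  # the two correlations are identical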



Joint Distribution, Covariance, and Correlation
 EXAMPLE. Class size and test score
• Test scores and class sizes in 1990 in 420 California school districts that serve kindergarten
through eighth grade.
• Test score: district-wide average of reading and math scores for fifth graders.
• Class size: district-wide student-to-teacher ratio.

 Sample correlation between testScore and stuTeacherRatio is −0.2263.

 Scatter plot suggests negative relationship between testScore and stuTeacherRatio.


[Scatter plot: Average Test Score against Student-Teacher Ratio, with fitted line]



Joint Distribution, Covariance, and Correlation

 Positive, negative, and zero correlation: larger values of X are associated with · · · values
of Y .



Outline

Random Variables, Expectation, and Variance

Joint Distribution, Covariance, and Correlation

Conditional Distribution and Conditional Expectation

Some Important Properties



Conditional Distribution and Conditional Expectation

 Given the joint distribution of two random variables, we can also find the conditional
distribution of one variable after fixing the value of the other.

 If both X and Y are discrete, the conditional distribution of Y given X = x is summarized in the conditional PMF

fY|X=x(y|x) = P[Y = y|X = x] = P[Y = y, X = x] / P[X = x].

Similarly, the conditional PMF of X given Y = y is

fX|Y=y(x|y) = P[X = x|Y = y] = P[X = x, Y = y] / P[Y = y].

 We can also compute the conditional expectation and the conditional variance as
E[Y|X = x] = Σ_y y · P[Y = y|X = x] = Σ_y y · fY|X=x(y|x),

and

V[Y|X = x] = Σ_y (y − E[Y|X = x])² · P[Y = y|X = x] = Σ_y (y − E[Y|X = x])² · fY|X=x(y|x).



Conditional Distribution and Conditional Expectation

 EXAMPLE. Conditional distribution and conditional expectation

X =0 X =1 X =3 Total
Y =1 0.16 0.06 0.22 0.44
Y =2 0.13 0.01 0.14 0.28
Y =5 0.06 0.19 0.03 0.28
Total 0.35 0.26 0.39 1.00

• Find P[X = 1] and P[Y = 5].

• Find P[Y = 2|X = 3].


P[Y = 2|X = 3] = P[Y = 2, X = 3] / P[X = 3] = 0.14 / 0.39.



Conditional Distribution and Conditional Expectation

 EXAMPLE. Conditional distribution and conditional expectation

X =0 X =1 X =3 Total
Y =1 0.16 0.06 0.22 0.44
Y =2 0.13 0.01 0.14 0.28
Y =5 0.06 0.19 0.03 0.28
Total 0.35 0.26 0.39 1.00

• Find E[Y |X = 0], E[Y |X = 1], and E[Y |X = 3].


E[Y|X = 0] = 1 × (0.16/0.35) + 2 × (0.13/0.35) + 5 × (0.06/0.35) = 2.06
E[Y|X = 1] = 1 × (0.06/0.26) + 2 × (0.01/0.26) + 5 × (0.19/0.26) = 3.96
E[Y|X = 3] = 1 × (0.22/0.39) + 2 × (0.14/0.39) + 5 × (0.03/0.39) = 1.67.
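The same calculation in a minimal Python sketch (not part of the original slides; it assumes numpy is installed):

import numpy as np

y_vals = np.array([1.0, 2.0, 5.0])
joint = np.array([[0.16, 0.06, 0.22],   # rows: Y = 1, 2, 5
                  [0.13, 0.01, 0.14],   # columns: X = 0, 1, 3
                  [0.06, 0.19, 0.03]])

p_x = joint.sum(axis=0)                                  # marginal PMF of X: 0.35, 0.26, 0.39
cond_exp = (y_vals[:, None] * joint).sum(axis=0) / p_x   # E[Y|X = x] for each x
print(cond_exp)                                          # approximately [2.06, 3.96, 1.67]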



Conditional Distribution and Conditional Expectation

 We usually treat conditional expectations as random variables. The reason is that the
value of the conditional expectation, E[Y |X ], depends on the value of X , where the latter
is random.

 EXAMPLE. Conditional distribution and conditional expectation

X =0 X =1 X =3 Total
Y =1 0.16 0.06 0.22 0.44
Y =2 0.13 0.01 0.14 0.28
Y =5 0.06 0.19 0.03 0.28
Total 0.35 0.26 0.39 1.00
We already computed the values

E[Y |X = 0] = 2.06 E[Y |X = 1] = 3.96 E[Y |X = 3] = 1.67,

which means E[Y|X] takes three different values, with probabilities

E[Y|X] = 2.06 with probability P[X = 0] = 0.35
E[Y|X] = 3.96 with probability P[X = 1] = 0.26
E[Y|X] = 1.67 with probability P[X = 3] = 0.39.
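Averaging E[Y|X] over the distribution of X recovers E[Y], which previews the law of iterated expectations discussed later. A minimal Python sketch (not part of the original slides; the small discrepancy comes from rounding the conditional expectations to two decimals):

cond_exp = [2.06, 3.96, 1.67]   # E[Y|X = 0], E[Y|X = 1], E[Y|X = 3]
p_x = [0.35, 0.26, 0.39]        # P[X = 0], P[X = 1], P[X = 3]

e_y_iterated = sum(c * p for c, p in zip(cond_exp, p_x))
e_y_direct = 1 * 0.44 + 2 * 0.28 + 5 * 0.28   # from the marginal PMF of Y
print(e_y_iterated, e_y_direct)               # both approximately 2.40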



Conditional Distribution and Conditional Expectation

 If both X and Y are continuous, the conditional distribution of Y given X = x is summarized in the conditional PDF

fY|X=x(y|x) = fX,Y(x, y) / fX(x).

Similarly, the conditional PDF of X given Y = y is

fX|Y=y(x|y) = fX,Y(x, y) / fY(y).

 We can also compute the conditional expectation and the conditional variance as
E[Y|X = x] = ∫ y · fY|X=x(y|x) dy,

and

V[Y|X = x] = ∫ (y − E[Y|X = x])² · fY|X=x(y|x) dy.



Outline

Random Variables, Expectation, and Variance

Joint Distribution, Covariance, and Correlation

Conditional Distribution and Conditional Expectation

Some Important Properties



Some Important Properties

 Important properties of expectation

• Expectation is linear
E[aX + bY ] = a · E[X ] + b · E[Y ].

• Under independence, expectation of the product is the product of expectations


X and Y independent ⇒ E[XY ] = E[X ] · E[Y ].
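Both properties are easy to check by simulation; an illustrative Python sketch (not part of the original slides; the particular distributions and constants below are arbitrary choices):

import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(2.0, size=1_000_000)   # E[X] = 2
y = rng.normal(3.0, 1.0, size=1_000_000)   # E[Y] = 3, drawn independently of X
a, b = 4.0, -2.0

print(np.mean(a * x + b * y), a * x.mean() + b * y.mean())   # linearity: both ≈ 2
print(np.mean(x * y), x.mean() * y.mean())                   # independence: both ≈ 6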



Some Important Properties

 Important properties of variance and standard deviation

• Variance is homogeneous of degree 2


V[aX + b] = a² · V[X].

• Standard deviation is homogeneous of degree 1


sd[aX + b] = √V[aX + b] = |a| · sd[X].

• Under independence, variance of the sum is the sum of variances


X and Y independent ⇒ V[X + Y ] = V[X ] + V[Y ].

• Important application: variance of the sum of n identically and independently distributed (iid) variables

X1, X2, · · · , Xn iid ⇒ V[Σ_{i=1}^n Xi] = n · V[X].

• Important application: variance of the average of n iid variables

X1, X2, · · · , Xn iid ⇒ V[(1/n) Σ_{i=1}^n Xi] = V[X̄] = V[X]/n.
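A quick simulation check of the last formula; an illustrative Python sketch (not part of the original slides; the exponential distribution and the value of n are arbitrary choices):

import numpy as np

rng = np.random.default_rng(2)
n, reps = 25, 200_000
draws = rng.exponential(scale=2.0, size=(reps, n))   # V[X] = 4 for this distribution

print(draws.mean(axis=1).var())   # variance of the sample average: ≈ 4 / 25 = 0.16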



Some Important Properties

 Important properties of covariance and correlation

• The covariance/correlation between a random variable and a number is always zero


Cov[X , a] = 0, Corr[X , a] = 0.

• Covariance is homogeneous of degree 1


Cov[aX + b, cY + d] = a · c · Cov[X , Y ].

• Correlation is homogeneous of degree 0


Corr[aX + b, cY + d] = Corr[X , Y ].
Implication: correlation is robust to the unit of measurement.

• Independence implies zero covariance/correlation


X and Y independent ⇒ Cov[X , Y ] = 0, Corr[X , Y ] = 0.



Some Important Properties

 Important properties of variance and covariance

• Variance and covariance decomposition


V[X] = E[X²] − (E[X])²,   Cov[X, Y] = E[XY] − E[X]E[Y].

• The covariance between X and itself is the variance of X .


Cov[X , X ] = V[X ].
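The variance decomposition can be checked on the discrete example from earlier (E[X] = 1.3, V[X] = 2.41); a minimal Python sketch (not part of the original slides; it assumes numpy is installed):

import numpy as np

x = np.array([-1.0, 2.0, 3.0])
f = np.array([0.3, 0.5, 0.2])

e_x = np.sum(x * f)          # E[X] = 1.3
e_x2 = np.sum(x ** 2 * f)    # E[X²] = 4.1
print(e_x2 - e_x ** 2)       # 2.41, the same as the direct calculation of V[X]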



Some Important Properties

 Assume we have three random variables, X , Y and Z . The law of iterated expectation
says

E[ E[X|Y, Z] | Z ] = E[X|Z]

• Intuition. The inner expectation, E[X |Y , Z ], calculates the average value of X for each slice of
(Y , Z ). The outer expectation further averages across different values of Y .

 More generally, we have

E[ E[X|“some other stuff”, Z] | Z ] = E[X|Z]

 A special case: E[E[X |Y ]] = E[X ].

 Implication. Constant conditional mean implies zero correlation/covariance.


Cov[X , Y ] = E[XY ] − E[X ]E[Y ] property of covariance

= E[E[XY |Y ]] − E[X ]E[Y ] from the above special case

= E[E[X |Y ] · Y ] − E[X ]E[Y ] Y is “known” after conditioning on Y

= E[E[X ] · Y ] − E[X ]E[Y ] E[X |Y ] = E[X ]

= E[X ]E[Y ] − E[X ]E[Y ] = 0.



The lectures and course materials, including slides, tests, outlines, and similar
materials, are protected by U.S. copyright law and by University policy. You may take
notes and make copies of course materials for your own use. You may also share those
materials with another student who is enrolled in or auditing this course.

You may not reproduce, distribute or display (post/upload) lecture notes or


recordings or course materials in any other way – whether or not a fee is charged –
without my written consent. You also may not allow others to do so.

If you do so, you may be subject to student conduct proceedings under the UC San
Diego Student Code of Conduct.

© Xinwei Ma 2021
x1ma@ucsd.edu
