Outline
1. Some basic ideas
2. The problem of estimation: OLS method
3. Interval estimation and hypothesis testing
4. Extensions of the two-variable linear regression
model
10/24/2017 Mai VU-FIE-FTU 2
1. Some basic ideas
A hypothetical example
The population regression function (PRF)
The meaning of the term linear
Stochastic specification of PRF
The significance of the stochastic disturbance term
The sample regression function (SRF)
1.1. A hypothetical example
Consider a total population of 60 families in a
hypothetical community and their weekly income (X)
and weekly consumption expenditure (Y), both in
dollars
The 60 families are divided into 10 income groups
(from $80 to $260) and the weekly expenditures of
each family in the various groups are as shown in the
table.
Therefore, we have 10 fixed values of X and the corresponding Y values
against each of the X values; so to speak, there are 10 Y subpopulations.
There is considerable variation in weekly consumption
expenditure in each income group but on the average, weekly
consumption expenditure increases as income increases
1.1. A hypothetical example
We have given the mean, or average, weekly
consumption expenditure corresponding to each of
the 10 levels of income
In all we have 10 mean values for the 10 subpopulations
of Y.
We call these mean values conditional expected
values, as they depend on the given values of the
(conditioning) variable X.
Symbolically, we denote them as E(Y | X), which is
read as the expected value of Y given the value of X
1.1. A hypothetical example
Conditional expected values vs. unconditional
expected values
What is the difference between the following two questions?
“What is the expected value of weekly consumption
expenditure of a family?”
“What is the expected value of weekly consumption
expenditure of a family whose weekly income is, say,
$140?”
1.1. A hypothetical example
The dark circled points
show the conditional mean
values of Y against the
various X values.
If we join these conditional
mean values, we obtain
what is known as the
population regression line
(PRL), or more generally,
the population regression
curve.
More simply, it is the
regression of Y on X.
1.1. A hypothetical example
Geometrically, then, a population regression curve is
simply the locus of the conditional means of the
dependent variable for the fixed values of the
explanatory variable(s).
More simply, it is the curve connecting the means of
the subpopulations of Y corresponding to the given
values of the regressor X.
1.2. The population regression function
(PRF)
It is clear that each conditional mean E(Y|Xi) is a function
of Xi, where Xi is a given value of X. Symbolically:
E(Y|Xi)= f(Xi) (1)
where f(Xi) denotes some function of the explanatory
variable X
Equation (1) is known as the conditional expectation
function (CEF) or population regression function
(PRF).
It states merely that the expected value of the distribution
of Y given Xi is functionally related to Xi → It tells how the
mean or average response of Y varies with X.
1.2. The population regression function
(PRF)
We may assume that the PRF: E(Y|Xi) is a linear function
of Xi, say, of the type:
E(Y|Xi)=β1+ β2Xi (2)
where β1 and β2 are unknown but fixed parameters known as
the regression coefficients; β1 and β2 are also known as
intercept and slope coefficients, respectively.
Equation (2) is known as the linear population
regression function.
Some alternative expressions: linear population regression
model, linear population regression, regression, regression
equation
1.3. The meaning of the term linear
The term “linear” regression will always mean a
regression that is linear in the parameters, the β’s (that
is, the parameters are raised to the first power only). It
may or may not be linear in the explanatory variables,
the X’s.
Linearity in the Variables
Linearity in the Parameters
Linearity in the Variables
That is “the conditional expectation of Y is a linear
function of Xi”
Geometrically, the regression curve in this case is a
straight line.
Example:
Yi = β1 + (1 − β2)²Xi + ui
→ A linear (in the variable) but a nonlinear (in the
parameters) regression model
Linearity in the Parameters
That is, “the conditional expectation of Y, E(Y|Xi), is a
linear function of the parameters, the β’s”; it may or
may not be linear in the variable X.
Example:
Yi = β1 + β2(1/Xi) + ui
→ A linear (in the parameters) but a nonlinear (in the
variable) regression model.
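The distinction between the two kinds of linearity can be checked numerically. The sketch below tests whether a conditional-mean function is additive in its parameters, which any function that is linear in the parameters must be. The parameter values and the squared-slope model are illustrative assumptions, not taken from the slides.

```python
def reciprocal_mean(x, b1, b2):
    """E(Y|X) = b1 + b2*(1/X): nonlinear in X, linear in the parameters."""
    return b1 + b2 * (1.0 / x)


def squared_slope_mean(x, b1, b2):
    """E(Y|X) = b1 + (b2**2)*X: linear in X, nonlinear in the parameters."""
    return b1 + (b2 ** 2) * x


def is_additive_in_parameters(mean_fn, x, b, c, tol=1e-9):
    """Check m(x, b + c) == m(x, b) + m(x, c) at a single point.

    Failing this check proves nonlinearity in the parameters;
    passing it at one point is only a necessary condition.
    """
    lhs = mean_fn(x, b[0] + c[0], b[1] + c[1])
    rhs = mean_fn(x, *b) + mean_fn(x, *c)
    return abs(lhs - rhs) < tol


x = 4.0
b = (1.0, 2.0)  # hypothetical parameter values
c = (0.5, 3.0)  # hypothetical parameter values

print(is_additive_in_parameters(reciprocal_mean, x, b, c))     # True
print(is_additive_in_parameters(squared_slope_mean, x, b, c))  # False
```

Only the first model counts as a linear regression model in the sense used here, even though its regression curve is not a straight line in X.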
1.4. STOCHASTIC SPECIFICATION OF PRF
It is clear that, as family income increases, family
consumption expenditure on the average increases, too.
But what about the consumption expenditure of an
individual family in relation to its (fixed) level of income?
It is obvious that an individual family’s consumption
expenditure does not necessarily increase as the income
level increases.
For example, corresponding to the income level of $100
there is one family whose consumption expenditure of $65
is less than the consumption expenditures of two families
whose weekly income is only $80.
1.4. STOCHASTIC SPECIFICATION OF PRF
But notice that the average consumption expenditure
of families with a weekly income of $100 is greater than
the average consumption expenditure of families with
a weekly income of $80.
Then, given the income level of Xi, an individual
family’s consumption expenditure is clustered around
the average consumption of all families at that Xi, that
is, around its conditional expectation.
1.4. STOCHASTIC SPECIFICATION OF PRF
Therefore, we can express the deviation of an
individual Yi around its expected value as follows:
ui = Yi − E(Y | Xi)
or
Yi = E(Y | Xi) + ui (3)
where the deviation ui is an unobservable random
variable taking positive or negative values. Technically,
ui is known as the stochastic disturbance or stochastic
error term.
1.4. STOCHASTIC SPECIFICATION OF PRF
How do we interpret (3)? We can say that the
expenditure of an individual family, given its income
level, can be expressed as the sum of two components:
(1) E(Y | Xi): the mean consumption expenditure of all
the families with the same level of income. This
component is known as the systematic, or
deterministic component
(2) ui: the random, or nonsystematic component
1.4. STOCHASTIC SPECIFICATION OF PRF
If E(Y | Xi) is assumed to be linear in Xi, Eq. (3) may be
written as
Yi = E(Y | Xi) + ui
= β1 + β2Xi + ui (4)
Eq. (4) posits that the consumption expenditure of a
family is linearly related to its income plus the
disturbance term
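A small simulation may help to see what Eq. (4) says. The sketch below draws many values of Yi at a few fixed income levels and compares their average with β1 + β2Xi. The values of β1, β2 and the spread of ui, as well as the normal distribution used for ui, are assumptions made only for this illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

BETA1, BETA2 = 17.0, 0.6  # hypothetical parameters, not taken from the slides
SIGMA = 10.0              # hypothetical standard deviation of the disturbance


def draw_y(x):
    """One draw from Yi = beta1 + beta2*Xi + ui, with ui ~ N(0, SIGMA^2) (assumed)."""
    return BETA1 + BETA2 * x + random.gauss(0.0, SIGMA)


n_draws = 20000
sim_means = {}
for x in (80, 140, 200):
    # Average of many individual Y values at the same fixed X
    sim_means[x] = sum(draw_y(x) for _ in range(n_draws)) / n_draws
    print(x, BETA1 + BETA2 * x, round(sim_means[x], 2))
```

Individual draws scatter around the line, but their average at each X stays close to the systematic part β1 + β2Xi.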
1.4. STOCHASTIC SPECIFICATION OF PRF
Yi = E(Y | Xi) + ui
= β1 + β2Xi + ui
Systematic (deterministic) component: β1 + β2Xi
Nonsystematic component: ui
1.4. STOCHASTIC SPECIFICATION OF PRF
Thus, the individual consumption expenditures, given
X = $80, can be expressed as:
Y1 = 55 = β1 + β2(80) + u1
Y2 = 60 = β1 + β2(80) + u2
Y3 = 65 = β1 + β2(80) + u3
Y4 = 70 = β1 + β2(80) + u4
Y5 = 75 = β1 + β2(80) + u5
1.4. STOCHASTIC SPECIFICATION OF PRF
Now if we take the expected value of (3) on both sides, we
obtain:
E(Yi | Xi) = E[E(Y | Xi)] + E(ui | Xi)
= E(Y | Xi) + E(ui | Xi) (5)
Since E(Yi |Xi) is the same thing as E(Y|Xi), Eq. (5) implies
that:
E(ui | Xi) = 0 (6)
Thus, the assumption that the regression line passes
through the conditional means of Y implies that the
conditional mean values of ui (conditional upon the given
X’s) are zero.
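Eq. (3) and Eq. (6) can be verified on the X = $80 subpopulation, whose five consumption values were listed earlier. The sketch below computes the conditional mean of that group and the deviation of each family from it. Variable names are chosen here for illustration.

```python
# Weekly consumption of the five families with X = $80
y_at_80 = [55, 60, 65, 70, 75]

# Conditional mean E(Y | X = 80): the average over this subpopulation
cond_mean = sum(y_at_80) / len(y_at_80)

# Disturbances u_i = Y_i - E(Y | X = 80), as in Eq. (3)
u = [y - cond_mean for y in y_at_80]

print(cond_mean)        # 65.0
print(u)                # [-10.0, -5.0, 0.0, 5.0, 10.0]
print(sum(u) / len(u))  # 0.0, which is E(u | X = 80) as in Eq. (6)
```

The disturbances take both signs and average out to zero within the subpopulation, which is exactly what Eq. (6) states.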
1.4. STOCHASTIC SPECIFICATION OF PRF
The stochastic specification also shows clearly that there
are other variables besides income that affect consumption
expenditure and that an individual family’s consumption
expenditure cannot be fully explained only by the
variable(s) included in the regression model.
1.5. THE SIGNIFICANCE OF THE STOCHASTIC
DISTURBANCE TERM
The disturbance term ui is a surrogate for all those
variables that are omitted from the model but that
collectively affect Y.
Question: Why ui?
✓ Ignorance about other variables affecting Y
✓ Unavailability of data
✓ Randomness in human behavior
✓ We would like to keep our regression model as simple as
possible
✓ Wrong functional form: missing important variables
1.6. The sample regression function
(SRF)
Population vs. Sample
Why do we have to use a sample regression function?
Population Regression Model vs. Sample Regression
Model
POPULATION VS. SAMPLE
Population is the set of entities under study.
Sample is a subset of the population.
Example: Consider the income of all households in
Hanoi.
✓ Population: all households in Hanoi
✓ Sample: households in Dong Da district
POPULATION VS. SAMPLE
Questions of interest:
✓ What is the average income of households in
Hanoi?
✓ How dispersed is income among the
households?
We often want to know the population parameters. There are two options:
✓ Collect information from the entire population of Hanoi
households => calculate the mean and the variance =>
correct answers, or:
✓ Collect a sample and get an “approximation”
Population Regression Function vs.
Sample Regression Function
Why do we have to use a sample regression function?
Difficulties in data collection
To estimate the population regression function
1.6. The sample regression function (SRF)
Pretend that the population of Table 1 was not known
to us and the only information we had was a randomly
selected sample of Y values for the fixed X’s as given in
Tables 4 and 5.
The question is: From the samples of Tables 4 and 5 can
we predict the average weekly consumption
expenditure Y in the population as a whole
corresponding to the chosen X’s? In other words, can
we estimate the PRF from the sample data?
1.6. The sample regression function (SRF)
Table 4. A random sample from the population of Table 1
Table 5. Another random sample from the population of Table 1

    Table 4          Table 5
    Y      X         Y      X
    70     80        55     80
    65     100       88     100
    90     120       90     120
    95     140       80     140
    110    160       118    160
    115    180       120    180
    120    200       145    200
    140    220       135    220
    155    240       145    240
    150    260       175    260
Plotting the data of Tables 4 and 5, we obtain the scattergram given in
Figure 4. In the scattergram two sample regression lines are drawn so as to
“fit” the scatters reasonably well: SRF1 is based on the first sample, and
SRF2 is based on the second sample.
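To see how two samples from the same population lead to two different sample regression lines, the sketch below fits a straight line to each of Tables 4 and 5. It assumes the lines are fitted by least squares, the method named in the outline for the next part; the slides themselves only say the lines are drawn to fit the scatter reasonably well. The function name fit_line is chosen here for illustration.

```python
# Data from Tables 4 and 5 (same fixed X values in both samples)
x = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
y_sample1 = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]   # Table 4
y_sample2 = [55, 88, 90, 80, 118, 120, 145, 135, 145, 175]   # Table 5


def fit_line(x, y):
    """Least-squares intercept and slope for a two-variable regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b1_s1, b2_s1 = fit_line(x, y_sample1)
b1_s2, b2_s2 = fit_line(x, y_sample2)

print(f"SRF1: Y-hat = {b1_s1:.4f} + {b2_s1:.4f} X")  # SRF1: Y-hat = 24.4545 + 0.5091 X
print(f"SRF2: Y-hat = {b1_s2:.4f} + {b2_s2:.4f} X")  # SRF2: Y-hat = 17.1697 + 0.5761 X
```

The two fitted lines differ in both intercept and slope even though both samples come from the same population.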
1.6. The sample regression function (SRF)
We can develop the concept of the sample regression
function (SRF) to represent the sample regression line.
It may be written as:
Ŷi = β̂1 + β̂2Xi (7)
where: β̂1: estimator of β1
β̂2: estimator of β2
Ŷi (Y-hat or Y-cap): estimator of E(Y | Xi)
1.6. The sample regression function (SRF)
Now just as we expressed the PRF in two equivalent
forms, we can express the SRF in its stochastic form as
follows:
Yi = β̂1 + β̂2Xi + ûi (8)
where, in addition to the symbols already defined, ûi
denotes the (sample) residual term.
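The residuals can be computed directly once a sample regression line is available. The sketch below does this for the Table 4 sample, assuming the line is fitted by least squares (an assumption made here; the estimation method is the topic of the next part). Unlike the disturbances ui, these residuals are observable because they use the estimated coefficients.

```python
# Table 4 sample
x = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates of the slope and intercept (assumed fitting method)
slope_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
intercept_hat = y_bar - slope_hat * x_bar

# Fitted values Y-hat_i and residuals u-hat_i = Y_i - Y-hat_i
y_hat = [intercept_hat + slope_hat * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

for xi, yi, yh, r in zip(x, y, y_hat, residuals):
    print(xi, yi, round(yh, 2), round(r, 2))
```

With a least-squares fit that includes an intercept, the residuals add up to zero apart from rounding error, so positive and negative residuals balance within the sample.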
1.6. The sample regression function (SRF)
To sum up, then, we find our primary objective in
regression analysis is to estimate the PRF on the basis
of the SRF because more often than not our analysis is
based upon a single sample from some population.
But because of sampling fluctuations our estimate of
the PRF based on the SRF is at best an approximate
one.
This approximation is shown diagrammatically in
Figure 5
PRF vs. SRF
For X = Xi, we have one (sample) observation Y = Yi. In
terms of the SRF, the observed Yi can be expressed as:
Yi = Ŷi + ûi
and in terms of the PRF, it can be expressed as:
Yi = E(Y | Xi) + ui
Obviously Ŷi overestimates the true E(Y | Xi) for the Xi
shown in Figure 5. By the same token, for any Xi to the left
of the point A, the SRF will underestimate the true
PRF.
PRF vs. SRF
Question: can we devise a rule or a method that will
make this approximation as “close” as possible?
In other words, how should the SRF be constructed so
that β̂1 is as “close” as possible to the true β1 and β̂2 is
as “close” as possible to the true β2 even though we will
never know the true β1 and β2?
The answer to this question will occupy much of our
attention in the next part.
We note here that we can develop procedures that tell
us how to construct the SRF to mirror the PRF as
faithfully as possible.
Population Regression Model vs.
Sample Regression Model
Population regression function (PRF):
✓ a rule or principle that shows the relationship between
the dependent variable and the independent variables
for the whole population of interest.
✓ Drawn using all population observations.
Sample regression function (SRF):
✓ a rule or principle that shows the relationship between
the dependent variable and the independent variables in
a specific sample.
✓ Drawn using observations from a specific sample.
Exercises
1. What is the conditional expectation function or the
population regression function?
2. What is the difference between the population and
sample regression functions? Is this a distinction without
difference?
3. What is the role of the stochastic error term ui in
regression analysis? What is the difference between the
stochastic error term and the residual, ûi?
4. Why do we need regression analysis? Why not simply use
the mean value of the regressand as its best value?
5. What do we mean by a linear regression model?
Exercises
6. Determine whether the following models are linear
in the parameters, or the variables, or both. Which of
these models are linear regression models?
Model                             Descriptive title
Yi = β1 + β2(1/Xi) + ui           Reciprocal
Yi = β1 + β2 ln Xi + ui           Semilogarithmic
ln Yi = β1 + β2Xi + ui            Inverse semilogarithmic
ln Yi = ln β1 + β2 ln Xi + ui     Logarithmic or double logarithmic
ln Yi = β1 − β2(1/Xi) + ui        Logarithmic reciprocal
Exercises
7. Are the following models linear regression models?
Why or why not?
Yi = e^(β1 + β2Xi + ui)
Yi = 1/(1 + e^(β1 + β2Xi + ui))
ln Yi = β1 − β2(1/Xi) + ui
Yi = β1 + (0.75 − β1) e^(−β2(Xi − 2)) + ui
Yi = β1 + β2³Xi + ui