San José State University
Math 261A: Regression Theory & Methods
Generalized Linear Models (GLMs)
Dr. Guangliang Chen
This lecture is based on the following textbook sections:
• Chapter 13: 13.1 – 13.3
Outline of this presentation:
• What is a GLM?
• Logistic regression
• Poisson regression
What is a GLM?
In ordinary linear regression, we assume that the response is a linear
function of the regressors plus Gaussian noise:
$$y = \underbrace{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}_{\text{linear form } x'\beta} + \underbrace{\epsilon}_{N(0,\sigma^2)\text{ noise}} \;\sim\; N(x'\beta, \sigma^2)$$
The model can be reformulated in terms of
• the distribution of the response: $y \mid x \sim N(\mu, \sigma^2)$, and
• the dependence of the mean on the predictors: $\mu = E(y \mid x) = x'\beta$
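The two-part formulation above can be checked numerically. The following is a minimal sketch, assuming simulated data with beta = (1, 2) and a noise standard deviation of 0.5 (the sample size, seed, and noise level are illustrative choices, not from the lecture). It fits the same model once with lm and once as a GLM with a Gaussian response and identity link.

```r
# Simulate from y | x ~ N(beta0 + beta1 * x, sigma^2); all settings are assumptions
set.seed(1)
n <- 100
x <- runif(n)                          # predictor values in [0, 1]
y <- 1 + 2 * x + rnorm(n, sd = 0.5)    # linear mean plus Gaussian noise

# Ordinary least squares fit
fit_lm <- lm(y ~ x)

# The same model written as a GLM: Gaussian response, identity link
fit_glm <- glm(y ~ x, family = gaussian(link = "identity"))

print(coef(fit_lm))
print(coef(fit_glm))
```

Both calls should report the same coefficient estimates, since the Gaussian/identity GLM is ordinary linear regression.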
Dr. Guangliang Chen | Mathematics & Statistics, San José State University 3/24
[Figure: linear regression with beta = (1, 2); the line β0 + β1 x plotted as y against x, for x from 0 to 1.]
Generalized linear models (GLMs) extend linear regression by allowing
the response variable to have
• a general distribution (with mean $\mu = E(y \mid x)$) and
• a mean that depends on the predictors through a link function $g$.
That is,
$$g(\mu) = \beta' x$$
or equivalently,
$$\mu = g^{-1}(\beta' x)$$
In a GLM, the response is typically assumed to have a distribution in the
exponential family, which is a large class of probability distributions that
have pdfs (or pmfs) of the form $f(x \mid \theta) = a(x)\, b(\theta) \exp(c(\theta) \cdot T(x))$, including
• Normal - ordinary linear regression
• Bernoulli - logistic regression, modeling binary data
• Multinomial - multinomial logistic regression, modeling general categorical data
• Poisson - Poisson regression, modeling count data
• Exponential, Gamma - survival analysis
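As a quick check of the definition, the Bernoulli and Poisson probability functions can be put in the exponential-family form. This is a worked sketch that writes $y$ for the response (in place of the generic $x$ in the formula) and treats $p$ and $\lambda$ as the parameter $\theta$:

```latex
\begin{align*}
\text{Bernoulli}(p):\quad
  f(y \mid p) &= p^{y}(1-p)^{1-y}
               = (1-p)\,\exp\!\Big(y \log\frac{p}{1-p}\Big), \\
  &\quad a(y) = 1,\quad b(p) = 1-p,\quad c(p) = \log\frac{p}{1-p},\quad T(y) = y; \\[4pt]
\text{Poisson}(\lambda):\quad
  f(y \mid \lambda) &= \frac{\lambda^{y} e^{-\lambda}}{y!}
               = \frac{1}{y!}\, e^{-\lambda}\, \exp\!\big(y \log\lambda\big), \\
  &\quad a(y) = \frac{1}{y!},\quad b(\lambda) = e^{-\lambda},\quad c(\lambda) = \log\lambda,\quad T(y) = y.
\end{align*}
```

In both cases $c(\theta)$ is a familiar function of the mean: the log-odds for the Bernoulli distribution and the logarithm for the Poisson distribution.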
In theory, any combination of the response distribution and link function
(that relates the mean response to a linear combination of the predictors)
specifies a generalized linear model.
Some combinations turn out to be much more useful and mathematically
more tractable than others in practice.
Response distribution   Link function   g(µ)             Use
Normal                  Identity        µ                OLS
Bernoulli               Logit           log(µ/(1−µ))     Logistic regression
Poisson                 Log             log(µ)           Poisson regression
Exponential/Gamma       Inverse         −1/µ             Survival analysis
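In R, each response distribution and link pair is represented by a family object, which stores the link g and its inverse. The sketch below inspects the default families; the value mu = 0.25 is an arbitrary illustration. Note that R's "inverse" link for the Gamma family is 1/µ, which differs from the −1/µ in the table only by sign.

```r
# Default link for each family object in base R
fams <- list(gaussian(), binomial(), poisson(), Gamma())
for (f in fams) cat(f$family, ":", f$link, "\n")

# A family object carries both g (linkfun) and g^{-1} (linkinv)
mu <- 0.25                             # arbitrary mean in (0, 1)
eta <- binomial()$linkfun(mu)          # logit: log(mu / (1 - mu))
mu_back <- binomial()$linkinv(eta)     # inverse logit recovers mu

print(eta)
print(mu_back)
```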
Applications:
• Logistic regression: Predicting the likelihood that a consumer of an
online shopping website will buy a specific item (say, a camera)
within the next month, based on the consumer’s purchase history.
• Poisson regression: Modeling the number of children a couple has
as a function of their ages, numbers of siblings, income, education
levels, etc.
• Exponential regression: Modeling the survival time (time until death) of
patients in a clinical study as a function of disease, age, gender, type
of treatment, etc.
Logistic regression
Logistic regression is a GLM that combines the Bernoulli distribution (for
the response) and the logit link function (relating the mean response to
predictors):
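$$\log\frac{\mu}{1-\mu} = \beta' x \qquad (y \sim \mathrm{Bernoulli}(p))$$
Remark. Since $\mu = E(y \mid x) = p$, we have
$$\log\frac{p}{1-p} = \beta' x \qquad (y \sim \mathrm{Bernoulli}(p))$$
where $p$ is the probability of success, $\frac{p}{1-p}$ is the odds, and $\log\frac{p}{1-p}$ is the log-odds.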
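A small numeric illustration of these three quantities, assuming a success probability of p = 0.8 (an arbitrary value chosen for the example). Base R's qlogis and plogis are the logit function and its inverse.

```r
p <- 0.8                      # assumed probability of success
odds <- p / (1 - p)           # odds of success: 0.8 / 0.2
log_odds <- log(odds)         # log-odds (the logit of p)

print(odds)
print(log_odds)
print(qlogis(p))              # built-in logit, same value as log_odds
print(plogis(log_odds))       # inverse logit maps the log-odds back to p
```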
Solving for $\mu$ (and also $p$), we obtain
$$\mu = \frac{1}{1 + e^{-\beta' x}} = s(\beta' x), \qquad s(z) = \frac{1}{1 + e^{-z}},$$
where $s(\cdot)$ is the sigmoid function, also called the logistic function.
Properties of the sigmoid function:
• $s(0) = 0.5$
• $0 < s(z) < 1$ for all $z$
• $s(z)$ monotonically increases as $z$ goes from $-\infty$ to $+\infty$
[Figure: the sigmoid curve mu = s(z), plotted for z from −4 to 4.]
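The listed properties can be verified numerically on a grid. This is a minimal sketch; the function name s follows the lecture's notation, and the grid from −4 to 4 mirrors the plotted range.

```r
# Sigmoid (logistic) function as defined in the lecture
s <- function(z) 1 / (1 + exp(-z))

z <- seq(-4, 4, by = 0.5)     # grid over the plotted range
sz <- s(z)

print(s(0))                   # value at zero
print(range(sz))              # stays strictly between 0 and 1
print(all(diff(sz) > 0))      # strictly increasing along the grid
print(all.equal(sz, plogis(z)))  # agrees with base R's plogis
```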
For fixed $\beta$ (model parameter) and each given $x$ (sampled location),
$$\mu = p = s(z), \qquad z = \beta' x$$
has the following interpretations:
• mean response: $E(y \mid x, \beta) = s(z)$
• probability of success: $P(y = 1 \mid x, \beta) = s(z)$
Population model:
$$y \mid x, \beta \sim \mathrm{Bernoulli}(p = s(\beta' x))$$
[Figure: the sigmoid curve mu = s(z), plotted for z from −4 to 4.]
A sample from the logistic regression model, with p = s(−3 + 2x)
[Figure: a binary sample with beta = (−3, 2), plotted as y (0 or 1) against x, for x from 0 to 4.]
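A sample like this one could be generated as follows. This is a sketch under assumptions: the sample size, the seed, and the uniform design for x on [0, 4] are illustrative choices, while the success probability p = s(−3 + 2x) is the one stated above.

```r
set.seed(1)                           # assumed seed, for reproducibility
n <- 50                               # assumed sample size
x <- runif(n, min = 0, max = 4)       # assumed design: uniform on [0, 4]

p <- plogis(-3 + 2 * x)               # success probability s(-3 + 2x)
y <- rbinom(n, size = 1, prob = p)    # one Bernoulli(p_i) draw per observation

print(head(data.frame(x = x, p = p, y = y)))
```

Small values of x mostly produce y = 0 and large values mostly y = 1, with a mix of both near x = 1.5, where p = 0.5.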
Parameter estimation via MLE
Given a data set $(x_1, y_1), \ldots, (x_n, y_n)$, fitting a logistic regression model is
equivalent to choosing the value of $\beta$ such that the mean response
$$\mu = s(\beta' x)$$
matches the sample as “closely” as possible.
[Figure: the binary sample plotted as y against x, for x from 0 to 4.]
Mathematically, the best β is usually found by maximizing the likelihood
of the sample:
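$$L(\beta \mid y_1, \ldots, y_n) = f(y_1, \ldots, y_n \mid \beta) = \prod_{i=1}^{n} f(y_i \mid \beta)$$
where $f(y_i \mid \beta)$ is the probability function of the $i$th observation:
$$f(y_i \mid \beta) = p_i^{y_i} (1 - p_i)^{1 - y_i} = \begin{cases} p_i, & y_i = 1 \\ 1 - p_i, & y_i = 0 \end{cases}$$
and
$$p_i = \frac{1}{1 + e^{-\beta' x_i}}$$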
However, there is no closed-form solution, and the optimal β has to be
computed numerically.
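One way to carry out the numerical maximization is to minimize the negative log-likelihood with a general-purpose optimizer. The sketch below uses simulated data (an assumption: n = 200, x uniform on [0, 4], true beta = (−3, 2)) and base R's optim with the BFGS method; this is not necessarily the algorithm glm uses internally, so the glm fit is shown only for comparison.

```r
set.seed(1)
n <- 200                                   # assumed sample size
x <- runif(n, 0, 4)
y <- rbinom(n, size = 1, prob = plogis(-3 + 2 * x))

# Negative log-likelihood, using log L = sum(y_i * eta_i - log(1 + exp(eta_i)))
negloglik <- function(beta) {
  eta <- beta[1] + beta[2] * x
  -sum(y * eta - log1p(exp(eta)))
}

# Gradient of the negative log-likelihood: -sum((y_i - p_i) * (1, x_i))
neggrad <- function(beta) {
  p <- plogis(beta[1] + beta[2] * x)
  -c(sum(y - p), sum((y - p) * x))
}

fit_optim <- optim(c(0, 0), negloglik, neggrad, method = "BFGS")
fit_glm <- glm(y ~ x, family = binomial(link = "logit"))

beta_optim <- fit_optim$par                # numerical maximizer from optim
beta_glm <- unname(coef(fit_glm))          # estimate reported by glm
print(rbind(optim = beta_optim, glm = beta_glm))
```

The two rows should agree to several decimal places, since both procedures maximize the same likelihood.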
Prediction by logistic regression
Once the optimal parameter $\hat\beta$ is found, the mean response at a new
location $x_0$ is
$$E(y \mid x_0, \hat\beta) = \frac{1}{1 + e^{-\hat\beta' x_0}}$$
Note that this would not be our exact prediction at $x_0$ (why?).
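To make a prediction at $x_0$ based on the estimate $\hat\beta$, consider
$$y_0 \mid x_0, \hat\beta \sim \mathrm{Bernoulli}(\hat p_0), \qquad \hat p_0 = \frac{1}{1 + e^{-\hat\beta' x_0}}.$$
The prediction at $x_0$ is
$$\hat y_0 = \begin{cases} 1, & \text{if } \hat p_0 > 0.5 \\ 0, & \text{if } \hat p_0 < 0.5 \end{cases}$$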
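The decision rule amounts to thresholding the estimated probability at 0.5. A minimal sketch, using hypothetical estimated probabilities (the values in p_hat are made up for illustration):

```r
# Hypothetical estimated success probabilities at three new locations
p_hat <- c(0.25, 0.48, 0.80)

# Predict 1 when p_hat exceeds 0.5 and 0 otherwise
# (the rule above leaves the tie p_hat = 0.5 unspecified; here it maps to 0)
y_hat <- as.integer(p_hat > 0.5)
print(y_hat)
```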
R scripts
x <- c(162, 165, 166, 170, 171, 168, 171, 175, 176, 182, 185)
y <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
# Fit a logistic regression of y on x (Bernoulli response, logit link)
model <- glm(y ~ x, family = binomial(link = 'logit'))
p <- model$fitted.values
# p = [0.0168, 0.0708, 0.1114, 0.4795, 0.6026, 0.2537, 0.6026, 0.9176,
#      0.9483, 0.9973, 0.9994]
beta <- model$coefficients  # beta = [-84.8331094, 0.4985354]
# Estimated success probabilities at three new x values
fitted.prob <- predict(model, data.frame(x = c(168, 170, 173)), type = 'response')
# fitted.prob = [0.2537, 0.4795, 0.8043]
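The probabilities returned by predict can be reproduced by hand from the fitted coefficients, which makes the link between the estimate and the sigmoid formula explicit. The sketch below refits the model on the data above; the names x0, b, p_manual, and p_predict are introduced here for illustration.

```r
x <- c(162, 165, 166, 170, 171, 168, 171, 175, 176, 182, 185)
y <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
model <- glm(y ~ x, family = binomial(link = "logit"))

x0 <- c(168, 170, 173)                     # new locations, as in the script above
b <- unname(coef(model))                   # (intercept, slope)

# Apply the sigmoid to the linear form b0 + b1 * x0
p_manual <- 1 / (1 + exp(-(b[1] + b[2] * x0)))

# The same quantity computed by predict()
p_predict <- unname(predict(model, data.frame(x = x0), type = "response"))

print(round(cbind(x0, p_manual, p_predict), 4))
```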
[Figure: the fitted curve p = 1/(1 + exp(84.8331 − 0.4985 x)), plotted as p against x, for x from 160 to 190.]
Other models for binary response data
Instead of using the logit link function,
$$p = \frac{1}{1 + e^{-\beta' x}},$$
to force the estimated probabilities in the model
$$y \mid x, \beta \sim \mathrm{Bernoulli}(p)$$
to lie between 0 and 1, one could use
• Probit: $p = \Phi(\beta' x)$, where $\Phi$ is the cdf of the standard normal distribution.
• Complementary log-log: $p = 1 - \exp(-\exp(\beta' x))$
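In R, these alternatives are selected through the link argument of the binomial family. The sketch below refits the small data set from the earlier R script with each link and checks the two inverse-link formulas at an arbitrary value eta = 0.3 (an illustrative choice).

```r
x <- c(162, 165, 166, 170, 171, 168, 171, 175, 176, 182, 185)
y <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)

# Same Bernoulli response, different link functions
fit_probit <- glm(y ~ x, family = binomial(link = "probit"))
fit_cloglog <- glm(y ~ x, family = binomial(link = "cloglog"))

# Inverse links evaluated at an arbitrary linear-predictor value
eta <- 0.3
p_probit <- binomial(link = "probit")$linkinv(eta)    # Phi(eta)
p_cloglog <- binomial(link = "cloglog")$linkinv(eta)  # 1 - exp(-exp(eta))

# Fitted probabilities under each link
print(round(cbind(x, probit = fitted(fit_probit), cloglog = fitted(fit_cloglog)), 4))
```

The coefficient estimates are not directly comparable across links, since each link puts the linear predictor on a different scale.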
Poisson regression
Poisson regression is a GLM that combines the Poisson distribution (for the
response) and the log link function (relating mean response to predictors):
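$$\log(\mu) = \beta' x \qquad (y \sim \mathrm{Poisson}(\lambda))$$
Remark. Since $\mu = E(y \mid x) = \lambda$, we have
$$\log \lambda = \beta' x \quad \text{or} \quad \lambda = e^{\beta' x}$$
That is,
$$y \mid x, \beta \sim \mathrm{Poisson}(\lambda = e^{\beta' x})$$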
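Because the link is the logarithm, the coefficients act multiplicatively on the mean: increasing x by one unit multiplies λ by exp(β1). A short sketch, assuming a single predictor and beta = (1, −3) (the value used in this lecture's Poisson example); the helper name lambda is introduced here for illustration.

```r
beta <- c(1, -3)                                  # assumed (intercept, slope)
lambda <- function(x) exp(beta[1] + beta[2] * x)  # mean response exp(beta' x)

print(lambda(-1))                  # mean at x = -1, equal to exp(4)
print(lambda(0))                   # mean at x = 0, equal to exp(1)
print(lambda(0) / lambda(-1))      # one-unit increase in x scales the mean by exp(-3)
```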
[Figure: Poisson regression with beta = (1, −3), showing the sample, the true model, and the fitted model, plotted as y against x, for x from −1 to 0.]
R code
# Fit a Poisson regression of y on x (log link)
poisson.model <- glm(y ~ x, family = poisson(link = 'log'))
poisson.model$coefficients
# (Intercept)           x
#    1.003291   -3.019297
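The code above assumes that x and y already exist. A self-contained sketch of how such a sample might be simulated and fitted is given below; the sample size, seed, and uniform design on [−1, 0] are assumptions, so the estimates will not match the output above exactly.

```r
set.seed(1)                          # assumed seed
n <- 500                             # assumed sample size
x <- runif(n, min = -1, max = 0)     # assumed design: uniform on [-1, 0]

lambda <- exp(1 - 3 * x)             # true mean with beta = (1, -3)
y <- rpois(n, lambda)                # Poisson counts with mean lambda_i

sim.model <- glm(y ~ x, family = poisson(link = "log"))
beta_hat <- coef(sim.model)
print(beta_hat)                      # should be close to (1, -3)
```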
Summary and beyond
We talked about the concept of generalized linear models and two of its
special instances:
• Logistic regression: logit link function + Bernoulli distribution
• Poisson regression: log link function + Poisson distribution
Note that parameter estimation for GLMs is done through MLE; prediction is
based on the mean (plus some necessary adjustments).
Further learning on logistic and multinomial regression:
http://www.sjsu.edu/faculty/guangliang.chen/Math251F18/lec5logistic.pdf