
San José State University

Math 261A: Regression Theory & Methods

Generalized Linear Models (GLMs)

Dr. Guangliang Chen


This lecture is based on the following textbook sections:

• Chapter 13: 13.1 – 13.3

Outline of this presentation:

• What is a GLM?

• Logistic regression

• Poisson regression

What is a GLM?
In ordinary linear regression, we assume that the response is a linear
function of the regressors plus Gaussian noise:

y = β0 + β1 x1 + · · · + βk xk + ε,   ε ∼ N(0, σ²)

where the linear form β0 + β1 x1 + · · · + βk xk = x′β, so that y ∼ N(x′β, σ²).

The model can be reformulated in terms of

• distribution of the response: y | x ∼ N (µ, σ 2 ), and

• dependence of the mean on the predictors: µ = E(y | x) = x′β
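In R terms, this reformulation says that lm() fits the same model as glm() with the Gaussian family; a quick sketch (the data frame d and variables y, x1, x2 are placeholder names):

lm(y ~ x1 + x2, data = d)                       # ordinary linear regression
glm(y ~ x1 + x2, data = d, family = gaussian)   # the same fit, viewed as a GLM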


[Figure: a sample from the ordinary linear regression model with β = (1, 2); the responses y and the line β0 + β1 x plotted against x ∈ [0, 1].]


Generalized linear models (GLMs) extend linear regression by allowing the response variable to have

• a general distribution (with mean µ = E(y | x)), and

• a mean that depends on the predictors through a link function g. That is,

g(µ) = β′x,   or equivalently,   µ = g⁻¹(β′x)


In a GLM, the response is typically assumed to have a distribution in the exponential family, a large class of probability distributions whose pdfs have the form f(x | θ) = a(x)b(θ) exp(c(θ) · T(x)), including

• Normal - ordinary linear regression

• Bernoulli - logistic regression, modeling binary data

• Binomial - multinomial logistic regression, modeling general categorical data

• Poisson - Poisson regression, modeling count data

• Exponential, Gamma - survival analysis
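For example, the Bernoulli(p) pmf fits this form:

f(x | p) = p^x (1 − p)^(1−x) = (1 − p) · exp(x · log(p/(1 − p))),   x ∈ {0, 1},

with a(x) = 1, b(p) = 1 − p, c(p) = log(p/(1 − p)) and T(x) = x. Note that c(p) is exactly the logit, which is why the logit is the canonical link for Bernoulli responses.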


In theory, any combination of the response distribution and link function (which relates the mean response to a linear combination of the predictors) specifies a generalized linear model.

Some combinations turn out to be much more useful and mathematically more tractable than others in practice.

Response distribution | Link function g(µ)   | Use
Normal                | Identity: µ          | OLS
Bernoulli             | Logit: log(µ/(1−µ))  | Logistic regression
Poisson               | Log: log(µ)          | Poisson regression
Exponential/Gamma     | Inverse: −1/µ        | Survival analysis
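In R, each of these combinations is selected through the family argument of glm(); a minimal sketch (the data frame d and variables y, x are placeholder names):

glm(y ~ x, data = d, family = gaussian(link = "identity"))  # ordinary linear regression (OLS)
glm(y ~ x, data = d, family = binomial(link = "logit"))     # logistic regression
glm(y ~ x, data = d, family = poisson(link = "log"))        # Poisson regression
glm(y ~ x, data = d, family = Gamma(link = "inverse"))      # Gamma regression (survival-type data)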


Applications:

• Logistic regression: predict the likelihood that a consumer of an online shopping website will buy a specific item (say, a camera) within the next month based on the consumer's purchase history.

• Poisson regression: model the number of children a couple has as a function of their ages, numbers of siblings, income, education levels, etc.

• Exponential: model the survival time (time until death) of patients in a clinical study as a function of disease, age, gender, type of treatment, etc.


Logistic regression
Logistic regression is a GLM that combines the Bernoulli distribution (for
the response) and the logit link function (relating the mean response to
predictors):

log(µ/(1 − µ)) = β′x   (y ∼ Bernoulli(p))

Remark. Since µ = E(y | x) = p, we have

log(p/(1 − p)) = β′x   (y ∼ Bernoulli(p))

where p: probability of success, p/(1 − p): odds, log(p/(1 − p)): log-odds.
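For example, p = 0.8 corresponds to odds 0.8/0.2 = 4 and log-odds log 4 ≈ 1.39, while p = 0.5 corresponds to odds 1 and log-odds 0.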


Solving for µ (and also p), we obtain that


µ = 1/(1 + e^(−β′x)) = s(β′x),   s(z) = 1/(1 + e^(−z)),
where s(·) is the sigmoid function, also called the logistic function.

Properties of the sigmoid function:

• s(0) = 0.5

• 0 < s(z) < 1 for all z

• s(z) monotonically increases as z goes from −∞ to +∞

[Figure: the sigmoid curve µ = s(z) plotted for z ∈ [−4, 4].]
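A short R sketch (not from the slides) that reproduces the plot and checks the first property:

s <- function(z) 1/(1 + exp(-z))                       # the sigmoid (logistic) function
s(0)                                                   # 0.5
curve(s, from = -4, to = 4, xlab = "z", ylab = "mu")   # monotone, bounded between 0 and 1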


For fixed β (model parameter) and each given x (sampled location),

µ = p = s(z),   z = β′x

has the following interpretations:

• mean response: E(y | x, β) = s(z)

• probability of success: P(y = 1 | x, β) = s(z)

Population model: y | x, β ∼ Bernoulli(p = s(β′x))

A sample from the logistic regression model, with p = s(−3 + 2x)

[Figure: binary responses y ∈ {0, 1} plotted against x ∈ [0, 4] for β = (−3, 2).]
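A short R sketch (not from the slides; the sample size and the range of x are illustrative assumptions) that generates such a sample:

set.seed(1)
x <- runif(200, 0, 4)                    # sampled locations (assumed range)
p <- 1/(1 + exp(-(-3 + 2*x)))            # success probabilities s(-3 + 2x)
y <- rbinom(200, size = 1, prob = p)     # binary responses
plot(x, y)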


Parameter estimation via MLE

Given a data set (x1, y1), . . . , (xn, yn), fitting a logistic regression model is equivalent to choosing the value of β such that the mean response

µ = s(β′x)

matches the sample as “closely” as possible.

Mathematically, the best β is usually found by maximizing the likelihood of the sample:

L(β | y1, . . . , yn) = f(y1, . . . , yn | β) = ∏_{i=1}^n f(yi | β)

where f(yi | β) is the probability function of the ith observation:

f(yi | β) = pi^yi (1 − pi)^(1−yi) = pi if yi = 1, and 1 − pi if yi = 0,

and

pi = 1/(1 + e^(−β′xi))
However, there is no closed-form solution, and the optimal β has to be
computed numerically.
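A minimal R sketch of this numerical optimization, reusing the simulated x and y from the sketch above (glm() actually uses iteratively reweighted least squares, but a general-purpose optimizer illustrates the idea):

negloglik <- function(beta) {
  eta <- beta[1] + beta[2]*x             # linear predictor beta'x
  -sum(y*eta - log(1 + exp(eta)))        # negative Bernoulli log-likelihood
}
fit <- optim(c(0, 0), negloglik, method = "BFGS")
fit$par                                  # close to coef(glm(y ~ x, family = binomial))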

Prediction by logistic regression

Once the optimal parameter β̂ is found, the mean response at a new location x0 is

E(y | x0, β̂) = 1/(1 + e^(−β̂′x0))

Note that this would not be our exact prediction at x0 (why?).


To make a prediction at x0 based on the estimates β̂, consider

y0 | x0, β̂ ∼ Bernoulli(p̂0),   p̂0 = 1/(1 + e^(−β̂′x0)).

The prediction at x0 is

ŷ0 = 1 if p̂0 > 0.5, and ŷ0 = 0 if p̂0 < 0.5.


R scripts

x = c(162, 165, 166, 170, 171, 168, 171, 175, 176, 182, 185)
y = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
model <- glm(y ~ x, family = binomial(link = "logit"))

p = model$fitted.values
# p = [0.0168, 0.0708, 0.1114, 0.4795, 0.6026, 0.2537, 0.6026, 0.9176,
#      0.9483, 0.9973, 0.9994]

beta = model$coefficients  # beta = [-84.8331094, 0.4985354]

fitted.prob <- predict(model, data.frame(x = c(168, 170, 173)), type = "response")
# fitted.prob = [0.2537, 0.4795, 0.8043]
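To turn these fitted probabilities into class predictions with the 0.5 cutoff from the prediction rule above (a small addition, not in the original script):

yhat <- as.numeric(fitted.prob > 0.5)    # [0, 0, 1]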


[Figure: the fitted logistic curve p̂ = 1/(1 + exp(−(−84.8331 + 0.4985x))) plotted over x ∈ [160, 190].]


Other models for binary response data

Instead of using the logit link function,

p = 1/(1 + e^(−β′x))

to force the estimated probabilities to lie between 0 and 1:

y | x, β ∼ Bernoulli(p)

one could use

• Probit: p = Φ(β′x), where Φ is the cdf of the standard normal distribution.

• Complementary log-log: p = 1 − exp(−exp(β′x))
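Both alternatives are available in R through the link argument of the binomial family; a quick sketch using the same x and y as in the R scripts above:

probit.model  <- glm(y ~ x, family = binomial(link = "probit"))    # probit regression
cloglog.model <- glm(y ~ x, family = binomial(link = "cloglog"))   # complementary log-log regression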



Poisson regression
Poisson regression is a GLM that combines the Poisson distribution (for the
response) and the log link function (relating mean response to predictors):

log(µ) = β′x   (y ∼ Poisson(λ))

Remark. Since µ = E(y | x) = λ, we have

log λ = β′x,   or   λ = e^(β′x)

That is,

y | x, β ∼ Poisson(λ = e^(β′x))


[Figure: a sample from the Poisson regression model with β = (1, −3), plotted against x ∈ [−1, 0], together with the true model and fitted model curves.]
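A short R sketch (sample size and design are illustrative assumptions, not from the slides) that generates a sample like the one in the figure. The coefficients reported in the R code below come from the author's own sample, so a fresh simulation will give slightly different values:

set.seed(1)
x <- runif(100, -1, 0)             # sampled locations (assumed range)
lambda <- exp(1 - 3*x)             # mean response exp(beta'x) with beta = (1, -3)
y <- rpois(100, lambda)            # Poisson counts
plot(x, y)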


R code

poisson.model <- glm(y ~ x, family = poisson(link = "log"))

poisson.model$coefficients
# (Intercept)           x
#    1.003291   -3.019297


Summary and beyond


We talked about the concept of generalized linear models (GLMs) and two of its special instances:

• Logistic regression: logit link function + Bernoulli distribution

• Poisson regression: log link function + Poisson distribution

Note that parameter estimation for GLMs is done through MLE; prediction is based on the mean (plus some necessary adjustments).

Further learning on logistic and multinomial regression:
http://www.sjsu.edu/faculty/guangliang.chen/Math251F18/lec5logistic.pdf
