
Qualitative Response Regression Models


Lesson 2 Qualitative response regression models

2.1 Introduction

The econometrics content you have covered so far (mostly undergraduate courses) most likely
dealt with a quantitative dependent variable, and with explanatory variables that were either
quantitative or qualitative (dummy), or a mixture of the two. Although this lesson will recap some
of these concepts, the aim is to further introduce you to several models in which the dependent
variable itself is qualitative by nature.

This type of analysis is often used in the social sciences, as well as in medical research.
Be aware, however, that although qualitative response regression models offer interesting
estimation options, they also pose some interpretation challenges. This means that although
some of the estimation techniques might seem straightforward, one should always tread
carefully when analysing or interpreting the results.

In developing an informed and critical understanding of qualitative information regression
models, you should pay attention to

• the definition and application of binary (or dummy) variables in models
• the definition and application of linear probability models (LPM)
• the definition and application of logit and probit models

Textbook chapters for you to study

To cover the content related to qualitative information regression models sufficiently,
chapters from both Wooldridge (your textbook) and Gujarati (Basic Econometrics) are
recommended. The chapter from Gujarati covering qualitative response regression models (i.e.
Chapter 15) is available for you to download under the Additional Resources tab.

The content/terms covered in this lesson are as follows (note that only the sections mentioned
in brackets below are for examination purposes):

Wooldridge:
- dummy variables (7-1 to 7-4)
- linear probability models (LPM) (7-5)
- logit and probit models (17-1)

Gujarati:
- linear probability models (LPM) (15.1 to 15.4)
- logit models (15.5 to 15.8)

2.2 Analysis of variance (ANOVA) models

Types of variables

ratio scale
interval scale
ordinal scale
nominal scale

Definitions

Qualitative information can be captured in an econometric model by using a binary variable
(also referred to as a zero-one or a dummy variable). Using dummy variables essentially entails
classifying data into mutually exclusive categories such as female and male, employed and
unemployed, people living in urban or rural areas, etc. Dummy variables do not have a natural
scale of measurement, and are thus described as nominal scale variables (for further
explanation, see Wooldridge: 7-1).

Caution: Dummy variable trap!
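
The trap arises when a model includes an intercept together with a dummy for every category, because the dummies then add up exactly to the intercept column. The following is a minimal sketch of that perfect collinearity in plain Python; the six-person sample and all variable names are hypothetical and serve only as an illustration.

```python
# Hypothetical sample of six people: 1 = female, 0 = male.
female = [1, 0, 1, 1, 0, 0]
male = [1 - f for f in female]
intercept = [1] * len(female)

# The trap: for every observation the two dummies sum to the intercept column.
perfectly_collinear = all(f + m == c for f, m, c in zip(female, male, intercept))


def cross_product_matrix(columns):
    """Return X'X for a design matrix supplied as a list of columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


xtx_trap = cross_product_matrix([intercept, female, male])  # intercept + both dummies
xtx_ok = cross_product_matrix([intercept, female])          # one dummy dropped

print(perfectly_collinear)
print(det3(xtx_trap))  # a zero determinant means X'X cannot be inverted
print(det2(xtx_ok))    # non-zero, so OLS has a unique solution
```

Dropping one dummy (or the intercept) removes the exact linear dependence, which is why a variable with m categories is represented by only m - 1 dummies when the model has an intercept.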

Intercept dummies, slope dummies and interaction effects: These topics are covered in detail
in Wooldridge (2020), sections 7-2 to 7-4. Also note the difference in interpretation between
specifying equations in levels and in logs.

Also see the recommended reading (lesson 2.9) dealing with ‘using dummy variables’.

Seasonal analysis

How can dummy variables be used to “deseasonalise” data?
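
One common approach, sketched here under simplifying assumptions, is to regress the series on a full set of seasonal dummies and keep the residuals, adding back the overall mean to restore the level. With one dummy per quarter and no intercept, the OLS coefficients are simply the quarterly means, so the sketch computes those directly. The quarterly sales figures are hypothetical, and the series is assumed to have no trend.

```python
from statistics import mean

# Hypothetical quarterly sales for two years (Q1..Q4, then Q1..Q4 again).
sales = [10.0, 14.0, 20.0, 12.0, 12.0, 16.0, 22.0, 14.0]
quarter = [i % 4 for i in range(len(sales))]  # 0 = Q1, ..., 3 = Q4

# OLS on four quarterly dummies (no intercept) returns each quarter's mean.
seasonal_mean = {
    q: mean(y for y, s in zip(sales, quarter) if s == q) for q in range(4)
}
overall_mean = mean(sales)

# Residual from the dummy regression plus the overall mean.
deseasonalised = [y - seasonal_mean[s] + overall_mean for y, s in zip(sales, quarter)]

print(seasonal_mean)
print(deseasonalised)
```

In this toy series the only movement left after the adjustment is the difference in level between the two years.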

2.3 Activity 2.1: Dummies, ANOVA and ANCOVA models

Re-do examples 7.1, 7.5 and 7.10 in the textbook.

The data is available as an EViews file: wage1.wf1 under Additional
Resources/Activities/Activity data.

Also note the recommended reading, including various examples of ‘using dummy variables’,
in lesson 2.9.
2.4 Linear probability models (LPM)

Terminology

binary or dichotomous
trichotomous
polychotomous or multiple category
weighted least squares (WLS)
response probability
uncentered R-Squared
treatment group

In a model where Y is quantitative, our objective is to estimate its expected, or mean, value
given the values of the explanatory variables. However, in models where Y is qualitative, our
objective is to find the probability of something happening; thus, qualitative response regression
models are often known as probability models.

Read sections 7-5 to 7-7 in the textbook and take special note of the various potential problems
of LPM, including

- non-normality of the disturbances
- heteroscedastic variances of the disturbances
- predictions of less than zero or greater than one
- questionable measures of goodness of fit (R-squared readings)
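
Two of these problems can be seen in a small sketch that fits an LPM by OLS to a hypothetical binary outcome. The data are made up purely for illustration (Y switches from 0 to 1 at X = 10), and the closed-form formulas for a simple regression are used instead of a statistics package.

```python
# Hypothetical data: X runs from 0 to 19 and Y is 1 once X reaches 10.
x = list(range(20))
y = [1 if xi >= 10 else 0 for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept for a simple regression of Y on X.
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

# Fitted values are the estimated probabilities P(Y = 1 | X).
fitted = [intercept + slope * xi for xi in x]

# Problem 1: some "probabilities" fall below 0 or above 1.
out_of_range = [round(p, 3) for p in fitted if p < 0 or p > 1]

# Problem 2: the disturbance variance P(1 - P) changes with X (heteroscedasticity)
# and even turns negative where the fitted value is outside the 0-1 interval.
implied_variance = [p * (1 - p) for p in fitted]

print(round(intercept, 4), round(slope, 4))
print(out_of_range)
print(min(implied_variance) < 0)
```

The negative implied variances also show why a WLS correction based on the fitted LPM probabilities cannot use observations whose fitted values lie outside the 0-1 interval.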

2.5 Activity 2.2: LPM models

Re-do example 7.12 in the textbook.

The data is available as an Excel file: Activity 2.2 data.xls under Additional
Resources/Activities/Activity data.

Hint: Create a new “unstructured/undated” EViews workfile with 2725 observations. Copy/paste
the data from Excel into EViews. (When pasting into EViews remember to leave space in the
top row for the names of the variables.)

2.6 The logit model

Terminology

cumulative distribution function
cumulative logistic distribution function
odds ratio
logit
maximum likelihood, ML estimation (only the basics)
pseudo R²
count R²
likelihood ratio (LR) statistic
From our economic studies and from observing human behaviour, we know that the “real world”
is very rarely straightforward and that human behaviour can almost never be modelled using a
linear (straight-line) approach. Thus, building on the LPM, Gujarati (2011:553) notes
that what we need is a (probability) model with two special features:

- As $X_i$ increases, $P_i = E(Y = 1 \mid X)$ increases but never steps outside the 0–1 interval, and
- the relationship between $P_i$ and $X_i$ is non-linear (a typical S-shaped curve).

To model the S-shaped curve, one can use the cumulative distribution function (CDF) of a
random variable, whose graph closely resembles this shape. The logit model uses the
cumulative logistic distribution function.
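
For reference, the standard logistic specification can be written out as follows. This is a sketch for a single regressor, and the symbols $Z_i$, $\beta_1$, $\beta_2$ and $L_i$ are assumed here in line with common textbook notation; they are not defined elsewhere in this lesson.

```latex
% Cumulative logistic distribution function
P_i = \frac{1}{1 + e^{-Z_i}}, \qquad Z_i = \beta_1 + \beta_2 X_i

% Odds ratio in favour of the event Y = 1
\frac{P_i}{1 - P_i} = e^{Z_i}

% Logit: the log of the odds ratio, linear in X_i and in the parameters
L_i = \ln\!\left(\frac{P_i}{1 - P_i}\right) = Z_i = \beta_1 + \beta_2 X_i
```

As $Z_i$ ranges from $-\infty$ to $+\infty$, $P_i$ stays between 0 and 1 and is non-linear in $X_i$, while the logit $L_i$ is linear in $X_i$.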

NB: Make sure to note the confirmation of the above special features as well as other features
of the logit model listed in Gujarati (2011:555).

Note the difference in application between the two types of data:

- data at the individual or micro level (estimated by maximum likelihood (ML); ML estimation techniques, however, fall outside the scope of this course and are not for examination purposes)
- grouped or replicated data (estimated by weighted least squares)
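
For grouped data, the weighted least squares idea can be sketched as follows: compute the relative frequency in each group, take its logit, and weight each group by $N_i \hat{P}_i (1 - \hat{P}_i)$, the inverse of the approximate variance of the logit. The four income groups below are hypothetical, and the closed-form WLS formulas for one regressor stand in for what EViews or Excel would compute.

```python
import math

# Hypothetical grouped data: income level, families per group, home owners per group.
income = [10, 20, 30, 40]
group_size = [50, 60, 40, 50]
owners = [10, 24, 24, 40]

# Relative frequencies estimate the probability of ownership in each group.
p_hat = [n_i / N_i for n_i, N_i in zip(owners, group_size)]

# Empirical logit: log of the odds in each group.
logit = [math.log(p / (1 - p)) for p in p_hat]

# WLS weights: the inverse of the approximate variance of each logit.
weight = [N_i * p * (1 - p) for N_i, p in zip(group_size, p_hat)]

# Closed-form weighted least squares for logit = b1 + b2 * income.
total_weight = sum(weight)
x_bar = sum(w * x for w, x in zip(weight, income)) / total_weight
l_bar = sum(w * l for w, l in zip(weight, logit)) / total_weight
slope = (sum(w * (x - x_bar) * (l - l_bar) for w, x, l in zip(weight, income, logit))
         / sum(w * (x - x_bar) ** 2 for w, x in zip(weight, income)))
intercept = l_bar - slope * x_bar


def predicted_probability(x):
    """Convert the fitted logit at income x back into a probability."""
    return 1 / (1 + math.exp(-(intercept + slope * x)))


print(round(intercept, 4), round(slope, 4))
print(round(predicted_probability(25), 4))
```

Minimising the weighted sum of squares in this way is equivalent to multiplying every variable (including the constant) by the square root of the weight and running OLS through the origin on the transformed data.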

For examination purposes, please note that only the work on pages 541–566 should be
studied. For logit modelling, you should focus on grouped or replicated data. You are
welcome to look at the various other models (e.g. probit, tobit, etc.) dealt with in the last part of
Chapter 15 (i.e. p. 566 to the end); however, they do not form part of the official study material.

2.7 Activity 2.3 The logit model for ungrouped or individual data

Notes on tables 15.7–15.8:

The data (Excel format, Table 15_7.xlsx) is available under myUnisa/Additional
Resources/Activities/Activity data.

Create a new workfile in EViews: unstructured with 32 observations.

Select the variables (starting with Grade, …), right-click and select “open as equation”. In the
equation estimation tab, go to the dropdown menu and select “BINARY”. Under the binary
estimation method, select “logit”.
Your output should look as follows (i.e. exactly the same result as provided in table 15.8):

Dependent Variable: GRADE
Method: ML - Binary Logit (Quadratic hill climbing)
Date: 05/19/14 Time: 13:39
Sample: 1 32
Included observations: 32
Convergence achieved after 5 iterations
Covariance matrix computed using second derivatives

Variable    Coefficient    Std. Error    z-Statistic    Prob.

C           -13.02135      4.931324      -2.640537      0.0083
GPA           2.826113     1.262941       2.237723      0.0252
TUCE          0.095158     0.141554       0.672235      0.5014
PSI           2.378688     1.064564       2.234424      0.0255

McFadden R-squared       0.374038    Mean dependent var       0.343750
S.D. dependent var       0.482559    S.E. of regression       0.384716
Akaike info criterion    1.055602    Sum squared resid        4.144171
Schwarz criterion        1.238819    Log likelihood          -12.88963
Hannan-Quinn criter.     1.116333    Deviance                 25.77927
Restr. deviance          41.18346    Restr. log likelihood   -20.59173
LR statistic             15.40419    Avg. log likelihood     -0.402801
Prob (LR statistic)      0.001502

Obs with Dep=0           21          Total obs                32
Obs with Dep=1           11
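
Several of the summary statistics in this output follow directly from the two log-likelihood values, and the slope coefficients can be turned into odds ratios by taking antilogs. The sketch below recomputes them from the rounded figures reported above, so small rounding differences are to be expected. The closed-form chi-square tail probability applies to exactly 3 degrees of freedom (the three slope coefficients), and the helper name is hypothetical.

```python
import math

# Values copied from the EViews output above.
log_lik = -12.88963          # Log likelihood of the fitted model
restr_log_lik = -20.59173    # Restr. log likelihood (intercept only)
coef = {"GPA": 2.826113, "TUCE": 0.095158, "PSI": 2.378688}

# McFadden (pseudo) R-squared: 1 - lnL_unrestricted / lnL_restricted.
mcfadden_r2 = 1 - log_lik / restr_log_lik

# Likelihood ratio statistic: 2 * (lnL_unrestricted - lnL_restricted).
lr_stat = 2 * (log_lik - restr_log_lik)


def chi2_sf_3df(value):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(value / 2))
            + math.sqrt(2 * value / math.pi) * math.exp(-value / 2))


lr_p_value = chi2_sf_3df(lr_stat)

# Antilog of each slope: the factor by which the odds of GRADE = 1 are multiplied
# for a one-unit increase in that variable, holding the others constant.
odds_ratio = {name: math.exp(b) for name, b in coef.items()}

print(round(mcfadden_r2, 6))
print(round(lr_stat, 4))
print(round(lr_p_value, 6))
print({name: round(value, 2) for name, value in odds_ratio.items()})
```

The antilog for PSI is roughly 10.8, which says that the odds of GRADE = 1 for students with PSI = 1 are more than ten times the odds for otherwise comparable students with PSI = 0.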

2.8 Activity 2.4 The Grouped Logit (Glogit) model

See Gujarati, Section 15.7 (pages 558–561):

Redo the exercise, using Excel and EViews where applicable. You should be able to reproduce
the table provided on pages 559–560, and ultimately the graph on page 561.

The data and steps (Excel/EViews files) are available under myUnisa/Additional
Resources/Activities/Activity data Table 15.5 solution.

2.9 Recommended reading

Using dummy variables:

Link to selections from ECS3706 (Econometrics), study unit 7 (to be posted on
myUnisa/Additional Resources).

How do I interpret odds ratios in logistic regression? [Online resource]:
http://www.ats.ucla.edu/stat/mult_pkg/faq/general/odds_ratio.htm
