Panel Data Analysis Using Fixed & Random Effects
Panel data (also known as longitudinal or cross-sectional time-series data) is a
dataset in which the behaviour of entities is observed across time. These
entities could be states, companies, individuals, countries, etc.
Panel data allows us to control for variables we cannot observe or measure, such as
cultural factors or differences in business practices across companies, or for
variables that change over time but not across entities (e.g. national policies,
federal regulations, international agreements). That is, it accounts for
individual heterogeneity.
With panel data we can include variables at different levels of analysis (e.g.
students, schools, districts, states), which makes it suitable for multilevel or
hierarchical modelling. Some drawbacks are data collection issues (e.g. sampling
design, coverage), non-response in the case of micro panels, and cross-country
dependency in the case of macro panels (i.e. correlation between countries).
In this report we use two techniques to analyse panel data: fixed effects
and random effects.
FIXED-EFFECTS MODEL
Use fixed effects (FE) whenever we are only interested in analysing the impact
of variables that vary over time. FE explores the relationship between predictor
and outcome variables within an entity (country, person, company, etc.). Each
entity has its own individual characteristics that may or may not influence the
predictor variables. For example, being male or female could influence
opinion toward a certain issue, the political system of a particular country
could have some effect on trade or GDP, and the business practices of a company
may influence its stock price.
When using FE we assume that something within the individual may impact or
bias the predictor or outcome variables, and we need to control for this. This is
the rationale behind the assumption of correlation between the entity's error
term and the predictor variables. FE removes the effect of those time-invariant
characteristics so we can assess the predictors' net effect on the outcome
variable.
Another important assumption of the FE model is that those time-invariant
characteristics are unique to the individual and should not be correlated with
other individual characteristics. Each entity is different, therefore the entity's
error term and the constant (which captures individual characteristics) should
not be correlated with the others. If the error terms are correlated, then FE is
not suitable, since inferences may not be correct and you need to model that
relationship (probably using random effects). This is the main rationale for the
Hausman test.
The equation for the general fixed-effects model is:
$Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it}$ [eq.1]
Where:
$\alpha_i$ ($i = 1, \dots, n$) is the unknown intercept for each entity ($n$ entity-specific
intercepts),
$Y_{it}$ is the dependent variable (DV), where $i$ = entity and $t$ = time,
$X_{it}$ represents one independent variable (IV),
$\beta_1$ is the coefficient for that IV,
$u_{it}$ is the error term.
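As a rough illustration of how the slope in eq. 1 can be recovered without estimating the n intercepts directly, the sketch below applies the within (entity-demeaning) transformation in plain Python. The function name within_slope and the small panel are made up for this example; they do not come from the report's data or from any Stata output.

```python
from collections import defaultdict


def within_slope(entities, x, y):
    """Estimate the slope of y on x after subtracting each entity's own means."""
    # Collect the (x, y) observations belonging to each entity.
    groups = defaultdict(list)
    for e, xi, yi in zip(entities, x, y):
        groups[e].append((xi, yi))

    sxy = 0.0  # sum of demeaned x times demeaned y
    sxx = 0.0  # sum of squared demeaned x
    for obs in groups.values():
        x_bar = sum(xi for xi, _ in obs) / len(obs)
        y_bar = sum(yi for _, yi in obs) / len(obs)
        for xi, yi in obs:
            sxy += (xi - x_bar) * (yi - y_bar)
            sxx += (xi - x_bar) ** 2

    if sxx == 0:
        raise ValueError("x does not vary within any entity")
    return sxy / sxx


# Hypothetical panel: two entities, three periods each.
# The true slope is 2; the entity intercepts are 10 (A) and -5 (B).
entities = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]
y = [10 + 2 * v for v in x[:3]] + [-5 + 2 * v for v in x[3:]]

beta_1 = within_slope(entities, x, y)
print(beta_1)
```

Because each entity's intercept is constant over time, it drops out when the entity means are subtracted, which is why the two very different intercepts do not affect the estimated slope.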
Fixed effects: n entity-specific intercepts (using xtreg)
The fixed-effects model controls for all time-invariant differences between the
individuals, so the estimated coefficients of fixed-effects models cannot be
biased because of omitted time-invariant characteristics (like culture, religion,
gender, race, etc.). One side effect of this feature of fixed-effects models is that
they cannot be used to investigate time-invariant causes of the dependent
variable. Technically, time-invariant characteristics of the individuals are
perfectly collinear with the person (or entity) dummies. Substantively, fixed-effects
models are designed to study the causes of changes within a person (or
entity). A time-invariant characteristic cannot cause such a change, because it
is constant for each person.
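The point about time-invariant characteristics can be checked with a small sketch, assuming fixed effects is implemented by subtracting entity means. The variable names female and income and their values are hypothetical.

```python
def demean_by_entity(entities, values):
    """Subtract each entity's own mean from its observations."""
    totals, counts = {}, {}
    for e, v in zip(entities, values):
        totals[e] = totals.get(e, 0.0) + v
        counts[e] = counts.get(e, 0) + 1
    return [v - totals[e] / counts[e] for e, v in zip(entities, values)]


entities = ["A", "A", "A", "B", "B", "B"]
female = [1, 1, 1, 0, 0, 0]                # hypothetical time-invariant dummy
income = [3.0, 4.0, 5.0, 7.0, 7.5, 8.0]    # hypothetical time-varying variable

female_within = demean_by_entity(entities, female)
income_within = demean_by_entity(entities, income)

print(female_within)  # every entry is zero: no within-entity variation is left
print(income_within)  # non-zero entries remain, so a coefficient can be estimated
```

After demeaning, the time-invariant variable carries no variation at all, so a fixed-effects regression has nothing from which to estimate its coefficient.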
RANDOM-EFFECTS MODEL
The rationale behind the random-effects model is that, unlike the fixed-effects
model, the variation across entities is assumed to be random and uncorrelated
with the predictor or independent variables included in the model: "the
crucial distinction between fixed and random effects is whether the
unobserved individual effect embodies elements that are correlated with the
regressors in the model, not whether these effects are stochastic or not"
[Greene, 2008, p.183]. If you have reason to believe that differences across
entities have some influence on your dependent variable, but are uncorrelated
with the predictors, then you should use random effects. An advantage of random
effects is that you can include time-invariant variables (e.g. gender). In the
fixed-effects model these variables are absorbed by the intercept.
The random-effects model is:
$Y_{it} = \beta X_{it} + \alpha + u_{it} + \varepsilon_{it}$
Where:
$u_{it}$ = between-entity error,
$\varepsilon_{it}$ = within-entity error.
Random effects assumes that the entity's error term is not correlated with the
predictors, which allows time-invariant variables to play a role as
explanatory variables. In random effects you need to specify those individual
characteristics that may or may not influence the predictor variables. The
problem with this is that some variables may not be available, therefore leading
to omitted variable bias in the model.
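One way to see how random effects relates to fixed effects is the quasi-demeaning weight used by the standard GLS random-effects estimator. The sketch below assumes a balanced panel and takes the between-entity and within-entity error variances as given inputs; the function name re_theta and the numbers are illustrative only.

```python
import math


def re_theta(sigma2_between, sigma2_within, periods):
    """Fraction of each entity mean subtracted by the random-effects transformation.

    Assumes a balanced panel with `periods` observations per entity.
    """
    if sigma2_within <= 0 or sigma2_between < 0 or periods < 1:
        raise ValueError("variances must be non-negative and periods at least 1")
    return 1.0 - math.sqrt(sigma2_within / (sigma2_within + periods * sigma2_between))


# No between-entity variance: nothing is subtracted, as in pooled OLS.
theta_none = re_theta(0.0, 1.0, 5)
# Very large between-entity variance: almost the full mean is subtracted, close to FE.
theta_large = re_theta(100.0, 1.0, 5)

print(theta_none, theta_large)
```

A weight of 0 leaves the data untouched, while a weight near 1 approaches the full demeaning of fixed effects. This is why time-invariant variables survive in random effects: their entity means are only partly removed.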
Fixed or Random: Hausman test
To decide between fixed or random effects we can run a Hausman test, where
the null hypothesis is that the preferred model is random effects and the
alternative is fixed effects. It basically tests whether the unique errors ($u_i$)
are correlated with the regressors; the null hypothesis is that they are not.
We run a fixed-effects model and save the estimates, then run a random-effects
model and save the estimates, then perform the test.
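To make the comparison behind the test concrete, the sketch below computes the Hausman statistic for the simplest case of a single coefficient, given the two saved estimates and their variances. The function name and the input numbers are hypothetical and are not results from this report. With several coefficients the same idea uses vectors and covariance matrices.

```python
import math


def hausman_one_coef(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and p-value when one coefficient is compared."""
    var_diff = var_fe - var_re
    if var_diff <= 0:
        raise ValueError("var_fe - var_re must be positive for the statistic to be defined")
    h = (b_fe - b_re) ** 2 / var_diff
    # With one coefficient, H is chi-squared with 1 degree of freedom under the null,
    # and the upper-tail probability of that distribution is erfc(sqrt(H / 2)).
    p_value = math.erfc(math.sqrt(h / 2))
    return h, p_value


# Hypothetical estimates, chosen only to illustrate the calculation.
h, p = hausman_one_coef(b_fe=1.0, b_re=0.5, var_fe=0.05, var_re=0.03)
print(h, p)
```

A small p-value leads us to reject the null that random effects is appropriate and to prefer fixed effects; a large one gives no evidence against random effects.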
Testing for Heteroskedasticity:
A test for heteroskedasticity is available for the fixed-effects model using the
Stata command xttest3.
This is a user-written program; to install it, type ssc install xttest3.
The null is homoskedasticity (or constant variance). Above we reject the null
and conclude heteroskedasticity.