INTRODUCTION TO PANEL DATA ANALYSIS USING EVIEWS
FARIDAH NAJUNA MISMAN, PhD
FINANCE DEPARTMENT
FACULTY OF BUSINESS & MANAGEMENT
UiTM JOHOR
PANEL DATA WORKSHOP, 23 & 24 MAY 2017
OUTLINE
1. Introduction
2. CLRM Assumptions
3. Static Panel Data Models
4. Getting Started with EViews 9
5. Data Analysis
6. Reading The Results
1. INTRODUCTION
There are three types of data structure:
1. Time series data are collected at regular time intervals, such as every
   month or every year. (N=1, t=1,…,T)
  • Usually they represent the values for a single firm or a single variable at different points in time.
  • Most macroeconomic data for real variables, e.g. GDP or consumption, are quarterly time series.
  • Data for monetary variables such as interest rates are often monthly time series.
2. Cross-sectional data are the values for many different firms or
   households collected at a single point in time. (i=1,…,N, T=1)
3. Panel data combine the other two: we have values for all
   members of a panel or group of firms or households, measured at more than one
   period in time. (i=1,…,N, t=1,…,T)
1. INTRODUCTION
• Classical panel data: N>T, also known as a short or micro panel.
• Macro panel: T>N, also known as a long panel.
• Balanced panel: data are available for every cross-section in every period.
         Number of observations: n = NT
• Unbalanced panel: T differs across individuals. (Note: EViews can estimate
  unbalanced panels; the output in Section 6 reports "Total panel (unbalanced)
  observations".)
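
The balanced/unbalanced distinction can be checked mechanically by comparing the number of observations with N × T. The following is a minimal sketch in plain Python; the function name `panel_summary` and the firm/year values are hypothetical and only illustrate the rule n = NT.

```python
def panel_summary(records):
    """Return (N, T, n, balanced) for a list of (unit, period) pairs."""
    units = {unit for unit, _ in records}
    periods = {period for _, period in records}
    n_units, n_periods = len(units), len(periods)
    n_obs = len(set(records))  # distinct unit-period observations
    # A panel is balanced when every unit is observed in every period.
    return n_units, n_periods, n_obs, n_obs == n_units * n_periods


# Hypothetical example: 3 firms observed in 2 years.
balanced = [(firm, year) for firm in ("A", "B", "C") for year in (2015, 2016)]
# Dropping one firm-year makes the panel unbalanced.
unbalanced = balanced[:-1]

print(panel_summary(balanced))
print(panel_summary(unbalanced))
```

In the balanced case n equals NT; once a single firm-year is missing, n falls below NT and the check returns False.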
1. INTRODUCTION
• The choice of econometric model depends on the type of data:
1. Least Squares Regression: normally applied to cross-section data sets (e.g.
   Ordinary Least Squares, OLS).
2. Time-series Model: normally applied to time series data, to uncover long-run
   relations and short-run dynamics.
3. Panel Data Modelling: normally used to capture heterogeneity across samples
   and to obtain a bigger sample size.
     ❖ Static panel data models: POLS, FE, RE, BE
     ❖ Dynamic panel data: GMM
     ❖ Panel unit root and cointegration (macro panel)
1. INTRODUCTION
Advantages & Disadvantages
• Panel data allow us to control for variables we cannot observe or measure, such as:
   ❖ Time-invariant factors like geographical area or firm management characteristics.
   ❖ Variables that change over time but not across entities, like national policies, federal regulation, international
     agreements.
• In other words, panel data can account for individual
  heterogeneity (uniqueness), which results in more efficient estimates.
1. INTRODUCTION
Advantages:
i.    Larger sample size, more variation and less collinearity, which increase the
      precision of estimates
ii.   Ability to study dynamics: repeated cross-sectional observations show adjustment
      over time
iii.  Ability to account for heterogeneity across individuals, often ignored in pooled
      data, so estimates are more robust against misspecification due to omitted variables
Disadvantages:
i.    Data availability/maintenance
ii.   Measurement errors
iii.  Self-selection bias
1. INTRODUCTION
Why Analyse Panel Data?
We are interested in describing change over time:
  • social change, e.g. changing attitudes, behaviours, social relationships
  • individual growth or development, e.g. life-course studies, child development, career trajectories, school achievement
  • occurrence (or non-occurrence) of events
We want superior estimates of trends in social phenomena:
  • Panel models can be used to inform policy, e.g. health, obesity
  • Multiple observations on each unit can provide superior estimates as compared to cross-sectional models of association
We want to estimate causal models:
  • Policy evaluation
  • Estimation of treatment effects
1. INTRODUCTION
What kind of data are required for panel analysis?
Basic panel methods require at least two “waves” of measurement. Consider student
GPAs and job hours during two semesters of college.
One way to organize the panel data is to create a single record for each
combination of unit and time period.
Such a data set includes:
  • A time-invariant unique identifier for each unit (StudentID)
  • A time-varying outcome (GPA)
  • An indicator for time (Semester)
Panel datasets can include other time-varying or time-invariant variables.
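
As an illustration of the "one record per unit and time period" layout, here is a minimal sketch in plain Python. All values are hypothetical, and the `JobHours` field name and the helper `waves_per_unit` are assumptions made for this example rather than part of the workshop data.

```python
# Hypothetical long-format panel: two students observed in two semesters.
records = [
    {"StudentID": 1, "Semester": 1, "GPA": 3.2, "JobHours": 10},
    {"StudentID": 1, "Semester": 2, "GPA": 3.4, "JobHours": 5},
    {"StudentID": 2, "Semester": 1, "GPA": 2.9, "JobHours": 20},
    {"StudentID": 2, "Semester": 2, "GPA": 3.0, "JobHours": 15},
]


def waves_per_unit(rows, id_key="StudentID"):
    """Count how many time periods ("waves") are recorded for each unit."""
    counts = {}
    for row in rows:
        counts[row[id_key]] = counts.get(row[id_key], 0) + 1
    return counts


waves = waves_per_unit(records)
print(waves)
```

Each student appears once per semester, so every unit has the two waves that basic panel methods need.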
2. CLASSICAL LINEAR REGRESSION MODEL (CLRM)
Table taken from page 37 of “Applied Econometrics”, Asteriou & Hall, 2nd ed., 2011, Palgrave Macmillan.
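3. PANEL DATA MODEL: POOLED OLS
• Pooled OLS
    $$ y_{it} = \beta_0 + \beta_{it} X_{it} + \alpha_i + \nu_{it} $$
i.    $\alpha_i$ and $\nu_{it}$ are normally distributed and mutually independent,
ii.   $E(\alpha_i) = E(\nu_{ij}) = 0$, for $i = 1,\dots,m$, $j = 1,2,\dots,m(i)$,
iii.  $E(\alpha_i \alpha_{i'}) = \begin{cases} \sigma_1^2, & i = i' \\ 0, & \text{otherwise,} \end{cases}$
iv.   $E(\nu_{ij} \nu_{i'j'}) = \begin{cases} \sigma_2^2, & i = i',\ j = j' \\ 0, & \text{otherwise.} \end{cases}$
In conditions ii to iv, j indexes the m(i) observations on unit i, i.e. it plays the role of the time index t in the model equation.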
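
To make the idea of pooling concrete, the sketch below stacks all unit-period rows and fits a single intercept and slope by ordinary least squares, ignoring which unit each row belongs to. It is a simplified illustration, not the EViews routine: it assumes one regressor and a common slope, and the data and the function name `pooled_ols` are hypothetical.

```python
def pooled_ols(panel):
    """OLS of y on a single regressor x, pooling all (unit, period, x, y) rows."""
    xs = [x for _, _, x, _ in panel]
    ys = [y for _, _, _, y in panel]
    n = len(panel)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Closed-form simple regression: slope = S_xy / S_xx.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical panel built so that y = 1 + 2x holds exactly for every row.
panel = [
    ("A", 1, 0.0, 1.0),
    ("A", 2, 1.0, 3.0),
    ("B", 1, 2.0, 5.0),
    ("B", 2, 3.0, 7.0),
]

intercept, slope = pooled_ols(panel)
print(intercept, slope)
```

Because the unit label is never used, any unit-specific effect is left in the error term; that is the restriction fixed-effects and random-effects models relax.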
4. GETTING STARTED WITH EViews 9
5. DATA ANALYSIS
DESCRIPTIVE STATISTICS
CORRELATION ANALYSIS
POOLED OLS REGRESSION
NORMALITY TEST
DUMMY VARIABLES
6. READING THE RESULTS
Dependent Variable: CR
Method: Panel Least Squares
Date: 05/23/17 Time: 17:06
Sample (adjusted): 1996 2011
Periods included: 16
Cross-sections included: 17
Total panel (unbalanced) observations: 85

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C              12.83313      2.387841       5.374368      0.0000
FE             -0.160617     0.039199      -4.097434      0.0001
FQ              2.032662     0.380137       5.347179      0.0000
CB              0.362423     0.185213       1.956787      0.0539
CAPR           -0.203388     0.075746      -2.685126      0.0088

R-squared             0.371546    Mean dependent var       6.020596
Adjusted R-squared    0.340123    S.D. dependent var       5.639222
S.E. of regression    4.580898    Akaike info criterion    5.938690
Sum squared resid     1678.770    Schwarz criterion        6.082375
Log likelihood       -247.3943    Hannan-Quinn criter.     5.996484
F-statistic           11.82412    Durbin-Watson stat       0.735389
Prob(F-statistic)     0.000000

Notes on the output:
• Periods included: the number of time periods (T).
• Cross-sections included: the total number of groups (N).
• Total panel observations: n = NT in a balanced panel. This panel is unbalanced, so n = 85 is smaller than NT.
• C: the constant.
• F-statistic: an F test of the null hypothesis that all slope coefficients in the model are jointly zero.
• Prob(F-statistic): if this number is < 0.05, the regressors are jointly significant and the model is ok.
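
The reported F-statistic can be reproduced from R-squared. The sketch below uses the standard relation F = (R²/k) / ((1 − R²)/(n − k − 1)) for a regression with an intercept, taking n = 85 observations and k = 4 slope coefficients (FE, FQ, CB, CAPR) from the output above; the function name `f_from_r2` is hypothetical.

```python
def f_from_r2(r_squared, n_obs, n_slopes):
    """Overall F-statistic implied by R-squared for a model with an intercept."""
    explained = r_squared / n_slopes
    unexplained = (1.0 - r_squared) / (n_obs - n_slopes - 1)
    return explained / unexplained


# Values read from the EViews output: R-squared, observations, slope coefficients.
f_stat = f_from_r2(0.371546, 85, 4)
print(round(f_stat, 4))
```

The result agrees with the F-statistic in the output up to rounding of the reported R-squared.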
Variable      Coefficient    Std. Error    t-Statistic    Prob.
C              12.83313      2.387841       5.374368      0.0000
FE             -0.160617     0.039199      -4.097434      0.0001
FQ              2.032662     0.380137       5.347179      0.0000
CB              0.362423     0.185213       1.956787      0.0539
CAPR           -0.203388     0.075746      -2.685126      0.0088

• Coefficient: the coefficients of the regressors indicate how much Y changes when X increases by one unit.
• t-Statistic: t-values test the null hypothesis that each coefficient is equal to 0. To reject it, the absolute t-value has to be higher than 1.96 (95% confidence level). If this is the case, you can say that the variable has a significant influence on your DV (Y). The higher the absolute value, the higher the relevance of the variable.
• Prob.: two-tail p-values test the same null hypothesis that each coefficient is equal to 0. To reject it, the p-value has to be lower than 0.05 (95%). If this is the case, you can say that the variable has a significant influence on your DV (Y).
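
Each t-statistic is simply the coefficient divided by its standard error, so the 1.96 rule can be applied by hand. The sketch below uses the coefficients and standard errors from the output; the dictionary layout and variable names are only illustrative, and 1.96 is the large-sample cut-off quoted in the notes rather than an exact critical value for this sample size.

```python
# (coefficient, standard error) pairs copied from the EViews output.
estimates = {
    "C": (12.83313, 2.387841),
    "FE": (-0.160617, 0.039199),
    "FQ": (2.032662, 0.380137),
    "CB": (0.362423, 0.185213),
    "CAPR": (-0.203388, 0.075746),
}

# t-statistic = coefficient / standard error.
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}

# Apply the rule of thumb: reject "coefficient = 0" when |t| > 1.96.
significant = {name: abs(t) > 1.96 for name, t in t_stats.items()}

for name in estimates:
    print(name, round(t_stats[name], 3), significant[name])
```

CB falls just short of 1.96, which matches its p-value of 0.0539 being slightly above 0.05; the other coefficients pass the rule.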
               R-squared             0.371546   Mean dependent var         6.020596
               Adjusted R-squared    0.340123   S.D. dependent var         5.639222
               S.E. of regression    4.580898   Akaike info criterion      5.938690
               Sum squared resid     1678.770   Schwarz criterion          6.082375
               Log likelihood       -247.3943   Hannan-Quinn criter.       5.996484
               F-statistic           11.82412   Durbin-Watson stat         0.735389
               Prob(F-statistic)     0.000000
• R-squared shows the amount of variance of Y explained by X.
• Adjusted R-squared shows the same as R-squared but adjusted by the number of cases and number of variables. When the number of variables is small and the number of cases is very large, Adj R-squared is closer to R-squared.
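
The adjustment can be written as Adj R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of cases and k the number of regressors excluding the constant. The sketch below applies it with R² = 0.371546, n = 85 and k = 4 from the output; the function name and the larger sample size of 8500 are hypothetical and only show the gap closing as n grows.

```python
def adjusted_r2(r_squared, n_obs, n_regressors):
    """Adjusted R-squared for a model with an intercept and n_regressors slopes."""
    penalty = (n_obs - 1) / (n_obs - n_regressors - 1)
    return 1.0 - (1.0 - r_squared) * penalty


r2 = 0.371546
adj_reported_sample = adjusted_r2(r2, 85, 4)    # sample size from the output
adj_large_sample = adjusted_r2(r2, 8500, 4)     # hypothetical much larger sample

print(round(adj_reported_sample, 6))
print(round(adj_large_sample, 6))
```

With 85 cases the adjusted value reproduces the figure in the output; with many more cases and the same four variables it moves closer to R-squared.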