Bayesian Hierarchical Analysis On Crash Prediction Models
http://pubsindex.trb.org/view.aspx?id=844461
Huang et al. 1
Helai Huang *
Research Fellow
Department of Civil Engineering
National University of Singapore
Engineering Drive 2,
Singapore, 117576
Tel: 65 6516 2255
Email: huanghelai@nus.edu.sg
* Corresponding author
ABSTRACT
Traditional crash prediction models, such as generalized linear regression models, are
incapable of taking into account the multilevel data structure, which extensively exists in
crash data. Disregarding the possible within-group correlations can lead to the production of
models giving unreliable and biased estimates of unknowns. This study innovatively proposes
a 5 × T -level hierarchy, viz. (Geographic region level – Traffic site level – Traffic crash level
– Driver-vehicle unit level – Vehicle-occupant level) × Time level, to establish a general form
of multilevel data structure in traffic safety analysis. To properly model the potential cross-
group heterogeneity due to the multilevel data structure, a framework of Bayesian hierarchical
models that explicitly specify multilevel structure and correctly yield parameter estimates is
introduced and recommended. The proposed method is illustrated in an individual-severity
analysis of intersection crashes using the Singapore crash records. The study demonstrates the
importance of accounting for within-group correlations, as well as the flexibility and
effectiveness of the Bayesian hierarchical method in modeling the multilevel structure of
traffic crash data.
INTRODUCTION
The crash prediction model (also called the safety performance function) is one of the most important
techniques for investigating the relationship between crash occurrence and the risk factors
associated with various traffic entities. These risk factors are assumed to provide information
on the behavior of crash occurrence, which is commonly measured by crash frequency
at various degrees of crash severity. Appropriate probabilistic forms and statistically
significant factors are identified by examining the crash occurrence mechanism and
the model's fit to historical crash data. The typical structure of such models
can be expressed in the following general form:
\[
Y \mid \theta \sim \mathrm{Dist}(\theta), \quad \text{with } \theta = f(X, \beta, \varepsilon) \tag{1}
\]
relationship of crash count and exposure, a number of selected road segments may be nested
in several areas of interest (e.g. cities). Moreover, for each selected road segment, there may
be several observations from different time periods. In this case, some cross-group
heterogeneities, either observed or unobserved, may exist due to spatial and temporal effects
of crash data. Indeed, some characteristic variations may necessarily exist between different
areas or between road segments.
For instance, suppose the data in the above example are collected from four different
areas, in each of which a number of road segments are involved in the study. Cases (b)-(f)
in Figure 2 illustrate various potential relationships between crash count and exposure. As
discussed previously, if a standard GLM is fitted to the aggregate dataset (Case (a)), the
area-level context to which the road segments belong is completely ignored: the same
single straight-line relationship is held to exist everywhere. In effect the model has explained
“everything in general and nothing in particular”. However, given the different features
of the areas, such as road density, the crash count/exposure
relationship may vary. One possible result, shown in Case (b) of Figure 2, is the varying-intercept
pattern. Here each of the four areas has its own crash count/exposure relation, represented by a
separate line. The single thicker line represents the general relationship across all four areas.
The parallel lines imply that, while the slope of the crash count/exposure relation is the same in each area,
some areas have uniformly higher crash frequency than others. In Cases (c) and (d), the
situations are more complicated, as the steepness of the lines varies from area to area. In Case
(c), areas make very little difference for the relatively low exposure
roads, but there is a high degree of between-area variation in crash count for higher exposure
roads. In contrast, Case (d) shows that large area-specific differentials exist for road segments
with lower exposure. In Case (e), there is a complex interaction between crash count and
exposure. In some areas it is the lower exposure roads that have relatively high crash
frequency, whereas in others it is the higher exposure roads. While the final plot, Case (f), is
unlikely to occur in the present example, it can be expected for some other risk
factors. Across all the areas there is no relationship between the crash count and the risk factor
(the single thicker line is horizontal), but within specific areas there are distinctive relationships.
This situation is similar to Case (c), but here the differences result from some areas having
high crash frequency for high values of the risk factor, while others have their lowest
frequency there.
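The contrast between a pooled line and area-specific lines can be reproduced with a small numerical sketch. The example below is only an illustration under stated assumptions: the four areas, their intercepts and slopes, and the exposure grid are all made-up values, and ordinary least squares on noise-free data stands in for a GLM. It shows that in a Case (b)-like setting the pooled line recovers the common slope but hides the intercept differences, and that in a Case (f)-like setting the pooled slope is zero although every area has a distinct slope.

```python
# Hypothetical illustration: four areas, each with its own straight-line
# crash count/exposure relation. All numbers are made up for this sketch.

def ols(xs, ys):
    """Return (intercept, slope) of the least-squares straight line."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

exposure = [1.0, 2.0, 3.0, 4.0, 5.0]   # same exposure grid in every area

def pooled_and_area_fits(area_lines):
    """area_lines: list of (intercept, slope) pairs, one per area."""
    xs, ys, area_fits = [], [], []
    for a, b in area_lines:
        y_area = [a + b * x for x in exposure]
        area_fits.append(ols(exposure, y_area))   # area-specific line
        xs.extend(exposure)
        ys.extend(y_area)
    return ols(xs, ys), area_fits                # pooled line ignores areas

# Case (b)-like pattern: varying intercepts, common slope
case_b_pooled, case_b_areas = pooled_and_area_fits(
    [(2.0, 1.5), (4.0, 1.5), (6.0, 1.5), (8.0, 1.5)])

# Case (f)-like pattern: area slopes cancel, so the pooled line is flat
case_f_pooled, case_f_areas = pooled_and_area_fits(
    [(5.0, 1.0), (11.0, -1.0), (6.5, 0.5), (9.5, -0.5)])

print("Case (b) pooled (intercept, slope):", case_b_pooled)
print("Case (f) pooled (intercept, slope):", case_f_pooled)
print("Case (f) area slopes:", [round(s, 6) for _, s in case_f_areas])
```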
The cross-area variations in slopes and intercepts could be caused by various area-specific
heterogeneities. For observable heterogeneities, it is theoretically possible to
identify and then account for them using classical techniques such as GLM with
interaction terms, ANOVA, or ANCOVA. But a traffic crash is a complex event
involving a large number of factors. Ideally, all of the relevant factors at the different levels
(e.g. road segment level and area level) should be considered in the model. In practice,
however, some of the factors may not be available or even collectable. A model may
therefore consider only the most important factors and omit the others, thereby assuming that similar groups
(i.e. those with the same selected observable factors) have the same pattern of crash occurrence. In the
real world, however, similar groups (e.g. areas) may differ in the omitted factors and thus
may have different means. These unobservable or omitted heterogeneities introduce additional
variance into the data and cause over-dispersion. Consequently, unless the cross-group
heterogeneities are appropriately accounted for, the standard errors of the
regression coefficients may be underestimated.
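The underestimation of standard errors can be illustrated by simulation. The sketch below is not the paper's analysis; it assumes a simple linear model with one area-level covariate, a normally distributed omitted area effect, and hypothetical sizes (10 areas, 20 segments each). It compares the standard error reported by a pooled least-squares fit, which treats all observations as independent, with the actual spread of the slope estimate across repeated datasets.

```python
import math
import random
import statistics

random.seed(1)

K, M = 10, 20                        # hypothetical: 10 areas, 20 segments each
SIGMA_AREA, SIGMA_SEG = 1.0, 1.0     # assumed SDs of area and segment disturbances
x_area = [float(k) for k in range(K)]   # one area-level covariate

def one_replication():
    """Simulate one dataset; return the pooled slope and its naive SE."""
    xs, ys = [], []
    for k in range(K):
        u = random.gauss(0.0, SIGMA_AREA)      # omitted area-level heterogeneity
        for _ in range(M):
            xs.append(x_area[k])
            ys.append(0.5 * x_area[k] + u + random.gauss(0.0, SIGMA_SEG))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    naive_se = math.sqrt(rss / (n - 2) / sxx)   # assumes independent residuals
    return slope, naive_se

results = [one_replication() for _ in range(300)]
empirical_sd = statistics.stdev([s for s, _ in results])
mean_naive_se = statistics.mean([se for _, se in results])

print("spread of slope estimates across datasets:", round(empirical_sd, 3))
print("average naive standard error:            ", round(mean_naive_se, 3))
```

Under these assumptions the naive standard error is several times smaller than the true sampling variability, because observations within an area share the same omitted disturbance.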
The patterns shown in the above example arise throughout traffic safety
studies, since most crash datasets are collected with an inherent multilevel structure. For
example, in predicting crash severities, it is reasonable to assume that the characteristics of the
vehicles within which casualties are traveling will affect their probability of survival. If this is
the case, then casualties within the same vehicle would tend to have more similar severity
than casualties in different vehicles, rendering the assumption of residual independence
invalid. The same argument may be extended to encompass the effect of similarities between
different crashes, traffic sites, or geographical regions.
both individual and contextual levels. The two-entity-level design can be naturally extended to
reflect a three-entity-level data structure, for example, Geographic region level – Traffic site
level – Traffic crash level. Moreover, when time series are considered, a panel data design or a
repeated cross-sectional design can be used. In a panel data design, a set of sites within
regions is pre-selected and measured repeatedly over time, whereas a
repeated cross-sectional design considers a number of time periods, in each of which the selected
sites may be different.
The previous sections show that an appropriate method is needed to account for the multilevel
data structure in traffic safety analysis. In this section, a methodological framework based on
Bayesian hierarchical modeling is established to properly model the potential heterogeneities
arising from the multilevel data structure. Several advantages of this method give it considerable
potential for wide application in traffic safety research.
Hierarchical Models
To model the multilevel data structure, several potential solutions have been proposed in the
literature. For example, some researchers have employed artificial intelligence (AI) models
in crash prediction, most commonly neural networks (NN) and Bayesian NN (20-23).
However, NNs have been criticized as black boxes that cannot generate explicit
functional relationships or statistically interpretable results. Another useful technique for
correlated data is generalized estimating equations (GEE), which is regarded as
an extension of GLM (24, 25). GEE is also called a ‘marginal’ model, as distinguished from a
‘subject-specific’ model such as the hierarchical model in this paper. When dealing with a
multilevel data structure, GEE aims to provide estimates with acceptable properties only for
the fixed parameters in the model, while treating any random parameters as a
necessary ‘nuisance’. Hence, GEE may be preferable only when the exact
form of the multilevel data structure is unknown.
Another way to address the multilevel data structure explicitly is to use hierarchical
models (also called multilevel models or random effect models). Hierarchical modeling is a
statistical technique that allows multilevel data structures to be properly specified and
estimated (see 26). Although the basic theory of hierarchical models has been developed
and discussed for many years, only recently have many practical limitations on the use of
hierarchical analysis been overcome. Hierarchical models have now become
commonplace in research in a variety of other disciplines such as sociology, education,
political science, and public health. In the first
application of the hierarchical model to traffic crash studies, Shankar et al. (27) showed that introducing
site-specific random effects and time indicators into the NB regression model can significantly
improve the explanatory power of crash models. Jones and Jorgensen (28) presented a good
exploration and discussion of the potential applications of hierarchical models. Since then,
the hierarchical modeling technique has been gaining increasing attention as a way of
accounting for the multilevel data structure in crash prediction. For example, while some
researchers (29-32) employed hierarchical models for predicting crash frequency, others
(28, 33, 34) developed hierarchical models to identify factors affecting crash severity.
As defined by Gelman and Hill (26), a multilevel/hierarchical model is a regression (a
linear or generalized linear model) in which the parameters – the regression coefficients – are
given a probability model. Hence, this higher-level model has parameters of its own – the
hyperparameters of the model – which are also estimated from the data. In the context of
GLM, hierarchical modeling (also called hierarchical GLM) operates mainly on the
link function: disturbance terms corresponding to different sources of
variation in the multilevel data are added to the model.
Specifically, recall the general expression of statistical modeling in Equation (1). While
the first part of the expression, $\mathrm{Dist}(\theta)$, still represents the distributional characteristics of the crash
feature of interest, it is the disturbance term $\varepsilon$ that distinguishes hierarchical modeling from
classical statistical models. Note that $\varepsilon$ here is a general notation for
the disturbances. It may consist of several components, some of which act
on the intercept and others on the slopes in the link function.
A two-level hierarchical model is used to show mathematically how the method
works on multilevel data. As in most applications, a basic linear combination of $X$ and $\beta$ is
assumed to simplify the interpretation. Furthermore, the covariate vector $X$ is divided into
three components, $(1, X^{L1}, X^{L2})$, which represent, respectively, the factors associated with the
intercept, the individual level (level 1), and the group level (level 2). Correspondingly, $\beta$ and $\varepsilon$ are
also divided into components that serve different functions, with bold symbols
representing vectors or matrices. Hence, the link function becomes a combination of models at
the two levels,
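Level 1 model:
\[
f^{-1}(\theta) = \beta_0^{L1} + X^{L1}\beta^{L1} + \varepsilon^{L1} \tag{2}
\]
Level 2 model:
\[
\begin{aligned}
\beta_0^{L1} &= \beta_{00}^{L2} + X^{L2}\beta_0^{L2} + \varepsilon_0^{L2} \\
\beta^{L1} &= \beta_{01}^{L2} + X^{L2}\beta_1^{L2} + \varepsilon_1^{L2}
\end{aligned}
\]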
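The combined model is obtained by substituting the Level 2 model into the Level 1 model,
\[
f^{-1}(\theta) = \left(\beta_{00}^{L2} + X^{L1}\beta_{01}^{L2} + X^{L2}\beta_0^{L2} + X^{L1}X^{L2}\beta_1^{L2}\right) + \left(\varepsilon^{L1} + \varepsilon_0^{L2} + X^{L1}\varepsilon_1^{L2}\right) \tag{3}
\]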
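The substitution behind Equation (3) can be written out step by step. The worked derivation below uses only the symbols of the Level 1 and Level 2 models and introduces no additional assumptions: the Level 2 expressions replace the Level 1 intercept and slope, and the terms are then grouped into those without disturbances and those with disturbances.

```latex
\begin{align*}
f^{-1}(\theta)
  &= \beta_0^{L1} + X^{L1}\beta^{L1} + \varepsilon^{L1} \\
  &= \left(\beta_{00}^{L2} + X^{L2}\beta_0^{L2} + \varepsilon_0^{L2}\right)
     + X^{L1}\left(\beta_{01}^{L2} + X^{L2}\beta_1^{L2} + \varepsilon_1^{L2}\right)
     + \varepsilon^{L1} \\
  &= \underbrace{\beta_{00}^{L2} + X^{L1}\beta_{01}^{L2} + X^{L2}\beta_0^{L2}
     + X^{L1}X^{L2}\beta_1^{L2}}_{\text{fixed part}}
     + \underbrace{\varepsilon^{L1} + \varepsilon_0^{L2}
     + X^{L1}\varepsilon_1^{L2}}_{\text{random part}}
\end{align*}
```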
It is clear that the link function now consists of two parts: a fixed part and a random part.
The fixed part is a deterministic relationship that depends fully on the covariates $X$, while the
random part is stochastically determined by a number of disturbance terms. The components
of the two parts are interpreted as follows.
Fixed part:
1) $\beta_{00}^{L2}$: The intercept, which is the main effect when all covariates equal zero. If all
covariates are centered on their means, this term represents the main effect at the average values of the
covariates.
2) $X^{L1}\beta_{01}^{L2}$: $\beta_{01}^{L2}$ is the mean of the main-effect coefficient of the level 1 covariates $X^{L1}$ on the
dependent variable.
3) $X^{L2}\beta_0^{L2}$: $\beta_0^{L2}$ is the main-effect coefficient of the level 2 covariates $X^{L2}$ on the dependent
variable.
4) $X^{L1}X^{L2}\beta_1^{L2}$: $\beta_1^{L2}$ is the interaction-effect coefficient of $X^{L1}$ and $X^{L2}$. This component makes
it possible to understand in depth how contextual factors (level 2 covariates) could affect the
individual factors (level 1 covariates).
Random part:
The terms $\varepsilon_0^{L2}$ and $\varepsilon_1^{L2}$ are the distinctive features of hierarchical models, whereas all of
the remaining components can be included and estimated in classical models. It is precisely these two
stochastic terms that make it possible to account for the unobservable or omitted heterogeneity
in the Level 2 model.
In the framework of hierarchical modeling, the two-level model shown in Equation (3)
is also called a varying-intercept and varying-slope model, as defined by Gelman and Hill
(26). This full model can be simplified by allowing only the
intercept or only the slope to vary, resulting in the varying-intercept model and the varying-slope
model.
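Varying-intercept model:
\[
f^{-1}(\theta) = \left(\beta_{00}^{L2} + X^{L1}\beta_{01}^{L2} + X^{L2}\beta_0^{L2} + X^{L1}X^{L2}\beta_1^{L2}\right) + \left(\varepsilon^{L1} + \varepsilon_0^{L2}\right) \tag{4}
\]
Varying-slope model:
\[
f^{-1}(\theta) = \left(\beta_{00}^{L2} + X^{L1}\beta_{01}^{L2} + X^{L2}\beta_0^{L2} + X^{L1}X^{L2}\beta_1^{L2}\right) + \left(\varepsilon^{L1} + X^{L1}\varepsilon_1^{L2}\right) \tag{5}
\]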
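As a numerical check, Equations (4) and (5) can be treated as special cases of Equation (3) obtained by setting one disturbance term to zero. The sketch below assumes scalar covariates for simplicity; the function names and all coefficient and disturbance values are hypothetical and chosen only for illustration.

```python
def level2_coefficients(x2, b00, b0, b01, b1, e0, e1):
    """Level 2 model: intercept and slope of the Level 1 model for one group."""
    intercept_l1 = b00 + x2 * b0 + e0
    slope_l1 = b01 + x2 * b1 + e1
    return intercept_l1, slope_l1

def link_two_stage(x1, x2, b00, b0, b01, b1, e_l1, e0, e1):
    """Evaluate the link function level by level (Equation (2) plus Level 2)."""
    intercept_l1, slope_l1 = level2_coefficients(x2, b00, b0, b01, b1, e0, e1)
    return intercept_l1 + x1 * slope_l1 + e_l1

def link_combined(x1, x2, b00, b0, b01, b1, e_l1, e0=0.0, e1=0.0):
    """Combined form (Equation (3)): fixed part plus random part."""
    fixed = b00 + x1 * b01 + x2 * b0 + x1 * x2 * b1
    random_part = e_l1 + e0 + x1 * e1
    return fixed + random_part

params = dict(b00=-1.0, b0=0.3, b01=0.8, b1=-0.2)   # hypothetical coefficients

# Full model: both the intercept and the slope vary across groups
full = link_combined(2.0, 1.5, e_l1=0.1, e0=0.4, e1=-0.05, **params)
# Varying-intercept model (Equation (4)): slope disturbance left at zero
varying_intercept = link_combined(2.0, 1.5, e_l1=0.1, e0=0.4, **params)
# Varying-slope model (Equation (5)): intercept disturbance left at zero
varying_slope = link_combined(2.0, 1.5, e_l1=0.1, e1=-0.05, **params)

print(full, varying_intercept, varying_slope)
```

The two-stage and combined evaluations agree for any inputs, which is the content of the substitution that leads to Equation (3).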
Clearly, all of these models could be expanded to accommodate more complicated
designs. The above derivation also shows that hierarchical modeling provides a rather
flexible technique to suit various study purposes and different extents of model
1) BI can accumulate evidence from any information source relevant to crash prediction. A
special property of crash prediction models among traffic safety problems is
that the data are difficult to collect and become available gradually over time, e.g. year by
year. Furthermore, the prediction models themselves may change
as influential factors change, e.g. the installation of red light
cameras or the adjustment of the amber interval. This means that, to keep the models valid,
they need to be updated periodically as new data arrive. The Bayesian approach
provides a flexible and reliable means of meeting this updating requirement. In the
Bayesian context, the previous model, engineering experience, or justified previous
findings can be used as prior knowledge for the updated model (37).
2) Missing data occur very commonly in crash records. In the Bayesian method, missing data are
automatically modeled as latent variables in a manner that takes into account the
information contained in the other observed data.
3) Bayesian posterior distributions for parameters are valid for any sample size.
One of the most important strengths of BI is its capability to handle small data sets. The
extensive application of the empirical Bayesian approach in observational before-after studies
of safety treatment evaluation is a good supporting example (Hauer,
1997).
4) Regarding model comparison, frequentist hypothesis tests require that only two models
be compared, and these models must be nested. In a Bayesian setting, any number of
non-nested models may be compared.
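The periodic updating described in item 1) can be sketched with a conjugate model. The example below is an assumption made for illustration and is not the model used in this paper: yearly crash counts at a site are taken to be Poisson with an unknown rate, and the rate is given a Gamma prior, so each year's posterior becomes the next year's prior. The prior parameters and counts are hypothetical.

```python
def update_gamma_poisson(shape, rate, crash_count, exposure=1.0):
    """Gamma(shape, rate) posterior for a Poisson crash rate after one period."""
    return shape + crash_count, rate + exposure

prior = (2.0, 1.0)            # hypothetical prior: mean of 2 crashes per year
yearly_counts = [3, 1, 4]     # hypothetical counts arriving year by year

# Sequential updating: last year's posterior is this year's prior
posterior = prior
for count in yearly_counts:
    posterior = update_gamma_poisson(*posterior, count)

# Updating once with all data gives the same posterior
batch = (prior[0] + sum(yearly_counts), prior[1] + len(yearly_counts))
posterior_mean = posterior[0] / posterior[1]

print("sequential:", posterior, "batch:", batch, "mean rate:", posterior_mean)
```

In this conjugate setting, sequential and one-shot updating coincide, which is why earlier results can be carried forward as prior knowledge without refitting on the full history.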
The general computing approach for BI is the Markov chain Monte Carlo (MCMC)
method. MCMC is a general method based on drawing values of unknowns from
approximate distributions and then correcting those draws to better approximate the target
posterior distribution. The Gibbs sampler and the Metropolis-Hastings algorithm are the most
widely used simulation algorithms in MCMC. The BUGS modeling language (Bayesian inference
Using Gibbs Sampling) is a prevailing tool for computation with MCMC algorithms
for all sorts of Bayesian models, including most of the hierarchical models in applied use.
The WinBUGS package (37) provides a flexible and simplified platform for modeling with
BUGS programs. In particular, since specification of the full conditional densities is not
necessary in WinBUGS, small changes in program code can achieve a wide variation in
modeling options, thus facilitating sensitivity analysis of modeling and prior assumptions. In the
following, a case study is summarized to illustrate the proposed Bayesian hierarchical method
with regard to model development, calibration and evaluation.
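The idea of drawing from an approximate distribution and then correcting the draws can be illustrated with a random-walk Metropolis sampler, a special case of the Metropolis-Hastings algorithm. The sketch below is not WinBUGS code and is not the sampler used in the case study. It assumes a toy problem (a normal mean with known variance and a vague normal prior, with made-up data) chosen because the exact posterior mean is available for comparison.

```python
import math
import random

random.seed(42)

y = [1.2, 0.8, 1.5, 0.9, 1.1]   # hypothetical observations
SIGMA = 1.0                     # assumed known observation SD
PRIOR_SD = 10.0                 # vague Normal(0, 10^2) prior on the mean

def log_posterior(mu):
    """Unnormalized log posterior of the mean mu."""
    log_prior = -0.5 * (mu / PRIOR_SD) ** 2
    log_lik = -0.5 * sum(((v - mu) / SIGMA) ** 2 for v in y)
    return log_prior + log_lik

def metropolis(n_iter, step_sd, start):
    """Random-walk Metropolis: propose a move, accept it or keep the current value."""
    draws, current, current_lp = [], start, log_posterior(start)
    for _ in range(n_iter):
        proposal = current + random.gauss(0.0, step_sd)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if random.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        draws.append(current)
    return draws

draws = metropolis(n_iter=20000, step_sd=1.0, start=0.0)
kept = draws[2000:]             # discard burn-in
mcmc_mean = sum(kept) / len(kept)

# Exact posterior mean for this conjugate toy problem
precision = len(y) / SIGMA ** 2 + 1 / PRIOR_SD ** 2
exact_mean = (sum(y) / SIGMA ** 2) / precision

print("MCMC estimate:", round(mcmc_mean, 3), "exact:", round(exact_mean, 3))
```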
ILLUSTRATIVE EXAMPLE
In this example, a two-entity-level multilevel design (Individual severity ~ Traffic crash level
– Driver-vehicle unit level) was used to investigate the individual severity of driver
injury and vehicle damage in intersection crashes (see 34 for details of this study). A total of
4095 crashes occurring at signalized intersections during 2003-2005 were extracted and used
in the model. These crashes involved 7840 driver-vehicle units, an average
of 1.91 driver-vehicle units per crash.
To yield a net effect estimate of each potential factor on individual severity, a binary
dependent variable is defined by combining the severity of driver injury and vehicle damage.
In particular, for the $i$th driver-vehicle unit involved in the $j$th crash, high severity ($Y_{ij} = 1$) is
defined when the driver was killed or seriously injured or the vehicle was extensively damaged,
and low severity ($Y_{ij} = 0$) otherwise. Ten covariates at the Traffic crash
level (level 2) and five at the Driver-vehicle unit level (level 1) are employed to explain the
severity variations. The factors included in the model are listed in Table 1.
(Insert Table 1 here)
A preliminary examination of potential within-crash correlation in the collected data
set identified a significant correlation between individuals involved in the same multi-vehicle
crash. In particular, in a multi-vehicle crash, if one driver-vehicle unit was
in the high severity state, then the others had a 31% probability of also being in the high
severity state. On the other hand, if a driver-vehicle unit was in the low severity state, then the others
had only a 12% chance of being in the high severity state. This significantly lower ratio implies
that the individual severities in a multi-vehicle crash may be correlated. Hence,
a hierarchical Binomial Logistic model (HBL) may be more appropriate for modeling the data
than the ordinary Binomial Logistic model (BL).
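A preliminary check of this kind can be computed directly from unit-level records. The sketch below is one plausible way to do it and is not necessarily the procedure used in the study: for every ordered pair of units in the same multi-vehicle crash, it tallies how often the other unit is in the high severity state, conditional on the state of the given unit. The function name and the toy records are hypothetical and do not come from the Singapore data.

```python
def prob_other_high(crashes):
    """crashes: dict mapping crash id -> list of 0/1 severities of its units.

    Returns P(another unit in the same crash is high severity | this unit's state).
    """
    # given state -> [number of ordered pairs, pairs where the other unit is high]
    counts = {1: [0, 0], 0: [0, 0]}
    for severities in crashes.values():
        if len(severities) < 2:
            continue                      # single-vehicle crashes carry no pairs
        for i, given in enumerate(severities):
            for j, other in enumerate(severities):
                if i != j:
                    counts[given][0] += 1
                    counts[given][1] += other
    return {state: c[1] / c[0] for state, c in counts.items()}

# Hypothetical toy records: 1 = high severity, 0 = low severity
toy_crashes = {
    "A": [1, 1],
    "B": [1, 0],
    "C": [0, 0, 0],
    "D": [0, 1],
    "E": [1],          # single-vehicle crash, ignored
}

probs = prob_other_high(toy_crashes)
print("P(other high | this high):", probs[1])
print("P(other high | this low): ", probs[0])
```

A clear gap between the two conditional proportions, as with the 31% and 12% reported above, suggests that severities within a crash are not independent.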
In the model specification, only the varying-intercept model was investigated, to avoid
excess complexity given the large set of covariates used. Specifically, the probability of $Y_{ij} = 1$ is
denoted by $\pi_{ij} = \Pr(Y_{ij} = 1)$, and the HBL model with varying intercept can be expressed
as
\[
\mathrm{Logit}(\pi_{ij}) = \log\frac{\pi_{ij}}{1 - \pi_{ij}} = \beta_0 + X_i^{L1}\beta_1 + X_j^{L2}\beta_2 + \varepsilon_j \tag{6}
\]
where $X_i^{L1}$ is the vector of five covariates at the Driver-vehicle unit level (level 1), while $X_j^{L2}$
is the vector of ten covariates at the Traffic crash level (level 2). $\beta_0$, $\beta_1$ and $\beta_2$ are the
regression coefficients to be estimated. $\varepsilon_j$ is the disturbance term at the crash level (level 2),
which introduces a random effect on the model intercept for different crashes. As a result,
driver-vehicle units (level 1) belonging to the same group (level 2) share the same variance
component, which results in a within-group covariance. $\varepsilon_j$ is assumed to follow a normal
distribution with mean zero and variance $\tau_0^2$. The variance of the outcome ($Y_{ij}$) therefore consists
of two components: the variance of $\varepsilon_j$ ($\tau_0^2$), which captures the between-crash variability
(level 2), and the variance associated with the logistic distribution, which captures the within-crash
variability (level 1).
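How a shared crash-level disturbance induces within-crash correlation can be seen in a small simulation of Equation (6). The sketch below is a simplified illustration under stated assumptions: covariates are dropped so that only an intercept (set to zero) and $\varepsilon_j$ remain, every crash has exactly two units, and the function name, sample size and seed are hypothetical. It compares how often the two units in a crash share the same severity with and without the crash-level random effect.

```python
import math
import random

def simulate_agreement(tau2, n_crashes=20000, beta0=0.0, seed=7):
    """Share of two-unit crashes whose units have the same simulated severity."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_crashes):
        eps_j = rng.gauss(0.0, math.sqrt(tau2))          # crash-level disturbance
        pi = 1.0 / (1.0 + math.exp(-(beta0 + eps_j)))    # inverse logit
        y1 = rng.random() < pi                           # unit 1 severity
        y2 = rng.random() < pi                           # unit 2 severity
        agree += (y1 == y2)
    return agree / n_crashes

agree_hier = simulate_agreement(1.34)    # intercept variance as in Equation (7)
agree_indep = simulate_agreement(0.0)    # no crash-level random effect

print("agreement with random intercept:   ", round(agree_hier, 3))
print("agreement without random intercept:", round(agree_indep, 3))
```

With no random effect the two units agree about half the time, as expected for independent outcomes with probability 0.5, while the shared disturbance raises the agreement rate noticeably.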
In the absence of reliable informative priors, uninformative priors were employed:
Normal(0, 1000) distributions for all regression coefficients ($\beta_0$, $\beta_1$ and $\beta_2$), and an
Inverse-Gamma(0.001, 0.001) distribution for the variance $\tau_0^2$. In the model calibration using the
WinBUGS package, three chains of 20,000 iterations each produced trace plots with a good
degree of mixing, and the Brooks, Gelman and Rubin convergence diagnostics indicated
convergence.
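The logic of a multi-chain convergence check can be sketched as follows. The code below implements the basic variance-based potential scale reduction factor of Gelman and Rubin, which is an assumption made for simplicity: it is not the exact Brooks-Gelman-Rubin variant reported by WinBUGS. The chains are simulated placeholders, not output from the case-study model.

```python
import math
import random
import statistics

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.mean(c) for c in chains]
    within = statistics.mean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return math.sqrt(pooled / within)

rng = random.Random(0)

# Three hypothetical chains sampling the same distribution (well mixed)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# Three hypothetical chains stuck around different values (not converged)
stuck = [[rng.gauss(shift, 1.0) for _ in range(2000)] for shift in (0.0, 3.0, 6.0)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)

print("well-mixed chains:", round(rhat_mixed, 3))
print("separated chains: ", round(rhat_stuck, 3))
```

Values close to 1 indicate that between-chain and within-chain variability agree, which is the pattern one looks for alongside well-mixed trace plots.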
To check model adequacy, the normality assumption for $\varepsilon_j$ was assessed. In the
MCMC simulation, 200 random effects $\varepsilon_j$ were randomly sampled, and their average was
confirmed to be very close to zero. Normal probability plots, which revealed no strong
abnormalities, also supported the normality and exchangeability assumptions.
\[
\rho = \frac{\tau_0^2}{\tau_0^2 + \pi^2/3} = \frac{1.34}{1.34 + \pi^2/3} = 28.9\% \tag{7}
\]
The ICC is an indicator of the magnitude of the within-crash correlation. A value of $\rho$ close to
zero means that there is very little variation between crashes, whereas a
relatively large value of $\rho$ favors the hierarchical model. The estimate means that 28.9% of the
unexplained variation in individual severity resulted from between-crash heterogeneity,
which strongly supports the hierarchical model specification.
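The arithmetic in Equation (7) can be reproduced in a few lines. In the sketch below the function name is hypothetical. The constant $\pi^2/3$ is the variance of the standard logistic distribution, and 1.34 is the intercept variance reported in Equation (7).

```python
import math

def intraclass_correlation(tau2):
    """ICC for a varying-intercept logistic model with intercept variance tau2."""
    logistic_variance = math.pi ** 2 / 3     # level 1 variance on the logit scale
    return tau2 / (tau2 + logistic_variance)

icc = intraclass_correlation(1.34)   # 1.34 is the estimate in Equation (7)
print(f"ICC = {icc:.1%}")
```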
To further examine the advantage of HBL over BL, a BL model with the
same covariates and dataset was also estimated and compared with the calibrated HBL model.
The BL model was obtained by dropping the random effects $\varepsilon_j$, which means ignoring the
severity correlations between driver-vehicle units within the same crash.
For model comparison, the Deviance Information Criterion (DIC) proposed by
Spiegelhalter et al. (38) is used. In complex hierarchical models where parameters may
outnumber observations, DIC provides Bayesian measures of model complexity and fit that
can be combined to compare models of arbitrary structure. Specifically, DIC is defined as:
\[
\mathrm{DIC} = D(\bar{\gamma}) + 2p_D = \overline{D(\gamma)} + p_D
\]
where $D(\bar{\gamma})$ is the deviance evaluated at the posterior means of the estimated unknowns ($\bar{\gamma}$), and
the posterior mean deviance $\overline{D(\gamma)}$ can be taken as a Bayesian measure of fit or “adequacy”. $p_D$ is
motivated as a complexity measure for the effective number of parameters in a model, defined as the
difference between $\overline{D(\gamma)}$ and $D(\bar{\gamma})$, i.e., mean deviance minus the deviance of the means. As
a generalization of the Akaike Information Criterion (AIC), DIC can thus be considered a
Bayesian measure of fit or adequacy, penalized by an additional complexity term $p_D$. As with
AIC, models with lower DIC values are preferred.
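The DIC quantities described above can be computed from posterior draws. The sketch below is only an illustration under stated assumptions: it uses a toy normal model with known unit variance rather than the HBL model, and the data, posterior draws and function names are hypothetical. It computes the posterior mean deviance, the deviance at the posterior mean, their difference as the effective number of parameters, and the resulting DIC.

```python
import math

def deviance(mu, y, sigma=1.0):
    """-2 * log-likelihood of y under Normal(mu, sigma^2)."""
    return sum(math.log(2 * math.pi * sigma ** 2) + ((v - mu) / sigma) ** 2
               for v in y)

def dic_summary(mu_draws, y):
    """DIC components from posterior draws of the single unknown mu."""
    d_bar = sum(deviance(m, y) for m in mu_draws) / len(mu_draws)  # mean deviance
    d_hat = deviance(sum(mu_draws) / len(mu_draws), y)   # deviance at posterior mean
    p_d = d_bar - d_hat                                  # effective number of parameters
    return {"Dbar": d_bar, "Dhat": d_hat, "pD": p_d, "DIC": d_bar + p_d}

y = [1.2, 0.8, 1.5, 0.9, 1.1]            # hypothetical data
mu_draws = [0.6, 0.9, 1.1, 1.3, 1.6]     # hypothetical posterior draws of mu

summary = dic_summary(mu_draws, y)
for name, value in summary.items():
    print(name, round(value, 3))
```

In practice the draws would come from the MCMC output, and the same two deviance summaries would be computed for each candidate model before comparing DIC values.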
(Insert Table 2 here)
As shown in Table 2, the model comparison between HBL and BL using DIC further
supports the advantage of the hierarchical model. Specifically, the results show that the posterior mean deviance $\overline{D(\gamma)}$ of the HBL
model (1984.5) is less than one third of that of the BL model (6165.5). After penalization
by $p_D$, the DIC value for the HBL model (3067.9) is also far lower than that of the BL model
(6191.9). This further indicates that the use of crash-level random effects in the HBL model
substantially improves the model fit.
CONCLUSION
This paper attempts to promote the use of multilevel analysis in crash prediction models. On
the one hand, it was shown that multilevel data structures exist extensively in traffic safety
because of the data collection and clustering processes. To describe them, a 5 × T -level hierarchy was
proposed in this paper to give a general form for potential multilevel data in
traffic crash analysis. On the other hand, it was found that traditional crash prediction models,
such as the widely used GLMs, are incapable of taking into account the heterogeneities due to the
multilevel data structure. Disregarding the possible within-group correlations may lead to
models with unreliable parameter estimates and statistical inferences.
The proposed multilevel analysis has great potential in traffic safety research. While most
previous studies ignored the multilevel structure in traffic crash data, this study suggests the
importance of accounting for cross-group heterogeneities in yielding reliable effect
estimates for various risk factors, as well as predictions of the traffic safety situation at existing
or new traffic sites.
REFERENCES
1. Jovanis, P., Chang, H., 1986. Modeling the relationship of accident to mile traveled.
Transportation Research Record 1068, 42-51.
2. Joshua, S.C., Garber, N.J., 1990. Estimating truck accidents rate and involvements using
linear and Poisson regression models. Transportation Planning and Technology 15(1), 41-
58.
3. Jones, B., Jansen, L., Mannering, F.L., 1991. Analysis of the frequency and duration of
the freeway accidents in Seattle. Accident Analysis and Prevention 23(4), 239-55.
4. Miaou, S.P., Lum, H., 1993. Modeling vehicle accidents and highway geometric design
relationships. Accident Analysis and Prevention 25(6), 689-709.
5. Mannering, F.L., Grodsky, L.L., 1995. Statistical analysis of motorcyclists’ perceived
accident risk. Accident Analysis and Prevention 27(1), 21–31.
6. Shankar, V.N., Mannering, F., 1996. An exploratory multinomial Logit analysis of single-
vehicle motorcycle accident severity. Journal of Safety Research 27(3), 183-194.
7. Mercier, C.R., Shelley, M.C., Rimkus, J., Mercier, J. M., 1997. Age and gender as
predictors of injury severity in head-on highway vehicular collisions. Transportation
Research Record 1581, 37-46.
8. Simoncic, M., 2001. Road fatalities in Slovenia involving a pedestrian, cyclist or
motorcyclist and a car. Accident Analysis and Prevention 33(2), 147-156.
9. Al-Ghamdi, A.S., 2002. Using logistic regression to estimate the influence of accident
factors on accident severity. Accident Analysis and Prevention 34(6), 729-741.
10. O’Donnell, C.J., Connor, D.H., 1996. Predicting the severity of motor vehicle accident
injuries using models of ordered multiple choice. Accident Analysis and Prevention 28(6),
739-753.
11. Quddus, M.A., Noland, R.B., Chin, H.C., 2002. An analysis of motorcycle injury and
vehicle damage severity using ordered probit models. Journal of Safety Research 33(4),
445-462.
12. Rifaat, S.M., Chin, H.C., 2005. Analysis of severity of single-vehicle crashes in
Singapore. In: TRB 2005 Annual Meeting CD-ROM, Transportation Research Board,
National Research Council, Washington D.C.
13. Abdel-Aty, M., Keller, J., 2005. Exploring the overall and specific crash severity levels at
signalized intersections. Accident Analysis and Prevention 37(3), 417-425.
14. Miaou, S.P., 1994. The relationship between truck accidents and geometric design of road
section: Poisson versus negative binomial regression. Accident Analysis and Prevention
26(4), 471-482.
15. Shankar, V. N., Mannering, F. L., Barfield, W., 1995. Effect of roadway geometric and
environmental factors on rural freeway accident frequencies. Accident Analysis and
Prevention 27 (3), 371-389.
16. Kulmala, R., 1995. Safety at rural three- and four-arm junctions: development and
application of accident prediction models. Technical Research Center at Finland, VTT
Publications, Espoo.
17. Poch, M., Mannering, F. L., 1996. Negative binomial analysis of intersection accident
frequencies. Journal of Transportation Engineering 122(2), 105-113.
18. Abdel-Aty, M., Radwan, E., 2000. Modeling traffic accident occurrence and involvement.
Accident Analysis and Prevention 32(5), 633-642.
19. Lord, D., Miranda-Moreno, L.F., 2007. Effects of low sample mean values and small
sample size on the estimation of the fixed dispersion parameter of Poisson-gamma models
for modeling motor vehicle crashes: a Bayesian perspective. Safety Science, in press.
20. Mussone, L., Ferrari, A. and Oneta, M., 1999. An analysis of urban collisions using an
artificial intelligence model. Accident Analysis and Prevention 31(6), 705-718.
21. Abdelwahab, H.T., Abdel-Aty, M.A., 2001. Development of artificial neural network
models to predict driver injury severity in traffic accidents at signalized intersections.
Transportation Research Record 1746, 6-13.
22. Riviere, C., Lauret, P., Ramsamy, J.F.M. and Page, Y., 2006. A Bayesian neural network
approach to estimating the energy equivalent speed. Accident Analysis and Prevention
38(2), 248-259.
23. Xie, Y., Lord, D., Zhang, Y., 2007. Predicting motor vehicle collisions using
Bayesian neural network models: an empirical analysis. Accident Analysis and
Prevention, in press.
24. Abdel-Aty, M.A., Abdalla, M.F., 2004. Linking roadway geometrics and real-time traffic
characteristics to model daytime freeway crashes: generalized estimating equations for
correlated data. Transportation Research Record 1879, 106-115.
25. Lord, D. and Persaud, B.N., 2000. Accident prediction models with and without trend:
application of the generalized estimating equations procedure. Transportation Research
Record 1717, 102-108.
26. Gelman, A., Hill, J., 2007. Data Analysis Using Regression and Multilevel/Hierarchical
Models. Cambridge University Press.
27. Shankar, V.N., Albin, R.B., Milton, J.C., Mannering, F.L., 1998. Evaluation of median
crossover likelihoods with clustered accident counts: an empirical inquiry using the
random effect negative binomial model. Transportation Research Record 1635, 44-48.
28. Jones, A.P., Jorgensen, S.H., 2003. The use of multilevel models for the prediction of
road accident outcomes. Accident Analysis and Prevention 35(1), 59-69.
29. Mitra, S., Washington, S., 2007. On the nature of over-dispersion in motor vehicle crash
prediction models. Accident Analysis and Prevention 39(3), 459-468.
30. Chin, H.C., Quddus, M.A., 2003. Applying the random effect negative binomial model to
examine traffic accident occurrence at signalized intersections. Accident Analysis and
Prevention 35(2), 253-259.
31. MacNab, Y.C., 2003. A Bayesian hierarchical model for accident and injury
surveillance. Accident Analysis and Prevention 35(1), 91-102.
32. Kim, D.G., Lee, Y., Washington, S., Choi, K., 2007. Modeling crash outcome
probabilities at rural intersections: application of hierarchical binomial logistic models.
Accident Analysis and Prevention 39(1), 125-134.
33. Lenguerrand, E., Martin, J.L., Laumon, B., 2006. Modeling the hierarchical structure of
road crash data: application to severity analysis. Accident Analysis and Prevention 38(1),
43-53.
34. Huang, H.L., Chin, H.C., Haque, M.M., 2007. Severity of driver injury and vehicle
damage in traffic crashes at intersections: A Bayesian hierarchical analysis, Accident
Analysis and Prevention, Article in press.
35. Gelman, A., Carlin, J.B., Stern, H.S., 2003. Bayesian Data Analysis, 2nd edition.
Chapman & Hall, New York.
36. Oh, J., Washington, S., 2006. Bayesian methodology incorporating expert judgment for
ranking countermeasure effectiveness under uncertainty: example applied to at grade
railroad crossings in Korea. Accident Analysis and Prevention 38 (2), 346-356.
37. Spiegelhalter, D.J., Thomas, A., Best, N.G., Lunn, D., 2003. WinBUGS version 1.4.1
User Manual. MRC Biostatistics Unit, Cambridge, UK.
38. Spiegelhalter, D.J., Best, N.G., Carlin, B.P., van der Linde, A., 2002. Bayesian measures of
model complexity and fit (with discussion). Journal of the Royal Statistical Society,
Series B 64(4), 583-616.