A Latent Variable Representation of Count Data Models to Accommodate Spatial and Temporal Dependence: Application to Predicting Crash Frequency at Intersections
This paper proposes a reformulation of count models as a special case of generalized ordered-response models in which a single latent continuous variable is partitioned into mutually exclusive intervals. Using this equivalent latent variable-based generalized ordered-response framework for count data models, we can efficiently introduce temporal and spatial dependencies through the latent continuous variables. The formulation also accommodates excess zeros in correlated count data, a phenomenon commonly found in practice. A composite marginal likelihood inference approach is used to estimate the model parameters. The modeling framework is applied to predict crash frequency at urban intersections in Arlington, Texas. The sample is drawn from the Texas Department of Transportation (TxDOT) crash incident files between 2003 and 2009, resulting in 1,190 intersection-year observations. The results reveal the presence of intersection-specific time-invariant unobserved components influencing crash propensity, as well as spatial dependence that is characterized by a spatial lag structure. Roadway configuration, approach roadway functional types, traffic control type, total daily entering traffic volumes, and the split of volumes between approaches are all important variables in determining crash frequency at intersections.
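To make the reformulation concrete, the following is a minimal sketch of the simplest case only: a plain Poisson count recast as an ordered-response model. It assumes a standard normal latent variable and thresholds set to the inverse normal CDF of the cumulative Poisson probabilities; these choices, and the function names `poisson_cdf`, `threshold`, and `count_probability`, are illustrative assumptions rather than the paper's full specification, and the sketch omits the temporal, spatial, and excess-zero components.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed distribution of the latent variable


def poisson_cdf(k, lam):
    """Cumulative Poisson probability P(Y <= k) for mean lam."""
    return sum(math.exp(-lam) * lam**r / math.factorial(r) for r in range(k + 1))


def threshold(k, lam):
    """Upper threshold of the interval mapped to count k.

    Valid while the Poisson CDF is strictly below 1 in floating point,
    i.e. for moderate k relative to lam.
    """
    return STD_NORMAL.inv_cdf(poisson_cdf(k, lam))


def count_probability(k, lam):
    """P(Y = k) as the normal probability mass between adjacent thresholds."""
    upper = STD_NORMAL.cdf(threshold(k, lam))
    # The interval for k = 0 starts at minus infinity, where the CDF is 0.
    lower = 0.0 if k == 0 else STD_NORMAL.cdf(threshold(k - 1, lam))
    return upper - lower


if __name__ == "__main__":
    lam = 2.5  # hypothetical mean crash count
    for k in range(5):
        pmf = math.exp(-lam) * lam**k / math.factorial(k)
        print(k, round(count_probability(k, lam), 6), round(pmf, 6))
```

Under these assumptions the interval probabilities reproduce the Poisson probabilities, which is the sense in which the two representations are equivalent. Dependence across time or space would then enter through the latent variable rather than through the count itself.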