A Joint Econometric Analysis of Seat Belt Use and Crash-Related Injury Severity
Naveen Eluru
The University of Texas at Austin
Dept of Civil, Architectural & Environmental Engineering
1 University Station C1761, Austin TX 78712-0278
Phone: 512-471-4535, Fax: 512-475-8744
Email: naveeneluru@mail.utexas.edu
and
Chandra R. Bhat*
The University of Texas at Austin
Dept of Civil, Architectural & Environmental Engineering
1 University Station C1761, Austin TX 78712-0278
Phone: 512-471-4535, Fax: 512-475-8744
Email: bhat@mail.utexas.edu
* corresponding author
ABSTRACT
This paper formulates a comprehensive econometric structure that recognizes two important issues in crash-related injury severity analysis. First, the impact of a factor on injury severity may be moderated by various observed and unobserved variables specific to an individual or to a crash. Second, seat belt use is likely to be endogenous to injury severity. That is, it is possible that intrinsically unsafe drivers do not wear seat belts and are the ones likely to be involved in high injury severity crashes because of their unsafe driving habits.
The preceding issues are considered in the current research effort through the development of a comprehensive model of seat belt use and injury severity that takes the form of a joint correlated random-coefficients binary-ordered response system. To our knowledge, this is the first instance of such a model formulation and application not only in the safety analysis literature, but in the econometrics literature in general. The empirical analysis is based on the 2003 General Estimates System (GES) database. Several types of variables are considered to explain seat belt use and injury severity levels, including driver characteristics, vehicle characteristics, roadway design attributes, environmental factors, and crash characteristics. The results, in addition to confirming the effects of various explanatory variables, also highlight the importance of (a) considering the moderating effects of unobserved individual/crash-related factors on the determinants of injury severity and (b) seat belt use endogeneity. From a policy standpoint, the results suggest that seat belt nonusers, when apprehended in the act, should perhaps be subjected to both a fine (to increase the chances that they wear seat belts) and mandatory enrollment in a defensive driving course (to attempt to change their aggressive driving behaviors).
Keywords: seat belt use, crash injury severity, random coefficients, selective recruitment, discrete choice models with endogeneity.
1. INTRODUCTION
Traffic crashes result in several fatalities every day on U.S. roadways, and those who manage to survive crashes are faced with such potential consequences as mental trauma, pain, expensive medical costs, and increased insurance premiums (Cohen and Einav, 2003; Chang and Mannering, 1999). Society as a whole is also at a loss, both economically and emotionally, because of these incidents.
The injury severity sustained by individuals in traffic crashes is influenced by a multitude of factors, including vehicle characteristics, roadway design characteristics, driver behavior and physiological characteristics, angle of collision, driver use of alcohol or drugs, and driver use of restraint systems. It is essential to quantify the relative magnitudes of the impact of these factors on accident severity, so that measures to prevent or reduce accident severity can be identified and implemented. The current study contributes toward this end by formulating, and estimating, a comprehensive model of injury severity.
The methodology in the paper recognizes two important econometric issues in safety analysis. First, the impact of a factor on injury severity may be moderated by various observed and unobserved variables specific to an individual or to a crash. For instance, the effectiveness of seat belt use in reducing injury severity may be higher for teenagers with their relatively unconventional driving styles. This is a case of age, an attribute available in crash databases, impacting the influence of seat belt use on injury severity. In a similar vein, the physical frame or precise sitting posture of an individual may have an association with seat belt effectiveness. This is an instance where unobserved characteristics (physical frame and sitting posture) moderate the effectiveness of seat belt use in reducing injury severity. In general, one could argue that there are several subtle, unobserved characteristics that moderate the effect of factors influencing injury severity. Ignoring such unobserved heterogeneity can, and in general will, result in inconsistent estimates in nonlinear models (see Chamberlain, 1980; Bhat, 2001). The second issue addressed in our econometric framework is the endogeneity of seat belt use to injury severity (also referred to as selective recruitment in the safety analysis literature; see Evans, 1996 and Derrig et al., 2000). That is, seat belt nonusers may be intrinsically unsafe drivers. For example, Janssen (1994) provides empirical evidence that seat belt nonusers drive significantly faster than seat belt users, after controlling for gender, age, annual kilometerage, and number of years of possession of driver’s license. Petridou and Moustaki (2000) and Shinar (1993) also allude to this risk-taking behavior of seat belt nonusers. Thus, it is quite likely that seat belt nonusers are the ones likely to be involved in high injury severity crashes because of their unsafe driving habits. If this sample selection is ignored (as has been done in several previous studies), the result is an artificially inflated estimate of the effectiveness of seat belt use.
The methodology used in the paper to address the two econometric issues discussed above takes the form of a joint binary logit-ordered logit structure with random coefficients. In particular, seat belt use is modeled using a binary logit structure, while injury severity is modeled using an ordered-response structure. A host of driver characteristics, vehicle characteristics, roadway design attributes, environmental characteristics, and crash characteristics, and the interactions of these characteristics, are considered in the joint model. The moderating influence of unobserved factors associated with the impact of these attributes is accommodated by imposing a random coefficients structure in the ordered logit model. The potential self-selection in seat belt use based on injury severity propensity is considered by tying the binary seat belt use component and the ordered-response injury severity component of the joint model through a common unobserved random term. The joint model is subsequently applied in an empirical analysis that uses data from the 2003 General Estimates System (GES), a nationally representative sample of police-reported crashes of all types in the U.S.
The rest of this paper is structured as follows. The next section provides an overview of the crash-related injury severity modeling literature, and positions the current study in the context of earlier studies. Section 3 presents the econometric framework. Section 4 discusses the data source and sample used in the empirical analysis. Section 5 presents the empirical results. Section 6 concludes the paper by summarizing the important findings and identifying policy implications.
2. EARLIER STUDIES
Crash injury severity has been extensively researched in the safety analysis literature. Section 2.1 reviews earlier injury severity studies that do not consider seat belt use as an endogenous variable, while Section 2.2 discusses earlier research studies that either model, or extensively discuss the need to consider, seat belt use as an endogenous variable in analyzing injury severity. Section 2.3 provides a summary and positions the current research.
2.1 Seat Belt Use Not Considered Endogenous to the Modeling Framework
A number of studies have examined crash-related injury severity, while considering seat belt use and several other attributes as exogenous variables. Most of these injury severity studies undertake the analysis at the level of individual accidents, rather than using an aggregate-level dependent variable such as the number of annual accidents in a county or state (but see Lourens et al., 1999; Doherty et al., 1998; and Derrig et al., 2000 for examples of aggregate-level studies). The reason for using a disaggregate-level analysis (i.e., an analysis at the level of individual accidents) is that it better captures the fundamental relationship between accident severity and its determinants, rather than capturing spurious correlations from ignoring the heterogeneity of accidents in an aggregate-level analysis (see Kassoff and Deutschman, 1969 for an extensive discussion). Within the group of disaggregate-level injury severity studies, the early research efforts (those before 2000) applied frameworks such as log-linear analysis (Golob et al., 1986; Kim et al., 1994; Abdel-Aty et al., 1998), factorial analysis (see McLellan et al., 1996) and descriptive analysis (Evans, 1990; Evans and Frick, 1988; Cooper, 1994; Huelke and Compton, 1995). In the past several years, however, almost all injury severity studies have used a discrete variable framework because accident reports collect injury severity in discrete categories.
The discrete variable studies of crash-related injury severity have used one or more of the following five categories of variables: (1) Driver attributes (including demographics and such behavioral characteristics as seat belt use and drug/alcohol use), (2) Characteristics of the vehicle(s) involved in the crash (vehicle weight and type of vehicle(s)), (3) Roadway design attributes (number of lanes, grade, alignment, presence of shoulders, lane widths and speed limits), (4) Environmental factors (weather, lighting conditions, time of day, etc.), and (5) Crash characteristics (manner of collision, role of vehicle in crash, whether there was a rollover of one or more vehicles, whether driver was ejected, etc.). A review of the earlier discrete choice studies of injury severity, and the categories of variables considered in each study, is presented in Table 1.
Three important observations may be made from Table 1. First, except for the studies by Chang and Mannering (1999) and Ulfarsson and Mannering (2004), none of the earlier studies has comprehensively considered all five categories of variables. Second, the two most prevalent structures used to examine injury severity are logistic regression models and ordered-response models. The logistic regression models are binary logit models that focus on whether or not there is a severe injury associated with a crash (severe injury is defined either as a fatality or some other severe characterization of injury). The ordered-response models consider the entire range of injury severity levels and, therefore, capture and provide more injury severity information (relative to the logistic regression models). The ordered-response models used in the past for injury severity analysis take the form of either an ordered-response logit or an ordered-response probit structure. Both these ordinal model forms are essentially equivalent, and differ only in whether a logistic or a normal distribution is used for the stochastic component in the latent propensity that is assumed to underlie the observed injury severity.^{1} Third, none of the existing studies allows randomness in the effects of injury severity determinants due to the moderating influence of unobserved factors. Srinivasan (2002) allows randomness due to unobserved factors in the threshold bounds that relate the underlying latent injury severity propensity to the observed injury severity categories, but does not address the randomness in the effects of injury severity determinants. Of course, none of the studies in Table 1 considers seat belt use as endogenous in its modeling framework either.
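The near-equivalence of the ordered-response logit and probit forms noted above can be illustrated with a minimal sketch. The threshold values and the propensity value below are hypothetical numbers chosen only for illustration, and the function names are not taken from any of the cited studies.

```python
import math

def logistic_cdf(x):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-x))

def normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ordered_probs(propensity, thresholds, cdf):
    """Category probabilities of an ordered-response model.

    P(category k) = cdf(psi_k - v) - cdf(psi_{k-1} - v),
    with psi_0 = -infinity and psi_K = +infinity.
    """
    cumulative = [0.0] + [cdf(t - propensity) for t in thresholds] + [1.0]
    return [cumulative[k + 1] - cumulative[k] for k in range(len(cumulative) - 1)]

# Hypothetical thresholds separating five injury severity levels.
thresholds = [0.5, 1.5, 2.5, 3.5]
logit_probs = ordered_probs(1.0, thresholds, logistic_cdf)
probit_probs = ordered_probs(1.0, thresholds, normal_cdf)

for k, (pl, pp) in enumerate(zip(logit_probs, probit_probs), start=1):
    print(f"level {k}: logit {pl:.3f}  probit {pp:.3f}")
```

Only the `cdf` argument changes between the two forms; the mapping from the latent propensity to the five observed categories is the same.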
2.2 Seat Belt Endogenous to the Modeling Framework
A number of earlier studies have alluded to the “selective recruitment” of seat belt nonusers in crashes involving severe injuries. One of the early studies that discusses the selective recruitment (or sample selection) issue conceptually is Evans (1985). However, the first empirical validation of the sample selection hypothesis appears to have been undertaken by Evans (1996), who used a probability sample of police-reported crashes in the U.S. between 1982 and 1991 from the National Accident Sampling System (NASS) to examine the relationship between crash severity and seat belt use. Evans measured crash severity in terms of the change in velocity due to the crash, which itself was inferred using structural equations based on the level of vehicle deformation in the crash. Evans’ results indicated an overrepresentation of unbelted drivers in high crash severity accidents. To the extent that crash severity level is correlated with injury severity level, Evans’ results provide evidence that unbelted drivers are intrinsically more likely to be involved in high injury severity crashes. Evans concludes that seat belt effectiveness is overestimated by a large amount if the sample selection is not accounted for.
Another study that indirectly provides support for the sample selection hypothesis is Dee (1997), who examined why seat belt laws that increased seat belt usage sharply in the late 1980s and early 1990s had a relatively small impact on crash-related fatalities. One of the hypotheses he considered to explain this apparent paradox was that of sample selection. That is, unsafe drivers are more likely than the general population to continue not to wear seat belts even after passage of seat belt laws. If such unsafe drivers are also more likely to be involved in severe crashes, the net result would only be a small impact on crash-related fatalities. To test the hypothesis, Dee used the Centers for Disease Control and Prevention’s (CDC) annual Behavioral Risk Factor Surveillance System (BRFSS) telephone surveys collected between 1985 and 1993. Dee compared the reported seat belt usage of crash-prone individuals and the general population after the passage of seat belt laws. His analysis provides evidence that crash-prone individuals are more likely not to wear seat belts than the general population after the enactment of seat belt laws, a finding consistent with the sample selection hypothesis.
Cohen and Einav (2003) examined the impact of seat belt usage on crash-related vehicle occupant fatalities using data from the Fatality Analysis Reporting System (FARS) collected between 1983 and 1997. The FARS data on traffic fatalities were aggregated to obtain the total number of annual fatalities by U.S. state. The authors then used a log-linear regression model to relate the logarithm of the number of occupant fatalities per vehicle mile of travel in each state to (1) the seat belt usage rate in the state, (2) a set of demographic, traffic density, crime and fuel tax rate control variables in the state, (3) fixed state effects to control for the potential endogeneity of usage rate (for example, states with high crash-related fatalities may institute enforcement strategies that influence usage rates), and (4) fixed year effects. In addition, to address endogeneity of seat belt usage rates, the authors instrumented the usage rate through variables related to mandatory seat belt laws. The overall finding from this aggregate-level analysis is that ignoring seat belt usage rate endogeneity leads to a substantial bias in the effect of seat belt usage rate on the logarithm of per-capita vehicle occupant fatalities.
It is interesting that the three sample selection studies discussed above have been based on a simple univariate descriptive analysis (Evans, 1996), or a simple examination of seat belt usage between predefined accident-prone groups and the general population (Dee, 1997), or an aggregate-level analysis that can mask heterogeneity in crash outcomes and characteristics (Cohen and Einav, 2003).
2.3 Summary of Earlier Studies and the Current Research
The overview of the literature indicates the substantial earlier research on crash-related injury severity determinants. Increasingly, the methodology of choice for modeling injury severity is the ordered-response framework, which recognizes the ordinal nature of injury severity in police-reported accidents. However, the ordered-response models need to be enhanced to (1) comprehensively consider interactions among groups of potential determinants of injury severity, (2) allow randomness in the effects of injury severity determinants due to the moderating influence of unobserved factors, (3) recognize the potential, and very likely, endogeneity of seat belt use in injury severity modeling, and (4) accommodate the potential randomness in the effect of seat belt use on injury severity. It is indeed surprising, in particular, that there have been very few studies to date that recognize the potential endogeneity of seat belt use. The handful of studies that do so are focused toward testing the selective recruitment hypothesis using univariate, descriptive, and aggregate analyses, rather than the multivariate, methodologically rigorous, and disaggregate discrete choice framework adopted by the studies that do not consider seat belt endogeneity.
In this paper, we bring the two streams of earlier work (those that do not consider seat belt endogeneity and those that do) together by developing a comprehensive, multivariate, methodologically rigorous, and disaggregate-level model of seat belt use and injury severity that takes the form of a joint correlated random-coefficients binary-ordered response system. This joint system is formulated as a mixing model that conveniently, and at once, considers all the issues of (1) systematic interaction effects among variables, (2) random unobserved effects in the influence of injury severity determinants, (3) potential endogeneity of seat belt use in modeling injury severity level, and (4) random variations in seat belt use effectiveness. To our knowledge, this is the first instance of such a model formulation and application not only in the safety analysis literature, but in the econometrics literature in general. In addition to the methodological considerations, we consider a comprehensive set of potential determinants of injury severity in the empirical analysis. The focus in the analysis is exclusively on driver injury severity (as opposed to the injury severity of other vehicle occupants).
3. ECONOMETRIC FRAMEWORK
3.1 Model Structure
Let q (q = 1, 2, …, Q) be an index to represent drivers and let k (k = 1, 2, 3, …, K) be an index to represent injury severity. The index k, for example, may take values of “no injury” (k = 1), “possible injury” (k = 2), “nonincapacitating injury” (k = 3), “incapacitating injury” (k = 4), and “fatal injury” (k = 5), as in the empirical analysis in the current paper. The equation system for the joint driver seat belt use and injury severity model is:
$$s_q^* = (\alpha' + \gamma_q')\, x_q + \eta_q + \varepsilon_q, \quad s_q = 1 \text{ if } s_q^* > 0; \; s_q = 0 \text{ otherwise}$$
$$y_q^* = (\beta' + \delta_q')\, z_q + (\theta + \mu' w_q + \lambda_q)\, s_q \pm \eta_q + \xi_q, \quad y_q = k \text{ if } \psi_{k-1} < y_q^* < \psi_k \qquad (1)$$
The first equation is associated with the latent propensity $s_q^*$ of seat belt use for driver q. $s_q$ is the actual observed seat belt use by driver q, and $x_q$ is an (M x 1) column vector of attributes (including a constant) associated with driver q (for example, sex, age, soberness status, etc.) and driver q’s trip environment (for example, roadway speed limits, time of day, etc.). $\alpha$ represents a corresponding (M x 1) column vector of mean effects of the elements of $x_q$ on seat belt use propensity, while $\gamma_q$ is another (M x 1) column vector with its m^{th} element representing unobserved factors specific to driver q and her/his trip environment that moderate the influence of the corresponding m^{th} element of the vector $x_q$. $\eta_q$ captures common unobserved factors influencing driver q’s seat belt use propensity and the driver’s injury severity propensity (for instance, an intrinsically cautious and responsible driver is likely to wear seat belts and drive defensively, incurring less severe injuries in crashes). $\varepsilon_q$ is an idiosyncratic random error term assumed to be identically and independently standard logistic distributed across individuals q.
The second equation is associated with the latent propensity $y_q^*$ associated with the injury severity sustained by driver q in the accident. This latent propensity is mapped to the actual injury severity level $y_q$ by the $\psi$ thresholds ($\psi_0 = -\infty$ and $\psi_K = \infty$) in the usual ordered-response fashion. $z_q$ is an (L x 1) column vector of attributes (not including a constant and not including seat belt use) that influences the propensity associated with injury severity. $\beta$ is a corresponding (L x 1) column vector of mean effects, and $\delta_q$ is another (L x 1) column vector of unobserved factors moderating the influence of attributes in $z_q$ on the injury severity propensity for driver q. $\theta$ is a scalar constant, $w_q$ is a set of driver/crash attributes that moderate the effect of seat belt use on injury severity, and $\mu$ is a corresponding vector of coefficients. $\lambda_q$ is an unobserved component influencing the impact of seat belt effectiveness for driver q, and $\xi_q$ is an idiosyncratic random error term assumed to be identically and independently standard logistic distributed across individuals q.
The $\pm$ sign in front of $\eta_q$ in the injury severity equation indicates that the correlation in unobserved factors between seat belt use and injury severity may be positive or negative. A positive sign implies that drivers who use seat belts are intrinsically more likely to incur severe injuries in crashes, while a negative sign implies that drivers who use seat belts are intrinsically less likely to incur severe injuries in accidents. Clearly, we expect, from an intuitive standpoint, that the latter case will hold. However, one can empirically test the models with both ‘+’ and ‘−’ signs to determine the best empirical result. Of course, if the correlation between the seat belt use and injury severity propensities is ignored, when actually present, it results in a “corrupt” estimation of the effectiveness of seat belt use in reducing injury severity. More specifically, if the unobserved correlation between seat belt use and injury severity propensities is negative, as we expect, ignoring this correlation would result in an inflated effectiveness of seat belt use in reducing injury severity.
To complete the model structure of the system in Equation (1), we need to specify the structure for the unobserved vectors $\gamma_q$ and $\delta_q$, and the unobserved scalars $\lambda_q$ and $\eta_q$. In the current paper, we assume that the $\gamma_{qm}$ and $\delta_{ql}$ elements, and $\lambda_q$ and $\eta_q$, are independent realizations from normal population distributions; $\gamma_{qm} \sim N(0, \sigma_{\gamma_m}^2)$, $\delta_{ql} \sim N(0, \sigma_{\delta_l}^2)$, $\lambda_q \sim N(0, \sigma_\lambda^2)$, and $\eta_q \sim N(0, \sigma_\eta^2)$.
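The data-generating process implied by this joint system can be sketched in a few lines. The sketch below is illustrative only: the function names, the dictionary keys, and every parameter value are hypothetical, and the true seat belt effect is deliberately set to zero so that any difference in injury severity between belted and unbelted drivers comes from the common unobserved term alone.

```python
import math
import random

def logistic_draw(rng):
    """Draw from the standard logistic distribution by inverting its CDF."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return math.log(u / (1.0 - u))

def simulate_driver(rng, x, z, w, p):
    """Simulate (seat belt use, injury level) for one driver from the joint system."""
    eta = rng.gauss(0.0, p["sd_eta"])                      # common unobserved term
    gamma = [rng.gauss(0.0, sd) for sd in p["sd_gamma"]]   # random part of belt-use coefficients
    delta = [rng.gauss(0.0, sd) for sd in p["sd_delta"]]   # random part of injury coefficients
    lam = rng.gauss(0.0, p["sd_lambda"])                   # random part of belt effectiveness
    s_star = sum((a + g) * xi for a, g, xi in zip(p["alpha"], gamma, x)) + eta + logistic_draw(rng)
    s = 1 if s_star > 0 else 0
    belt_effect = p["theta"] + sum(m * wi for m, wi in zip(p["mu"], w)) + lam
    y_star = (sum((b + d) * zi for b, d, zi in zip(p["beta"], delta, z))
              + belt_effect * s + p["sign"] * eta + logistic_draw(rng))
    y = 1 + sum(1 for t in p["psi"] if y_star > t)         # levels 1..K
    return s, y

# Hypothetical parameter values; theta = 0 means seat belts have no true effect here.
params = {
    "alpha": [1.0, -0.5], "sd_gamma": [0.0, 0.3],
    "beta": [0.4], "sd_delta": [0.2],
    "theta": 0.0, "mu": [0.0], "sd_lambda": 0.0,
    "psi": [0.0, 1.0, 2.0, 3.0],
    "sd_eta": 2.0, "sign": -1.0,
}

rng = random.Random(0)
draws = []
for _ in range(20000):
    male = 1.0 if rng.random() < 0.5 else 0.0
    high_speed = 1.0 if rng.random() < 0.3 else 0.0
    draws.append(simulate_driver(rng, x=[1.0, male], z=[high_speed], w=[male], p=params))

belted = [y for s, y in draws if s == 1]
unbelted = [y for s, y in draws if s == 0]
mean_belted = sum(belted) / len(belted)
mean_unbelted = sum(unbelted) / len(unbelted)
print(f"mean injury level: belted {mean_belted:.2f}, unbelted {mean_unbelted:.2f}")
```

With the negative sign on the common term, unbelted drivers show a higher average injury level even though the simulated seat belt effect is zero, which is the spurious association that the joint model is intended to separate from true seat belt effectiveness.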
3.2 Model Estimation
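The parameters to be estimated in the joint model system of Equation (1) are the $\alpha$, $\beta$, and $\mu$ vectors, the $\theta$ scalar, the $\psi_k$ thresholds, and the following variance terms: $\sigma_{\gamma_m}^2$, $\sigma_{\delta_l}^2$, $\sigma_\lambda^2$, and $\sigma_\eta^2$. Let $\Omega$ represent a vector that includes all these parameters to be estimated. Also, let $c_q$ be a vector that vertically stacks the $\gamma_q$ and $\delta_q$ vectors, and the $\lambda_q$ and $\eta_q$ scalars. Let $\Sigma$ be another vertically stacked vector of standard errors $\sigma_{\gamma_m}$, $\sigma_{\delta_l}$, $\sigma_\lambda$, and $\sigma_\eta$, and let $\Omega_{-\Sigma}$ represent a vector of all parameters except the standard error terms. Finally, let $g_q = 2 s_q - 1$. Then, the likelihood function, for a given value of $\Omega_{-\Sigma}$ and error vector $c_q$, may be written for driver q as:
$$L_q(\Omega_{-\Sigma} \mid c_q) = G\{ g_q [ (\alpha' + \gamma_q')\, x_q + \eta_q ] \} \times \prod_{k=1}^{K} \Big[ G\{ \psi_k - (\beta' + \delta_q')\, z_q - (\theta + \mu' w_q + \lambda_q)\, s_q \mp \eta_q \} - G\{ \psi_{k-1} - (\beta' + \delta_q')\, z_q - (\theta + \mu' w_q + \lambda_q)\, s_q \mp \eta_q \} \Big]^{d_{qk}} \qquad (2)$$
where G(.) is the cumulative distribution of the standard logistic distribution and $d_{qk}$ is a dummy variable taking the value 1 if driver q sustains an injury of level k and 0 otherwise. Finally, the unconditional likelihood function can be computed for driver q as:
$$L_q(\Omega) = \int_{c_q} L_q(\Omega_{-\Sigma} \mid c_q)\, dF(c_q \mid \Sigma), \qquad (3)$$
where F is the multidimensional cumulative normal distribution. The log-likelihood function is
$$\mathcal{L}(\Omega) = \sum_{q} \ln L_q(\Omega). \qquad (4)$$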
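As a minimal sketch of the conditional likelihood of one driver (a binary logit probability multiplied by an ordered logit probability, both evaluated at a given realization of the random terms), consider the following. The function names, the dictionary layout of the parameters and random terms, and the numerical values are all hypothetical and serve only to illustrate the structure.

```python
import math

def G(x):
    """Standard logistic CDF, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def conditional_likelihood(s, y, x, z, w, par, c, sign=-1.0):
    """Likelihood of observing belt use s (0/1) and injury level y (1..K),
    conditional on the random terms c = {gamma, delta, lam, eta}."""
    g = 2 * s - 1                                          # +1 if belted, -1 if not
    alpha_q = [a + gm for a, gm in zip(par["alpha"], c["gamma"])]
    beta_q = [b + d for b, d in zip(par["beta"], c["delta"])]
    p_belt = G(g * (dot(alpha_q, x) + c["eta"]))           # binary logit part
    injury_index = (dot(beta_q, z)
                    + (par["theta"] + dot(par["mu"], w) + c["lam"]) * s
                    + sign * c["eta"])
    psi = [-math.inf] + list(par["psi"]) + [math.inf]      # psi_0 and psi_K
    p_injury = G(psi[y] - injury_index) - G(psi[y - 1] - injury_index)  # ordered logit part
    return p_belt * p_injury

# Hypothetical parameter values and one hypothetical realization of the random terms.
par = {"alpha": [1.0, -0.5], "beta": [0.4], "theta": -0.8, "mu": [0.2],
       "psi": [0.0, 1.0, 2.0, 3.0]}
c = {"gamma": [0.0, 0.1], "delta": [-0.05], "lam": 0.3, "eta": 0.7}

total = sum(conditional_likelihood(s, y, [1.0, 1.0], [1.0], [1.0], par, c)
            for s in (0, 1) for y in range(1, 6))
print(round(total, 12))
```

Summing the conditional likelihood over both seat belt outcomes and all five injury levels gives one, which is a convenient sanity check on an implementation of this kind.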
The likelihood function in Equation (3) involves the evaluation of a multidimensional integral of size equal to the number of rows in $c_q$. This multidimensional integration cannot be accomplished using general purpose numerical methods such as quadrature, since quadrature techniques cannot evaluate the integrals with sufficient precision and speed for estimation via maximum likelihood (see Hajivassiliou and Ruud, 1994).
We apply simulation techniques to approximate the integrals in the likelihood function and maximize the logarithm of the resulting simulated likelihood function across individuals with respect to $\Omega$. The simulation technique approximates the likelihood function in Equation (4) by computing the $L_q(\Omega_{-\Sigma} \mid c_q)$ for each q at different realizations of $c_q$ drawn from a multivariate normal distribution, and computing the individual likelihood function by averaging over the different values of the integrand across the different realizations. Notationally, if $L_q^h(\Omega)$ is the realization of the likelihood function in the h^{th} draw (h = 1, 2, …, H), then the individual likelihood function is approximated as:
$$SL_q(\Omega) = \frac{1}{H} \sum_{h=1}^{H} L_q^h(\Omega), \qquad (5)$$
where $SL_q(\Omega)$ is the simulated likelihood function for the q^{th} observation, given the parameter vector $\Omega$. $SL_q(\Omega)$ is an unbiased estimate of the actual likelihood function $L_q(\Omega)$. Its variance decreases as H increases. It also has the appealing properties of being smooth (i.e., twice differentiable) and being strictly positive for any realization of draws.
The simulated log-likelihood is constructed as:
$$S\mathcal{L}(\Omega) = \sum_{q} \ln [ SL_q(\Omega) ]. \qquad (6)$$
The parameter vector $\Omega$ is estimated as the value that maximizes the above simulated function. Under rather weak regularity conditions, the maximum (log) simulated likelihood (MSL) estimator is consistent, asymptotically efficient, and asymptotically normal (see Hajivassiliou and Ruud, 1994; Lee, 1992).
In the current paper, we use a quasi-Monte Carlo (QMC) method proposed by Bhat (2001) for discrete choice models to draw realizations for $c_q$ from its population multivariate distribution. QMC methods are similar to the familiar Monte Carlo method in that they evaluate a multidimensional integral by replacing it with an average of values of the integrand computed at discrete points (see Equation 5). However, rather than using pseudo-random sequences for the discrete points, the QMC approach uses “cleverly” crafted nonrandom and more uniformly distributed sequences (labeled as QMC sequences) within the domain of integration. The underlying idea of the QMC methods is that it is really inconsequential whether the discrete points are truly random; of primary importance is the even distribution (or maximal spread) of the points in the integration space. Within the broad framework of QMC sequences, we specifically use the Halton sequence in the current analysis.
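The averaging step with Halton draws can be sketched as follows. This is a generic illustration, not the estimation code used in the paper: the function names, the choice to skip the first ten points of each sequence, the use of the first two primes as bases, and the toy integrand are all assumptions made for the example.

```python
import math
from statistics import NormalDist

def halton(index, base):
    """Radical-inverse value of a positive integer index in the given prime base."""
    f, r, i = 1.0, 0.0, index
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

def halton_normal_draws(n_draws, primes, skip=10):
    """Standard normal draws obtained by inverting the normal CDF at Halton points,
    one prime base per dimension of the integral."""
    inv = NormalDist().inv_cdf
    return [[inv(halton(h, b)) for b in primes]
            for h in range(skip + 1, skip + n_draws + 1)]

def simulated_likelihood(cond_lik, sds, draws):
    """Average of the conditional likelihood over the draws, where each random
    term is a standard normal draw scaled by its standard error."""
    total = 0.0
    for nu in draws:
        total += cond_lik([sd * v for sd, v in zip(sds, nu)])
    return total / len(draws)

def G(x):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-x))

draws = halton_normal_draws(500, primes=[2, 3])

# Toy check: E[G(nu)] for nu ~ N(0, 1) equals 0.5 by symmetry.
sl_toy = simulated_likelihood(lambda c: G(c[0]), sds=[1.0, 1.0], draws=draws)

# Simulated log-likelihood over a few hypothetical observations, each with its own
# conditional likelihood function of the random terms.
observations = [lambda c: G(0.5 + c[0]) * G(-0.2 + c[1]),
                lambda c: G(-1.0 - c[0]) * G(0.3 - c[1])]
sim_loglik = sum(math.log(simulated_likelihood(f, [0.8, 1.2], draws))
                 for f in observations)
print(round(sl_toy, 3), round(sim_loglik, 3))
```

In an actual estimation, the toy functions would be replaced by the conditional likelihood of each driver, and the simulated log-likelihood would be passed to a numerical optimizer over the parameter vector.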
4. DATA
4.1 Data Source
The data source used in this study is the 2003 General Estimates System (GES) obtained from the National Highway Traffic Safety Administration’s National Center for Statistics and Analysis. The GES consists of data compiled from a sample of police-reported accidents that involve at least one motor vehicle traveling on a traffic way and resulting in property damage, injury, or death. The GES data are drawn from accidents in about 60 areas across the U.S. that reflect the geography, population, and traffic density of the U.S. (the reader is referred to ftp://ftp.nhtsa.dot.gov/GES/GES03/SAS for comprehensive details of how the accident reports are collected and compiled). The 2003 GES includes information regarding 60,000 accidents involving about 150,000 individuals and 100,000 vehicles.
A number of accident-related attributes are collected for each accident in the GES, including the characteristics of the drivers involved, vehicle characteristics, roadway design attributes, environment attributes, and crash characteristics. The injury severity of each individual involved in the accident is collected on a five-point ordinal scale: (1) No injury, (2) Possible injury, (3) Non-incapacitating injury, (4) Incapacitating injury, and (5) Fatal injury.
4.2 Sample Description
In the current analysis, we examine seat belt usage and injury severity of drivers of passenger vehicles. The focus on drivers is because seat belt usage data are better recorded for drivers than for nondrivers. We also confined our attention to noncommercial drivers because of potential systematic differences between commercial and noncommercial drivers (commercial drivers are professionally trained and have to follow company-related and insurance-related driving protocols). Finally, our analysis is confined to crashes (accidents involving collision with a fixed object or other vehicles rather than noncollision accidents such as rolling over) and further to the vast majority of crashes in which one or two vehicles are involved.
The final data sample of noncommercial driver crashes consisted of about 50,000 records. Of these, 11,388 records were sampled so that the distribution of injury severity in this smaller sample was about the same as the weighted distribution of injury severity in the full sample of about 50,000 records (the weighted full GES dataset is intended to replicate the overall national statistics of crashes and injury severity). The seat belt use in the weighted sample is as follows: used seat belts (93.2%) and did not use seat belts (6.8%).^{2} The distribution of injury severity across the observations and by seat belt use is provided in Table 2. Clearly, the table shows a negative association between seat belt use and injury severity. One of the issues to be addressed in this research is to estimate how much of the association is due to “true” seat belt use effectiveness and how much is due to “spurious” effects.
5. EMPIRICAL ANALYSIS
5.1 Variables Considered
Several types of variables were considered in the empirical analysis, including driver characteristics, vehicle characteristics, roadway design attributes, environmental factors, and crash characteristics.
Driver characteristics included driver demographics (age and sex) and driver alcohol use^{3}. The only vehicle characteristics included in the current study are the vehicle types involved in the crash (the vehicle types include passenger cars, sport utility vehicles, pickup trucks, and minivans). Other vehicle characteristics, such as vehicle weight, vehicle speed just before impact, and seating configuration, are either not available in, or missing for a large fraction of, the GES data. The roadway design attributes considered in the analysis are speed limit and roadway functional class (whether the accident occurred on an interstate highway, or arterial, or other roads). Again, additional roadway design attributes, such as number of lanes, alignment of roads, and grade and shoulder widths, could not be included because of the absence of data, or the large fraction of missing data, on these variables in the GES. Environmental factors related to the crash that were considered included day of the week, time of day^{4}, lighting conditions (dawn, daylight, dusk, dark, and dark and lit), and weather conditions (no adverse weather, rain, snow, and fog). Finally, the crash characteristics included whether or not the person was ejected from the vehicle, whether the vehicle rolled over, whether the crash was with a stationary object or another vehicle, the manner of collision in crashes with another vehicle (head-on, rear-end, angle, sideswipe when traveling in the same direction, and sideswipe when traveling in opposite directions), and the role of the driver’s vehicle in crashes with another vehicle (i.e., whether the driver’s vehicle struck the other vehicle, or the driver’s vehicle was struck by the other vehicle, or both vehicles struck each other).
In addition to the five groups of variables discussed above, we also considered several interaction effects among the variables in both the seat belt use and injury severity models. The final specification was based on a systematic process of removing statistically insignificant variables and combining variables when their effects were not significantly different. The specification process was also guided by prior research and intuitiveness/parsimony considerations. We should also note here that, for the continuous variables in the data (such as age and speed limits), we tested alternative functional forms that included a linear form, a spline (or piecewise linear) form, and dummy variables for different ranges.
5.2 Model Specification
We estimated three different models in the research effort: (1) a simple binary choice logit for seat belt use and an independent ordered response logit for injury severity, which we will refer to as the independent binary-ordered (IBO) model, (2) a random coefficients binary choice logit for seat belt use and an independent random coefficients ordered response logit for injury severity, which we will refer to as the independent random binary-ordered (IRBO) model, and (3) a random coefficients binary choice logit for seat belt use and a correlated random coefficients ordered response logit for injury severity, which we will label as the correlated random binary-ordered (CRBO) model. In the context of the model formulation in Section 3.1, the IBO model imposes the assumptions that the random-coefficient variances are zero for all m and for all l, and that the variance of the common error component is zero. The IRBO model imposes only the assumption that the variance of the common error component is zero.
The final specifications of the random coefficients in the seat belt use and injury severity components of the IRBO and CRBO models were obtained after extensive testing. In the following presentation of empirical results, we discuss only the CRBO model for ease of presentation. However, we use the IBO and IRBO models as yardsticks to evaluate the performance of the CRBO model.
5.3 Estimation Results
5.3.1 Seat Belt Use Component
Table 3 provides the results of the seat belt use component of the CRBO model (the coefficients represent the effects of the variables on the latent propensity to wear seat belts).
The specific effects of the driver characteristics indicate that men, younger individuals (Age < 25 years), and those driving under the influence of alcohol are less likely to use seat belts compared to women, older individuals (Age ≥ 25 years), and those not driving under the influence of alcohol, respectively (these results are consistent with earlier seat belt use studies; for example, see Reinfurt et al., 1996 and Preusser et al., 1991).^{5} The effects of the vehicle characteristics indicate that individuals driving a pickup are the least likely to wear a seat belt, while sport utility vehicle (SUV) drivers are the most likely to wear seat belts. This association between vehicle type and seat belt use is perhaps a manifestation of the link between safety consciousness and type of vehicle owned. Finally, the time of day variables suggest that drivers are more likely to wear seat belts during the midday (9am-3pm) and PM peak (3pm-7pm) periods than during the early morning (12am-6am), AM peak (6am-9am), and evening (7pm-12am) periods. The higher non-use of seat belts during the early morning and evening periods may be the result of fewer law enforcement officials on the streets during these times.
5.3.2 Injury Severity Component
Table 4 presents the results of the injury severity component of the CRBO model (the parameters indicate the effects of variables on the latent propensity associated with injury severity). The results are discussed by variable group.
5.3.2.1 Driver Characteristics Injury severity varies significantly with driver demographics and alcohol influence. In particular, men and young adults (< 25 years of age) are less likely to sustain severe injuries relative to women and older adults, respectively, a result also observed in earlier studies of injury severity (see, for example, O’Donnell and Connor, 1996; Kim et al., 1994; and Srinivasan, 2002).^{6} The likelihood of being injured severely is highest for women over 74 years of age, while the likelihood of not being injured severely is highest for men younger than 25 years of age. Consistent with the findings from earlier studies and with intuition, drivers under the influence of alcohol are likely to be more severely injured than those who are sober.
5.3.2.2 Vehicle Characteristics The type of the driver’s vehicle, as well as the type of the other vehicle involved in dual-vehicle crashes, was considered in the injury severity component of the joint model. In addition to main effects, combinations of the driver vehicle type and the other vehicle type, and interactions of vehicle type with the four other variable groups, were considered. The final specification, however, comprised only three variables related to vehicle type (see Table 4). The results show that drivers in sedans are likely to be injured more severely in crashes compared to drivers in other vehicle types (SUVs, pickup trucks, and minivans). This is particularly the case in the presence of snow and/or fog, and in crashes where the driver’s sedan is struck by a non-sedan.
5.3.2.3 Roadway Attributes The only roadway design attributes considered in the current analysis are speed limit and roadway functional class (and interactions of the two). However, once speed limit was controlled for, roadway functional class did not have any additional significant effect, because of the strong correlation between speed limit and roadway functional class. The results indicate that, on average, driver injury tends to be most severe for crashes on medium-to-high speed limit roads (26-64 mph). Also, driver injury from crashes tends to be, on average, more severe on high speed limit (≥ 65 mph) roads relative to low speed limit (≤ 25 mph) roads. These patterns may reflect two issues. First, vehicle speeds are higher on roads with high speed limits, which raises injury severity on such roads. Second, several design factors associated with roads with very high speed limits (≥ 65 mph) can temper the seriousness of injuries from a crash. For instance, roads with high speed limits have wider lanes, more lanes, and wide shoulders that may give drivers the opportunity to take last-minute evasive measures to reduce injury severity. It is also interesting, however, to note the wide variation in injury severity propensity across crashes on high speed roads (note the large standard deviation relative to the mean on the high speed limit coefficient in the table). This indicates that, while injury severity on high speed roads may be low for some crashes because of roadway characteristics, it can also be very high for some crashes because of vehicle speed.
5.3.2.4 Environmental Factors The results associated with environmental factors indicate that crashes occurring during the day (6am-7pm) tend to be less severe than those occurring during other times of the day. This may be because of higher traffic volumes on the roads during the day, resulting in drivers being more alert as well as traveling at slower speeds (note that the lower injury severity of daytime crashes cannot be attributed to higher seat belt use, since seat belt use is included as a variable in the analysis; see Section 5.3.2.6).
The remaining parameters characterizing the effects of the crash environment in Table 4 suggest lower injury severity levels in dusk or dark lighting conditions (relative to dawn, daylight, and dark but lit lighting conditions) and under adverse weather conditions (relative to normal weather conditions). These results, while initially counterintuitive, are presumably capturing a vehicle speed effect. That is, drivers are likely to travel slower under poor lighting and adverse weather conditions. It is also important to note the large standard deviation on the “dark” variable, which indicates a wide dispersion in injury levels sustained under dark lighting conditions.
5.3.2.5 Crash Characteristics Several crash characteristics are strong determinants of injury severity. As expected, a driver ejected from her/his vehicle or in a vehicle that rolled over is likely to sustain severe injuries. The “crash with a stationary object” group of variables and the “manner of collision in two-vehicle crashes” group of variables need to be considered together. The results indicate that, on average, crashes with a stationary large object (such as a concrete traffic barrier, post, pole, culvert, ditch, trees, etc.) and head-on collisions with another vehicle are most dangerous, followed by crashes with a stationary small object (such as a fire hydrant, shrubbery, boulder, curb, guard rail, etc.) and angle collisions with another vehicle (see O’Donnell and Connor, 1996 for similar results). Rear-end collisions with another vehicle are less severe than stationary object and head-on/angle collisions with another vehicle, but more severe than sideswipe collisions with another vehicle. There is also a large standard deviation on the “angle” coefficient; the mean and the standard deviation of this coefficient imply that, in a majority of cases (88%), angle crashes are less severe than head-on collisions. But, about 12% of the time, angle crashes lead to higher injury severity than head-on crashes. Finally, in the set of crash characteristics, the “vehicle role in two-vehicle crashes” group of variables suggests a higher injury severity level if the driver is struck, or is struck and strikes another vehicle, relative to striking another vehicle.
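The way a mean and standard deviation of a random coefficient translate into a share of crashes can be sketched as follows. The sketch assumes that the random coefficient is normally distributed and that the coefficient it is compared against is fixed; the function names and the numerical values in the usage line are hypothetical and are not the Table 4 estimates.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def share_exceeding(mean, sd, reference):
    """Share of crashes in which a N(mean, sd^2) coefficient exceeds a fixed reference."""
    return 1.0 - normal_cdf((reference - mean) / sd)


# Hypothetical values: a reference coefficient lying 1.175 standard deviations
# above the mean of the random coefficient gives a share of roughly 12%.
share = share_exceeding(mean=1.0, sd=1.0, reference=2.175)
```

The complementary share, 1 minus this value, corresponds to the majority of cases in which the random coefficient falls below the reference coefficient.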
5.3.2.6 Seat Belt Use and Sample Selection Effect The empirical results showed no observed or unobserved differences in seat belt effectiveness across seat belt wearers in the injury severity equation for each of the IBO, IRBO, and CRBO models (i.e., in Equation (1), the coefficients capturing observed differences in the seat belt effect are not significantly different from zero, and the variance of the random component of that effect is also not significantly different from zero). Table 4 shows that the remaining constant impact of seat belt use is negative, indicating a reduced injury severity if the driver uses the seat belt. Also, the standard deviation of the common error component between the seat belt use and injury severity propensities is statistically significant. This standard deviation corresponds to the standard deviation of the common error term defined in Section 3.1. The term was introduced with a negative sign in the injury severity component of Equation (1) because doing so provided a substantially better fit than introducing it with a positive sign. This result lends very strong support to the selective recruitment (or sample selection) hypothesis: safety-conscious drivers are more likely to wear seat belts, and their defensive habits also lead to less severe injuries when they are involved in crashes. The standard deviation estimate of the common error component translates to an effective correlation of –0.46 [= –(0.93)^{2} / (1 + (0.93)^{2})] between the unobserved factors impacting the seat belt use and injury severity propensities.
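The arithmetic behind the effective correlation can be checked with a few lines. The sketch follows the bracketed expression in the text, in which the remaining error variance is treated as 1; the function name `effective_correlation` is an assumption made for this example.

```python
def effective_correlation(sd_common, sign=-1.0):
    """Correlation implied by a common error component with standard deviation sd_common.

    sign is -1 when the common term enters the two propensities with opposite signs.
    """
    var_common = sd_common ** 2
    return sign * var_common / (1.0 + var_common)


print(round(effective_correlation(0.93), 2))  # -0.46
```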
In contrast to the correlated random binary-ordered (CRBO) model presented here, the independent random binary-ordered (IRBO) model ignores the selective recruitment issue; that is, it ignores the unobserved factors that jointly influence seat belt use propensity and injury severity propensity. As a result, the safety-conscious and defensive driving habits of seat belt users are incorrectly manifested as an inflated effectiveness of seat belt use in reducing injury severity. In fact, the coefficient estimate on the seat belt use variable in the IRBO model is –1.39 with a corresponding t-statistic of –13.74 (compared to a coefficient of –0.75 with a corresponding t-statistic of –1.88 in the CRBO model). While the coefficients of the IRBO and CRBO models are not directly comparable, the CRBO coefficients should generally be larger in magnitude than the IRBO coefficients (as is the case with all other coefficients except the seat belt use coefficient).^{7} Thus, there is a clear, substantial, and incorrect inflation in seat belt use effectiveness when sample selection is ignored. We revisit this important point in Section 5.3.4.
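The inflation mechanism can be illustrated with a small simulation. This is a deliberately simplified sketch and not the estimation procedure of the paper: it works directly on a continuous latent injury propensity, uses standard normal rather than logistic idiosyncratic errors, and omits all covariates. The function name and the sample size are assumptions; the true effect (–0.75) and the common-component standard deviation (0.93) are borrowed from the text purely as illustrative inputs.

```python
import random
from statistics import mean


def naive_seat_belt_gap(true_effect=-0.75, sd_common=0.93, n=50_000, seed=0):
    """Difference in mean latent injury propensity between belted and unbelted drivers."""
    rng = random.Random(seed)
    belted, unbelted = [], []
    for _ in range(n):
        # Common unobserved factor (e.g., safety consciousness).
        eta = rng.gauss(0.0, sd_common)
        # It raises the propensity to wear a seat belt...
        wears_belt = eta + rng.gauss(0.0, 1.0) > 0.0
        # ...and enters the injury propensity with the opposite sign.
        injury = true_effect * wears_belt - eta + rng.gauss(0.0, 1.0)
        (belted if wears_belt else unbelted).append(injury)
    # A comparison that ignores eta mixes the true effect with the selection effect.
    return mean(belted) - mean(unbelted)


gap = naive_seat_belt_gap()
```

Under these assumptions the naive gap is considerably more negative than the true effect, which mirrors the direction of the bias described above for a model that ignores the common unobserved factors.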
5.3.2.7 Threshold Parameters The threshold parameters map the injury severity latent index to the reported injury severity categories. As such, they do not have any substantive interpretation.
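The mapping from the latent index to the reported categories can be sketched for an ordered response logit as follows. The function names, the latent index value, and the threshold values are hypothetical and chosen only for illustration; they are not the Table 4 estimates, and the number of categories shown is likewise illustrative.

```python
import math


def logistic_cdf(x):
    """Cumulative distribution function of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-x))


def ordered_logit_probs(latent_index, thresholds):
    """Probability of each ordered category given a latent index and thresholds.

    Category k is observed when the latent propensity falls between
    threshold k-1 and threshold k (with -inf and +inf as the outer bounds).
    """
    cdf = [0.0] + [logistic_cdf(t - latent_index) for t in sorted(thresholds)] + [1.0]
    return [upper - lower for lower, upper in zip(cdf[:-1], cdf[1:])]


# Hypothetical index and thresholds: four thresholds define five severity categories.
probs = ordered_logit_probs(latent_index=0.4, thresholds=[-0.5, 0.8, 2.0, 3.5])
```

Raising the latent index shifts probability mass from the lowest category toward the higher ones, while the thresholds themselves only partition the latent scale.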
