Since crashes are countable, discrete and random events, a linear regression model is inappropriate when the frequency of crashes is the dependent variable. Models based on the Poisson distribution, including Poisson and Poisson-Gamma (i.e. negative binomial) regression models, will therefore be employed (Miaou and Lum, 1993; Miaou, 1994; Shankar et al., 1995; Poch and Mannering, 1996; Shankar et al., 1998; Abdel-Aty and Radwan, 2000; Lord and Mannering, 2010).

The most suitable model for crash data appears to be the Poisson regression model, in which crash counts are assumed to follow a Poisson distribution (Jovanis and Chang, 1986; Shankar et al., 1995; Ivan et al., 1999; Miaou and Lum, 1993; Miaou, 1994; Kim et al., 2007; Thakuriah and Cottrill, 2008).

A Poisson regression model is expressed as:

$$Y_i \sim \mathrm{Poisson}(\mu_i) \tag{4.10}$$

$$\log(\mu_i) = \alpha + \beta X_i \tag{4.11}$$

where

Yi is the observed number of crashes that occurred in area i;

μi is the expected number of crashes (the Poisson mean) in area i;

α is the intercept;

Xi is the vector of explanatory variables for area i;

β is the vector of coefficients to be estimated.
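As an illustration of equations 4.10 and 4.11, the following minimal sketch fits a Poisson regression with a single explanatory variable by Newton-Raphson maximum likelihood. The function name `fit_poisson`, the restriction to one covariate and the data values are assumptions made purely for illustration; they are not taken from the thesis dataset, and the source does not state which software or algorithm was used for estimation.

```python
import math


def fit_poisson(x, y, max_iter=50, tol=1e-10):
    """Fit log(mu_i) = alpha + beta * x_i by Newton-Raphson (one covariate)."""
    # Start from the intercept-only maximum likelihood estimate.
    alpha = math.log(sum(y) / len(y))
    beta = 0.0
    for _ in range(max_iter):
        mu = [math.exp(alpha + beta * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2 x 2, symmetric).
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            raise ValueError("covariate has no variation; model not identified")
        # Newton step: solve the 2 x 2 system in closed form.
        d_alpha = (h11 * g0 - h01 * g1) / det
        d_beta = (h00 * g1 - h01 * g0) / det
        alpha += d_alpha
        beta += d_beta
        if abs(d_alpha) + abs(d_beta) < tol:
            break
    return alpha, beta


# Hypothetical data: x could be, say, road length in an area, y its crash count.
x_demo = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y_demo = [2, 3, 6, 7, 8, 15]
alpha_hat, beta_hat = fit_poisson(x_demo, y_demo)
print(f"alpha = {alpha_hat:.3f}, beta = {beta_hat:.3f}")
```

Because the model includes an intercept, the fitted means sum to the observed total at the maximum likelihood estimate, which gives a quick sanity check on convergence.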

However, Poisson regression models are not always appropriate for modelling traffic crash counts. Their main limitation is the assumption of equidispersion, i.e. that the variance of the crash counts equals their mean (Jones et al., 1991; Miaou et al., 1992; Kulmala, 1994; Shankar et al., 1995; Lord and Mannering, 2010). If this assumption does not hold, that is, if the crash data are over- or under-dispersed, a Poisson regression model produces unreliable results (Miaou, 1994; Shankar et al., 1995; Vogt and Bared, 1998).
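A rough first check of the equidispersion assumption is to compare the sample variance of the counts with their sample mean. The sketch below does this with a hypothetical helper, `dispersion_ratio`, and made-up counts; it ignores explanatory variables, so it is only an informal screening step and not a formal test of the fitted model.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Return sample variance divided by sample mean of the counts.

    A ratio near 1 is consistent with equidispersion, a ratio well above 1
    suggests over-dispersion, and a ratio well below 1 under-dispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean of counts is zero; ratio undefined")
    return variance(counts) / m


# Hypothetical crash counts for eight areas (illustration only).
crash_counts = [0, 1, 1, 2, 9, 0, 3, 12]
ratio = dispersion_ratio(crash_counts)
print(f"variance / mean = {ratio:.2f}")
```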

To address the problem of over-dispersion often found in crash data, researchers employ a negative binomial (NB) model, which allows the variance to exceed the mean (Miaou, 1994; Kulmala, 1994; Poch and Mannering, 1996; Milton and Mannering, 1998; Abdel-Aty and Radwan, 2000; Lord, 2000; Ivan et al., 2000). A negative binomial model is given by:

$$\log(\mu_i) = \beta_0 + \beta X_i + \varepsilon_i \tag{4.12}$$

where:

Yi is the total number of observed traffic crashes recorded in spatial unit i in a given time period;

β0 is the intercept term;

Xi is the vector of explanatory variables;

β is the vector of parameters to be estimated;

εi is a random term capturing heterogeneity effects for spatial unit i;

and exp(εi) follows a gamma distribution with mean 1 and a variance governed by k, which is known as the over-dispersion parameter.

The model presented in equation 4.12 can be estimated using the maximum likelihood method (Cameron and Trivedi, 1998). Both the log-likelihood ratio (i.e. pseudo R²) and the Akaike information criterion (AIC) can be employed to measure the model goodness-of-fit (GoF).
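The two goodness-of-fit measures can be computed directly from maximised log-likelihood values. The sketch below assumes McFadden's form of the pseudo R², which compares the fitted model with an intercept-only model; the source does not say which variant is used. The function names and the log-likelihood values are hypothetical and serve only to show the arithmetic.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def pseudo_r2(ll_model, ll_null):
    """McFadden's pseudo R-squared: 1 - ln(L_model) / ln(L_null)."""
    return 1.0 - ll_model / ll_null


# Hypothetical maximised log-likelihoods (illustration only).
ll_null = -520.0      # intercept-only model
ll_poisson = -470.0   # Poisson model with 4 estimated parameters
ll_nb = -430.0        # NB model: same 4 parameters plus the dispersion parameter

print("AIC Poisson:", aic(ll_poisson, 4))
print("AIC NB:     ", aic(ll_nb, 5))
print("Pseudo R2 NB:", round(pseudo_r2(ll_nb, ll_null), 3))
```

With these illustrative values the NB model has the lower AIC despite its extra parameter, which is the pattern one would expect when the data are over-dispersed.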

4.4 Summary

This chapter has presented the statistical methods for modelling crash severity and crash frequency that will be followed in this thesis, together with a detailed discussion of the models to be used. For modelling the severity of crashes, an ordered response model and a nominal (unordered) response model have been detailed and assessed for their suitability for categorical data. For modelling the frequency of crashes, classical count outcome models have been detailed and considered.

The econometric models described in this chapter will be used to analyse the data that are described in the next chapter (Chapter 5).

5 DATA DESCRIPTION

5.1 Introduction

To accomplish the aim and objectives of this research, crash, road network, population and land-use data had to be analysed in order to reveal the factors affecting the severity and frequency of road traffic crashes in Riyadh city. These data, which are a crucial part of this research, were not easy to obtain because they are scattered among different governmental organisations in Riyadh city. A number of different datasets are required to develop the models; these include:

• Data on crashes which were obtained from Riyadh General Department of Traffic (RGDT) and the Higher Commission for the Development of Riyadh (HCDR)

• Land use data which were also collected from HCDR

• Road network data which were obtained from Riyadh Department of Transport (RDT)

• Socio-economic data obtained from Saudi General Directorate of Statistics (SGDS).

A series of meetings was held with the above organisations during two visits to Saudi Arabia, in April 2009 and December 2009, to request and gather the data.

Since 2004 the Riyadh Traffic Department (RTD) has collected road crash data for the Riyadh region. These data were then made available to the Higher Commission for the Development of Riyadh (HCDR) for further processing. The road crash data for this study were obtained from the HCDR. These data cover a period of five years, namely AH 1425, 1426, 1427, 1428, and 1429 (equivalent to 2004, 2005, 2006, 2007 and 2008).

Some of the data gathered include the age of the people involved in the crash, the time of day and the day of the week when the crash occurred. Figure 5.1 shows the age distribution of the people killed in traffic crashes in Riyadh city in 2007. The number of fatalities due to road crashes is noticeably higher for the age groups 16-20, 21-25 and 26-30, even though these groups do not make up the majority of the population (SGDS, 2005).

Figure 5-1 Number of fatalities according to age group in Riyadh for 2007 (RGDT, 2007).

Figure 5.2 shows that the period from 0600 to 0800 has the highest fatality rate of the day, which is not surprising as it is the time for trips to school and work in Riyadh.

Figure 5-2 Fatal crashes according to time of day in Riyadh city for 2007 (HCDR, 2008)

According to the statistical data for 2004-2007 produced by the Higher Commission for the Development of Riyadh in 2007, most crashes took place on Wednesdays and Thursdays, which are the weekend days in Riyadh, when most people travel for recreation and social activities (see Figure 5.3). Furthermore, most crashes took place in the eastern and western parts of the city because of the high density of traffic and population in these areas.

Figure 5-3 Fatal crashes by week day in Riyadh city for 2007 (Source: HCDR 2008).

The details of the final data sets used in this study, including descriptive statistics of the variables to be employed in both crash frequency and crash severity models and the validation of data, are presented in this chapter.

5.2 Data description