
4.1 RANDOM AND SYSTEMATIC VARIATION IN ACCIDENT COUNTS

Road safety is usually deﬁned and evaluated in terms of the recorded number of accidents or the number of killed or injured road users. The number of accidents or injured road users recorded during a certain period is the result of a complex process.

There are two problems associated with the use of the recorded number of accidents to estimate safety: under-reporting of accidents (see Chapter 3) and random variation in the recorded accident numbers.

When looking for explanations of accidents and for ways of preventing them, it is important not to mix up random and systematic variation in the number of accidents.

Systematic variation is the ‘true’ variation in the accident counts, i.e. variation in the expected number of accidents. Random variation is variation of the observed accident counts around the expected number of accidents. These concepts are described in more detail below. When evaluating safety measures, it is often better to use estimates of the expected number of accidents rather than the recorded number. Such estimates can be obtained with the Empirical Bayes (EB) method, which is also described below.

Expected number of accidents. The expected number of accidents is the number of accidents (e.g. on a speciﬁc road or in a speciﬁc junction) that one can expect per time unit, based on known properties of the road or junction. It is the average number of accidents that will occur per unit of time in the long run, given that exposure and all risk factors remain constant.

The Handbook of Road Safety Measures

ISBN: 978-1-84855-250-0

The meaning of the expected number of accidents can be clarified by means of an example. Figure 4.1 shows hypothetical numbers of accidents recorded at a junction during a period of eight years. The black dots represent the recorded number of accidents per year and the white dots show the running (cumulative) average of the annual counts of accidents. In the first year, this is the same as the count of accidents for that year. In the second year, it is the average of the first two years; in the third year, it is the average of the first three years, etc.

It can be seen that the recorded number of accidents in a given year is not necessarily representative of the mean annual number of accidents at the junction in the period we are studying. We also see that as accident counts are accumulated for more years, the annual average number becomes more stable and less affected by the recorded number in a single year. If one were to collect accident data for the same junction over a very long period, e.g. 50 or 100 years, the annual average number of accidents for this period would eventually hardly be affected at all by the recorded number of accidents for a given year. In the limit, the annual average would be insensitive to the recorded number of accidents in a speciﬁc year. It would then be an estimate of the long-term expected number of accidents. This is the average number of accidents per unit time, which would be expected to occur in the long run at a constant exposure (amount of trafﬁc) and at a constant accident rate per unit of exposure.

However, during such a long period, it cannot be assumed that the junction has an unchanged amount of traffic or otherwise remains unchanged. Thus the expected number of accidents would not remain constant during such a long period. In practice, the expected number of accidents is seldom estimated by observing the accident history for a single junction, a single road section, or a single driver for a period of 50 or 100 years.

Figure 4.1: Illustration of the concept of the expected number of accidents. Recorded number of accidents during eight years at a junction and mean of the annual numbers (horizontal axis: year; vertical axis: number of accidents; series: recorded number, annual mean).
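The running average plotted in Figure 4.1 can be sketched in a few lines of Python. The annual counts below are hypothetical values chosen for illustration (they are not read from the figure), and the function name `running_mean` is an assumption.

```python
from itertools import accumulate

def running_mean(counts):
    """Mean of the annual counts from year 1 up to and including each year."""
    return [total / years
            for years, total in enumerate(accumulate(counts), start=1)]

# Hypothetical accident counts for eight consecutive years (illustrative only).
annual_counts = [3, 1, 0, 2, 4, 1, 2, 3]
means = running_mean(annual_counts)

for year, (count, mean) in enumerate(zip(annual_counts, means), start=1):
    print(f"year {year}: recorded {count}, mean so far {mean:.2f}")
```

With these made-up counts the recorded number swings between 0 and 4, while the running mean moves less and less from year to year as more years are accumulated.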

The true value of the expected number of accidents for a given unit of observation, such as a junction or a driver, cannot be observed directly but has to be estimated. The most common method of estimating the expected number of accidents is to study a large number of units (junctions, road sections, drivers, vehicles, etc.), which vary with respect to characteristics that are believed to inﬂuence the expected number of accidents. By means of statistical analysis, we then try to determine the amount of systematic variation in accident counts and identify factors that produce it.

Random and systematic variation in accident counts. There is systematic variation in the number of accidents when some units (junctions, drivers, vehicles, roads) have a higher or lower long-term expected number of accidents than other units of the same type.

Random variation in the number of accidents is variation in the recorded number of accidents around a given expected number of accidents. Two sets of factors generate systematic variation in the number of accidents:

The amount of trafﬁc (exposure)

Risk factors (factors that affect the probability of accidents at a given exposure).

On top of these come vehicle occupancy and other factors that inﬂuence the number of injury victims per accident.

The fact that road accidents are subject to random variation means that not all changes in the recorded number of accidents imply changes in the expected number of accidents. For example, a decrease from 280 fatalities per year to 250 is no more than could be produced by random variation alone. A decrease from 10,000 to 9,500 injured people, by contrast, is too large to be attributed exclusively to chance.
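The two examples can be checked with a rough calculation. The sketch below assumes that each annual count is an independent Poisson variable, so that the variance of the difference between two counts is approximately their sum. The function name and the use of a normal approximation with a 1.96 threshold are illustrative assumptions, not a procedure given in the text.

```python
import math

def change_z_score(before: int, after: int) -> float:
    """Approximate z-score for the change between two independent Poisson counts.

    Assumption (not from the text): the variance of each count is estimated
    by the count itself, so Var(before - after) is about before + after.
    """
    return (before - after) / math.sqrt(before + after)

z_fatalities = change_z_score(280, 250)     # about 1.30, below 1.96
z_injured = change_z_score(10_000, 9_500)   # about 3.58, above 1.96

print(f"fatalities: z = {z_fatalities:.2f}")
print(f"injured:    z = {z_injured:.2f}")
```

Under these assumptions, the drop in fatalities stays within the range that chance alone could produce, while the drop in injuries does not.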

The risk of mistaking random fluctuations in the number of accidents for changes in the long-term expected number of accidents is greatest when the mean expected number of accidents per study unit is small. To illustrate the problem, consider the hypothetical case of 100 junctions that have a mean expected number of accidents of 1.5 per junction per year. Assume further that the recorded number of accidents in any junction is the result of pure random variation, i.e. all junctions have the same expected number of accidents. Suppose a road safety measure reduces the mean expected number of accidents per junction per year to 1.0, and still assume a random distribution of accidents in the set of junctions.

A simple before-and-after study will then most likely observe a reduction of the recorded number of accidents in 50 junctions, an increase in 24 junctions, and an unchanged number of accidents in 26 junctions. The safety measure will appear to be most effective in those junctions that had the highest recorded number of accidents in the year before. In the 20 junctions that had three or more accidents in the before-period, and a total of 69 accidents, one would find a reduction to 17 accidents (a 75% reduction). This appears to be a far greater reduction than the mean reduction of 33% (from a total of 150 to 100 accidents). Such an impression is, however, misleading. Under the assumptions made in this example, the true effect is identical in all junctions – any observed variation in effect is random only. Identifying junctions where the safety measure was particularly effective, based on the recorded number of accidents in the before-period, would be capitalising on chance.
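The selection effect in this example can be reproduced from the Poisson law alone. The sketch below computes expected values rather than one particular random outcome, so its results are close to, but not identical with, the figures quoted above (about 19 junctions and 66 accidents instead of 20 and 69). Variable names are illustrative assumptions.

```python
import math

def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k accidents under a Poisson law with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

N_JUNCTIONS = 100
MEAN_BEFORE = 1.5   # expected accidents per junction per year, before
MEAN_AFTER = 1.0    # expected accidents per junction per year, after
THRESHOLD = 3       # junctions with three or more accidents in the before-period

# Probability that a junction records fewer than THRESHOLD accidents before.
p_below = sum(poisson_pmf(k, MEAN_BEFORE) for k in range(THRESHOLD))
# Part of the mean before-count that comes from junctions below the threshold.
mean_below = sum(k * poisson_pmf(k, MEAN_BEFORE) for k in range(THRESHOLD))

n_selected = N_JUNCTIONS * (1 - p_below)                    # about 19 junctions
before_selected = N_JUNCTIONS * (MEAN_BEFORE - mean_below)  # about 66 accidents
# After-counts are independent of before-counts, so every selected junction
# still has an expected after-count of MEAN_AFTER.
after_selected = n_selected * MEAN_AFTER                    # about 19 accidents

apparent_reduction = 1 - after_selected / before_selected
true_reduction = 1 - MEAN_AFTER / MEAN_BEFORE

print(f"apparent reduction in selected junctions: {apparent_reduction:.0%}")
print(f"true reduction in every junction:         {true_reduction:.0%}")
```

The apparent reduction of roughly 70% in the selected junctions arises although every junction has the same true reduction of 33%, which is the point the example makes.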

Statistical modelling of systematic and random variation in accident numbers. Pure random variation in accidents is usually modelled by the Poisson probability law.

According to the Poisson probability law, the variance of the count of accidents equals the mean, so the standard deviation equals the square root of the mean. The larger the number of accidents, the smaller the standard deviation is as a percentage of that number. For example, the standard deviation in 10 accidents is equal to about three accidents, i.e. about 30%. The standard deviation in 100 accidents is equal to 10 accidents, i.e. 10%. A 95% confidence interval for random variation in the number of accidents can be obtained by adding and subtracting 1.96 times the square root of the number of accidents. For example, the 95% confidence interval for an expected number of accidents of 10 is

$$10 \pm 1.96\sqrt{10} = 10 \pm 1.96 \times 3.16 = 10 \pm 6.2$$

The lower limit of the conﬁdence interval is 3.8 and the upper limit is 16.2.
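The same calculation can be written as a short Python function. This is a minimal sketch of the normal approximation used in the text; the function name is an assumption.

```python
import math

def poisson_interval(count: float, z: float = 1.96) -> tuple[float, float]:
    """Approximate confidence interval: count plus or minus z * sqrt(count)."""
    half_width = z * math.sqrt(count)
    return count - half_width, count + half_width

lower, upper = poisson_interval(10)
print(f"{lower:.1f} to {upper:.1f}")   # 3.8 to 16.2
```

For 100 accidents the same function gives 80.4 to 119.6, i.e. a much narrower interval relative to the count.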

Multivariate statistical models, often Poisson regression models or negative binomial regression models, are increasingly used to analyse factors that explain systematic variation of the number of accidents. The most common speciﬁcation of these models is

$$\text{Expected number of accidents} = a\,Q^{b}\exp\left(\sum_i k_i x_i\right)$$

where Q measures exposure, i.e. some variable describing traffic volume. Exp is the exponential function, i.e. the base of natural logarithms (e = 2.71828) raised to the sum of parameter estimates multiplied by the relevant values of the explanatory variables, representing risk factors ($\sum_i k_i x_i$). For an in-depth presentation of multivariate accident modelling, the reader is referred to Gaudry and Lassarre (2000).
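To make the model form concrete, the sketch below evaluates it for one unit. All parameter values are hypothetical and chosen only for illustration; in practice they would be estimated by Poisson or negative binomial regression. The function and variable names are assumptions.

```python
import math
from typing import Sequence

def expected_accidents(a: float, b: float, q: float,
                       k: Sequence[float], x: Sequence[float]) -> float:
    """Evaluate a * q**b * exp(sum_i k_i * x_i)."""
    if len(k) != len(x):
        raise ValueError("k and x must have the same length")
    return a * q**b * math.exp(sum(ki * xi for ki, xi in zip(k, x)))

# Hypothetical parameter values, for illustration only.
A, B = 0.0005, 0.9
K = [0.3, -0.2]   # coefficients of two hypothetical risk factors

baseline = expected_accidents(A, B, 12_000, K, [0, 0])
with_factor = expected_accidents(A, B, 12_000, K, [1, 0])

print(f"baseline: {baseline:.2f}, with first risk factor: {with_factor:.2f}")
print(f"ratio: {with_factor / baseline:.3f}")   # equals exp(0.3)
```

Because the risk factors enter through the exponential function, each one multiplies the expected number of accidents by a constant factor at any given traffic volume.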

Modelling the expected number of accidents in before-and-after studies with the Empirical Bayes method. Results from before-and-after studies may be misleading when evaluation studies are based on the recorded number of accidents, especially when the recorded number is small, and when the study units selected had higher than normal recorded numbers of accidents in the before period. When a measure is implemented only for units with high numbers of accidents in the before-period, the number of accidents will most likely be smaller in the after period, even if the measure has no effect at all. This is referred to as the regression to the mean effect. Regression to the mean may be controlled by using the expected, instead of the recorded, number of accidents in the before-period. Since the expected number is never known exactly, it has to be estimated. By means of the EB method, the expected number of accidents (e.g. on a road section or at a junction) can be estimated as follows:

1. It is estimated how many accidents would normally be expected in a unit with comparable properties (risk factors and exposure), based on a multivariate model of accident occurrence in a (preferably large) number of the same type of units, with varying properties. In addition to the normal expected number of accidents, the uncertainty of this estimate is calculated.

2. It is estimated how many accidents would be expected for the actual unit, by combining the normal expected number of accidents (step 1) and the recorded number of accidents. The observed number of accidents is included in order to take into account specific unobserved risk factors (that are not included in the accident model in step 1). The normal expected number of accidents is assigned a statistical weight that corresponds to the uncertainty of this estimate and that can assume values between 0 and 1. The expected number of accidents for the specific unit is calculated as follows:

$$\text{Expected number of accidents for the specific unit} = \text{Expected number of accidents} \times \text{Statistical weight} + \text{Observed number of accidents} \times (1 - \text{Statistical weight})$$

3. The observed number of accidents in the after period is compared to the expected number of accidents that has been estimated for the specific unit in the before period.

A more detailed description of the EB method, its statistical background and its applications is given in Hauer (1997).
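The weighted combination described above can be sketched as a small Python function. The function and argument names are illustrative assumptions. The statistical weight is taken as given here, because its derivation from the uncertainty of the model estimate is not specified in this section (see Hauer 1997). The input values are hypothetical.

```python
def eb_expected(model_expected: float, observed: float, weight: float) -> float:
    """Weighted average of the model-based expectation and the recorded count.

    `weight` is the statistical weight given to the model-based (normal)
    expected number of accidents; it must lie between 0 and 1.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return weight * model_expected + (1.0 - weight) * observed

# Hypothetical junction: the accident model predicts 1.5 accidents,
# 4 were recorded, and the model estimate is given a weight of 0.6.
estimate = eb_expected(model_expected=1.5, observed=4, weight=0.6)
print(f"{estimate:.2f}")   # 2.50
```

The estimate always lies between the model prediction and the recorded count. A weight close to 1 pulls it towards the model prediction, a weight close to 0 towards the recorded count.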

4.2 THE USE OF ACCIDENT RATES TO MEASURE SAFETY

It has traditionally been assumed that the effects of trafﬁc volume on the number of accidents can be removed – controlled for – by estimating an accident rate:

$$\text{Accident rate} = \frac{\text{Number of accidents}}{\text{Traffic volume}}$$

This assumption is not correct (Hauer 1995). Most accident rates, which are defined per vehicle kilometre or per person kilometre, are significantly non-linear, i.e. the assumption that the accident rate is independent of the distance driven or the amount of travel does not hold. Figure 4.2 shows a very striking example of this, taken from a British study (Forsyth, Maycock and Sexton 1995).

Accident rate declines sharply as annual driving distance increases. The mean accident rate is 34.5 accidents per million miles for men (at 8,350 miles per year) and 38.9 for women (at 4,766 miles per year). Women have a higher mean accident rate than men, despite the fact that their accident rate for any given annual mileage is lower than the accident rate for men. If this fact were not known, one might erroneously conclude that women are poorer drivers than men.

Figure 4.2: Relationship between annual driving distance and accident rate (Forsyth, Maycock and Sexton 1995). Horizontal axis: annual driving distance (miles); vertical axis: accident rate (accidents per million miles); series: men, women. Mean for men: 34.5 at 8,350 miles per year. Mean for women: 38.9 at 4,766 miles per year.

The finding presented in Figure 4.2 is a case of Simpson’s paradox, which may occur when data exhibiting strong non-linearity, or a strong interaction between two or more factors, are aggregated across categories of the non-linear function or the variables that interact. This can result in a fallacy of aggregation: in this case, the erroneous conclusion that the accident rate for women is higher than it is for men.
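A small numerical sketch shows how such a paradox can arise. All numbers below are hypothetical and are not taken from Forsyth, Maycock and Sexton (1995): women are given a lower accident rate than men in every mileage band, but more of them are placed in the low-mileage bands, where rates are highest.

```python
# Each tuple: (annual miles per driver, number of drivers,
#              accidents per million miles). Hypothetical values only.
MEN = [(2_000, 10, 150.0), (8_000, 40, 50.0), (20_000, 50, 20.0)]
WOMEN = [(2_000, 50, 120.0), (8_000, 40, 40.0), (20_000, 10, 15.0)]

def aggregate_rate(groups):
    """Total expected accidents per million miles, pooled over all bands."""
    total_miles = sum(miles * drivers for miles, drivers, _ in groups)
    total_accidents = sum(miles * drivers * rate / 1e6
                          for miles, drivers, rate in groups)
    return total_accidents / total_miles * 1e6

men_rate = aggregate_rate(MEN)
women_rate = aggregate_rate(WOMEN)

print(f"men:   {men_rate:.1f} accidents per million miles")
print(f"women: {women_rate:.1f} accidents per million miles")
```

With these made-up inputs the pooled rate is about 29 for men and about 45 for women, although women have the lower rate in each band. The difference comes entirely from the different distribution of driving distances.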

Non-linear relationships between trafﬁc volume and accidents have also been found in most studies that have developed accident models for roads or junctions. In most studies, the percentage increase of the number of accidents is smaller than would be expected if there were a linear relationship, i.e. the number of accidents increases at a lower rate than trafﬁc volume (see also Section 3.4).
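The consequence for accident rates can be illustrated with a power function of traffic volume, one common way of representing such a relationship. The sketch assumes a hypothetical exponent of 0.8 and a hypothetical constant; neither value comes from the studies referred to.

```python
def expected_accidents(a: float, b: float, volume: float) -> float:
    """Expected accidents as a power function of traffic volume."""
    return a * volume**b

def accident_rate(a: float, b: float, volume: float) -> float:
    """Expected accidents per unit of traffic volume."""
    return expected_accidents(a, b, volume) / volume

A, B = 0.002, 0.8            # hypothetical parameters, exponent below 1
LOW, HIGH = 10_000, 20_000   # traffic volume doubles

accident_ratio = expected_accidents(A, B, HIGH) / expected_accidents(A, B, LOW)
rate_ratio = accident_rate(A, B, HIGH) / accident_rate(A, B, LOW)

print(f"accidents increase by a factor of {accident_ratio:.2f}")   # 1.74
print(f"accident rate changes by a factor of {rate_ratio:.2f}")    # 0.87
```

When the exponent is below 1, doubling traffic less than doubles the number of accidents, so the accident rate falls even though nothing about the safety of the road has changed.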

Consequently, the effects of exposure on accidents are not adequately controlled by estimating accident rates and accident rates may have limited value as a measure of road safety. Road safety evaluation studies that use accident rate ratios as the dependent variable are of dubious validity unless the accident rate ratio applies to study units that have an identical amount of exposure and are otherwise identical with respect to at least major risk factors affecting the number of accidents.

4.3 EXPLAINING ROAD ACCIDENTS – THE CONCEPT OF CAUSE

Do accidents have causes? If they do, how can we make sense of the term ‘cause of accident’? Until around 1960, it was widely believed that it was not possible to reduce road accidents effectively without knowing the ‘real causes of traffic accidents’. This opinion was expressed in the first parliamentary report on traffic safety in Norway (Ministry of Justice, Parliamentary report 83, 1961–62, On measures for promoting traffic safety), stating:

“A thorough planning of measures to prevent traffic accidents is of great significance if good results are to be achieved. If planning is to be effective, it is necessary to know and analyse the problems in traffic at which the measures can be directed. It is not possible at present to implement road safety planning in a totally satisfactory way. Sufficient knowledge of the real causes of accidents is not available and as a result, the best remedies are not known either. It is usually a complex set of causes that result in traffic accidents; this makes it difficult to evaluate the importance of the individual causal elements.”

Others have rejected the use of the concept of cause in explaining accidents (Haight 1980). On this view, accidents are the outcome of a vastly complex random process, whose general characteristics can be modelled statistically. Some of the factors that influence the stochastic process leading to accidents are known; others will never be known.

The logic of the argument that you need to know the causes of a problem in order to solve it seems irresistible. Yet, there is not necessarily a very close connection between the causes of a problem and its solution. To see why, it may be instructive to consider in detail some of the approaches that have been taken to explaining road accidents and discuss their implications.

Theories of accident causation – a brief chronology. The scientiﬁc study of accidents started about 100 years ago. At least since that time, theories have been proposed to answer the question: Why do accidents happen?

While easily asked, this is indeed a very difﬁcult question to answer. Useful discussions can be found in a number of books. In particular, books by the following authors are recommended:

Cresswell and Froggatt (1963)

Shaw and Sichel (1971)

Evans (1991)

Wilde (1994).

Five different theories trying to explain accidents will be briefly discussed. Figure 4.3 lists the theories in chronological order and indicates the heyday periods of the various theories.

Accidents as random events. Accident research started about 100 years ago when Bortkiewicz published his book entitled The Law of Small Numbers (Leipzig 1898). Bortkiewicz studied the frequency of deaths from horse kicks in the Prussian army. He found that the distribution of the number of deaths per army corps per year was almost perfectly random. To describe the random process leading to accidents, he used the Poisson model. This model fitted the actual distribution of accidents very closely. Bortkiewicz’s results led to acceptance of the idea that accidents were purely random events over which humans had no control.

Figure 4.3: The heyday periods of various accident theories. Time axis: 1900 to 2000; theories shown: accidents as random events, accident proneness theory, causal accident theory, systems theory, behavioural theory.

Accident proneness theory. The view that accidents were purely random events was shaken during the First World War, when Greenwood and Yule (1920) discovered an abnormal concentration of accidents involving a few workers in munitions factories. These workers had far more accidents than randomness alone could explain. Greenwood and Yule proposed different statistical models to explain the observed distribution of accidents. The simplest of these models that adequately described the observed distribution of accidents was the negative binomial model. This model was based on an assumption of different initial accident liabilities. Some people were, in other words, more prone to have accidents than others.

This reorientation of accident theory coincided with a surge of innovations in psychology. Psychoanalysis became widely known through the writings of Sigmund Freud. The first intelligence tests and personality tests were developed. The belief soon took hold that it was possible by means of psychological tests to identify people who were particularly prone to accidents and deny them access to the activities where they were causing accidents. This point of view was predominant in accident research from about 1920 until about 1950.

The pendulum had moved from one extreme to the other. From maintaining that accidents were entirely random, the conventional wisdom now held that accidents were the fault of a few people with some sort of personality disorder. An important finding undermining accident proneness theory was made as early as 1939 by Thomas Forbes (Forbes 1939). He found that most car accidents were caused by ordinary drivers. Although 1% of the drivers were involved in 23% of all accidents during the 1931–33 period, the same 1% of drivers were involved in only 4% of all accidents during the 1934–36 period. Most accidents during the latter period involved drivers who did not have any accidents during the first period. Forbes had actually demonstrated the effects of regression to the mean, although he did not himself use that term to describe his finding.

Growth of mass automobilism in the 1940s and 1950s in the United States, and the attendant growth in the number of accidents, made it clear that road accidents can happen to everybody, not just a few particularly clumsy people. It was felt that the
