Criterios de clasificación - Fundamentos de Salud Publica 1

4.5.1 Disease clusters and statistical inference

The results of local cluster detection found in this study provide a compelling argument for the spatial analysis of health data. Distinct spatial and spatio-temporal patterns of elevated disease incidence were found, most consistently around George Town and the North Esk region east of Launceston. However, while results clearly show the presence of local disease clusters, the statistical ‘significance’ of these clusters should be viewed with prudence. For example, the Getis Ord Gi* statistic determines cluster significance based on a comparison between disease incidence inside the (1 km) radial search window and the whole population. The resulting ‘significance’ is therefore largely dependent on the chosen size of the search window, the presence of spatial

autocorrelation in the dataset, and both the geographical size and spread of the entire study population (Getis and Ord, 1992). Similarly, the significance of clusters reported with Kulldorff’s spatial scan statistic is highly dependent on the size of the study area. This can be clearly seen in Figures 4.11 and 4.12. As Kulldorff and Nagarwalla (1995) themselves caution, “it is not meaningful to attribute significance to a cluster without reference to the study region." Note also that disease ‘clusters’ identified by both the Gi* statistic and Kulldorff’s spatial scan statistic included both disease cases and non- cases. That is, any given cluster (excluding those of a single point) is likely to contain a number of points with no recorded disease incidence. This requires careful

consideration of the results prior to making assumptions about houses identified as being in a ‘cluster’

Understanding issues of scaling is critical to the effective and accurate use of GIS in public health studies (Sexton et al., 2002, Wheeler, 2007 #135). The importance of defining the study area size is demonstrated in

Table 4-2; here, ‘global’ statistics have been calculated for each defined subregion. Hypothetically, if the ‘Waverly’ subregion represented the entire spatial extent of a study area, global statistics would return a total population disease incidence of 6.48%. It would not be known that disease incidence of this population was markedly higher than that of the population immediately surrounding the study area; and, it is possible that a cluster detection statistic would not find any local disease clusters within the study site. Similarly, the spatial extent of this study included only an area surrounding the Tamar Valley. This region has previously been reported as having the highest rate of “severe, chronic” Asthma in Tasmania (Giles, 1980). If a contemporary cluster detection study were to be done on the state of Tasmania as a whole, it is very likely that there would be different patterns of disease clustering observed within the Tamar Valley than those seen in the current study.

In addition to the disparity in statistical significance between datasets of different spatial extents, results also showed some disagreement between different cluster detection techniques. A comparison of FigureFigure 4.8 (A) andFigure 4.10 shows that the Gi* statistic found ‘significant’ clusters of Asthma on the western side of the Tamar that were not detected by SaTScan. Likewise, vast areas of the eastern Tamar that were included in a ‘significant’ Asthma cluster by SaTScan were not considered significant with the Gi* statistic. This problem has been reported in the literature. Wheeler (2007) commented that the different results returned from different

techniques (SaTScan and kernel intensity function ratio) caused him to wonder “which was the most trustworthy?” Aamodt et al (2006) also found differences between methods and suggest that SaTScan may be more sensitive to clusters of lower relative risk than other methods tested. Researchers are generally warned here not to rely on the results of a single analysis method alone (Sahsuvaroglu and Jerrett, 2007).

The search window used in SaTScan initially centres on a point and then increases in size until (a maximum of) 50% of the population is included (Kulldorff, 1997; Kulldorff, 2006). It would seem then that the size of a resulting cluster would also be dependent on its geographical position relative to the rest of the population. Consider an

elongated study area like the Tamar Valley with a population centre at either end; Launceston in the south has a far greater population than George Town in the north. Conceptually, a scan window around a point in the centre of Launceston would reach a size that incorporates 50% of the population before it reaches the population centre of George Town. As such, it would seem that a cluster in Launceston (which includes approx. 77% of the study population) will be influenced more heavily by the population immediately surrounding it than a cluster in George Town, which could have been formed by a much larger scan window, potentially reaching all the way to northern Launceston. This concept is illustrated in Figure 4.10, where it is seen that a very large cluster was detected in the sparsely populated area on the east of the Tamar, and a very small cluster was found in the city centre at Invermay. SaTScan has elsewhere been found to have difficulty detecting large clusters within large population centres, where the population distribution in the study area is not uniform (Aamodt et al., 2006). Irregularly shaped clusters are also generally more difficult to detect than circular or elliptic ones and the limitation of SaTScan to only incorporate either circular or elliptical search windows has also been noted by other researchers (Aamodt et al., 2006; Wheeler, 2007).

Another factor detracting from the statistical strength of findings in this study is the positional error inherent in the datasets. As detailed in Chapter 3, all address points were randomly moved up to 200 m from their ‘true’ location. Also mentioned was the

fact that the original ‘true’ locations of address points were already subject to some positional error in the Address Points Dataset as a whole (theLIST, 2007). For cluster detection statistics that are dependent on the relative location of disease cases within the population, these spatial errors may have influenced any perceived results.

4.5.2 Hospital admissions as a measure of morbidity

This study defined respiratory disease occurrence by Launceston General Hospital admissions records. This approach influenced the dataset in the following ways. Firstly, hospital admissions represent the ‘top-end’ of respiratory disease occurrence, as only the sickest in the population are admitted to hospital. There is a greater number of people who are treated in the Emergency Department and do not get admitted to hospital, and an even greater number of people who visit a General Practitioner or some other health care provider instead of entering the hospital system. Further still down the ‘health hierarchy’, many more people will increase the use of self medication (e.g. with Asthma inhalers) before seeking any medical assistance. Using only hospital admissions data therefore constricts the size of the possible study population.

Secondly, studying admissions to a public hospital necessarily precludes all respiratory patients who were admitted to private hospitals; this introduces a form of socio-

economic confounding that causes the poorer in society to be over-represented (Giles, 1980). Also, it has been commented that asthmatics living at the edges of a hospital catchment tend to seek medical treatment in their local areas rather than relying on the hospital system (Giles, 1980); thus it seems that those living in remote rural areas may also be under-represented by relying solely on hospital admissions. An advantage in dealing purely with hospital admissions data, however, is that human error is

minimised. ICD-9/10 codes were decided only by a small number of hospital staff from a single hospital over the study period, and therefore diagnostic discrepancies were kept to a minimum.

Fifteen years of accumulated hospital records were analysed simultaneously in this study to address the problems associated with small population studies. However, a change in the international disease coding system occurred in the study period, which had some impact on the way respiratory diseases were classified. The conversion from ICD-9-CM to ICD-10-AM between 1998 and 2000 caused some changes in respiratory disease diagnoses (AIHW, 2000; AIHW, 2001). Most notably, COPD diagnoses

increased considerably as diseases that were previously labelled chronic Asthma or Emphysema are now diagnosed as COPD (Wisconsin Department of Health and Family Services, 2004). As COPD was not included in analyses of ‘total disease’ in this study this effect may not be great. Also, the general decline in Asthma incidence over the study period seen in Figure 4.4 does not seem totally attributable to this change in

diagnosis. Therefore, while some effects of the changeover are noted, they are not considered substantial in this study.

4.5.3 Spatial autocorrelation and confounders

The vast majority of cluster detection studies in the literature define the clustering of disease cases as a departure from Complete Spatial Randomness (CSR) (Aamodt et al., 2006; Abrams and Kleinman, 2007; Durand and Wilson, 2006; Kingham et al., 2006; Kulldorff, 2006; Kulldorff and Nagarwalla, 1995; Sabel and Loytonen, 2004; Wheeler, 2007; Wilson et al., 2004; Wilson et al., 2006; Wilson and Zawar-Reza, 2006). CSR was the underlying assumption in each of the methods employed in this section of the study. However, it has been argued that a null hypothesis of CSR is not appropriate for environmental and social studies which are dealing with heterogenous background populations, or autocorrelation (Goovaerts and Jacquez, 2004; Wakefield and Shaddick, 2006).

While spatial autocorrelation is present in most if not all public health datasets, failure to account for it will further affect the statistical validity of any reported clustering (Goovaerts and Jacquez, 2004; Jerrett and Finkelstein, 2005; Lipton et al., 2005; Maheswaran and Craglia, 2004; Maheswaran and Haining, 2004; Sabel and Loytonen, 2004). Specifically, positive global autocorrelation in a dataset will return false positive results for local clustering (Getis and Ord, 1996; Ord and Getis, 1995). A suggested alternative is to adopt a Neutral model, which takes into account spatial

autocorrelation in the background population and tests the alternative hypothesis as a departure from that pattern (Goovaerts and Jacquez, 2004). Standardised Morbidity Ratios (SMR) can then be calculated based on background age structure and various other confounding factors (Goovaerts and Jacquez, 2004).

As mentioned, this study did not have access to personal or demographic data such as age, gender, socioeconomic status, smoking history and occupation. All of theses are known to influence the results of cluster detection, and most exhibit positive spatial autocorrelation (Carstairs, 2000; Hayes, 2003; Jerrett et al., 2003; Koch and Denike, 2004; Maantay, 2002; Sexton et al., 2002). “Confounding” generally has been

described as “the single most important problem” in studies of cluster detection

(Wartenberg and Greenberg, 1993) in: (Maheswaran and Craglia, 2004), and Rothman (1990) has gone so far as to suggest that it is of limited scientific value to conduct disease cluster analyses at all, given all the limitations of data quality and confounders. However, cluster detection studies are usually the precursor to further investigations (Besag and Newell, 1991) and even in the absence of confounding information, disease mapping and cluster detection contributes valuable information to pre-epidemiological

studies (Maheswaran and Craglia, 2004; Sabel and Loytonen, 2004). Cluster detection techniques were therefore used primarily as an exploratory tool in this study.

In document Fundamentos de Salud Publica 1 (página 96-100)