
2.6.3.1 Assessing the Homoscedasticity Assumption

In assessing our provisional model for non-constant error variance, the Pearson residuals (equation 12) were sorted into groups of 10 according to the order of their corresponding fitted values (Zuur et al. 2011). The mean of the Pearson residuals in each group was then computed and plotted against the fitted values to assess homoscedasticity. This approach to assessing the homoscedasticity assumption in binary logistic regression models, however, tends to work better for large data sets (Zuur et al. 2011).

\[
\varepsilon_i = \frac{Y_i - \hat{\pi}_i}{\sqrt{\hat{\pi}_i (1 - \hat{\pi}_i)}} \tag{12}
\]

where $Y_i$ is the observed response for the $i$th participant, $\hat{\pi}_i$ is the fitted value corresponding to $Y_i$ and $\sqrt{\hat{\pi}_i (1 - \hat{\pi}_i)}$ is the estimated standard deviation of $Y_i$.
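The grouping procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the thesis code (which is not shown in the source): it reads "groups of 10" as consecutive groups of ten observations, and the function names `pearson_residuals` and `grouped_residual_means`, as well as the toy data, are hypothetical.

```python
import math

def pearson_residuals(y, pi_hat):
    """Pearson residual (Y_i - pi_i) / sqrt(pi_i * (1 - pi_i)) per observation."""
    return [(yi - p) / math.sqrt(p * (1.0 - p)) for yi, p in zip(y, pi_hat)]

def grouped_residual_means(y, pi_hat, group_size=10):
    """Sort observations by fitted value, split them into consecutive groups,
    and return (mean fitted value, mean Pearson residual) for each group.

    Assumption: "groups of 10" means ten observations per group.
    """
    resid = pearson_residuals(y, pi_hat)
    order = sorted(range(len(y)), key=lambda i: pi_hat[i])
    summary = []
    for start in range(0, len(order), group_size):
        idx = order[start:start + group_size]
        mean_fit = sum(pi_hat[i] for i in idx) / len(idx)
        mean_res = sum(resid[i] for i in idx) / len(idx)
        summary.append((mean_fit, mean_res))
    return summary

# Toy data (made up for illustration): 30 fitted probabilities and outcomes.
fitted = [0.05 + 0.03 * i for i in range(30)]
observed = [1 if i % 3 == 0 else 0 for i in range(30)]
groups = grouped_residual_means(observed, fitted)
```

Plotting the second element of each pair in `groups` against the first gives the kind of diagnostic plot described in the text; group means that show no systematic trend or funnel shape across the fitted values would be consistent with homoscedasticity.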

2.6.3.2 Assessing the Assumption of Independence

In line with our study hypothesis, we expect the transmission of schistosome infections to assume heterogeneous, rather than random, patterns within each of our study communities. Such non-random patterns are, however, often the result of complex interactions between the measured covariates and specific micro-level factors that act in conjunction with the common risk factors. Therefore, in assessing the effect of these micro-level factors on the validity of our models, certain assumptions were made. These assumptions are discussed in the next two sections.

2.6.3.2.1 Assumptions Regarding the Effects of the Micro-Level Factors

We began by assuming that the micro-level factors were unobserved latent factors specific to each of our study sites. These factors were also thought to have acted in conjunction with the measured covariates to influence exposure patterns in the study communities. Moreover, in order for the collective effects of these micro-level factors to be evident, we further assumed that they were mostly activities that occurred in a socio-temporal context among groups of individuals, rather than discrete activities performed at the individual level.

For instance, the infective stages of the schistosome parasites, the cercariae, are known to exhibit diurnal rhythms in which their peak densities in freshwater habitats occur between noon and 15:00 GMT (Farooq & Mallah 1966). Therefore, human exposure activities that occur within this peak period are bound to be associated with a higher risk of infection. Moreover, since the water bodies that serve as sources of transmission tend to be major sources of livelihood in endemic settings, human exposure patterns may also be linked to the major economic activities specific to these settings.

Therefore, if similar levels of infectivity were assumed across the different points of exposure, then the time of exposure may potentially contribute to the varying risk of infection in the human host population. Hence, we could logically assume that for any given endemic setting, groups of inhabitants involved in the same occupational activities may experience similar risk due to similarities in the time of exposure.

2.6.3.2.2 Assumptions Regarding the Residuals

The residuals of our provisional model represented the variation in the risk of schistosome infections that was not accounted for by the measured covariates (Arlinghaus 1996). Hence, it follows that these residuals represented the effect of the micro-level factors as well as the random noise in the data. Any apparent patterns in these residuals may, therefore, signify the clustering of risk due to the effects of micro-level factors acting in specific parts of any of our study sites. We, therefore, proceeded to investigate heterogeneities in the patterns of schistosome transmission by assessing the standardised point-referenced residuals for spatial autocorrelation.

2.6.3.2.3 Principle of Spatial Autocorrelation

The principle of spatial autocorrelation is based on Tobler’s first law of geography, which states that “everything is related to everything else, but near things are more related than far things” (Waller & Gotway 2004). This principle, therefore, captures the distance decay concept in spatial statistics, where nearby objects are thought to share more similar attributes and hence can no longer be considered independent of each other (Arlinghaus 1996).

In the context of this study, the clustering of risk would be expected to be evident in the point-referenced residuals if the inhabitants of any of our study communities, who also happened to live in close proximity to each other, were exposed to similar micro-level factors. Therefore, using the standardised Pearson residuals, which represented the variation in the risk of infection that was unaccounted for by our provisional model, we proceeded to conduct an assessment for spatial autocorrelation. In the next section, we focus on the classical geostatistical convention for decomposing our provisional model into covariate information and residuals.

2.6.3.2.4 Classical Geostatistical Concepts

Following classical geostatistical convention, residuals ($z_i$) from the various sampled locations, $s$, within our study communities could be regarded as samples of a single realisation of an underlying random and spatially continuous process, $Z$. These observed realisations are therefore used in drawing statistical inferences about the random function, $Z$. The spatial distribution of $Z$ is specified by the mean, $\mu$, and the covariance or variogram, i.e. the first two moments (equation 13) (Gelfand et al. 2010, Waller & Gotway 2004).

\[
\begin{aligned}
E[Z(s)] &= \mu \\
\operatorname{Cov}[Z(s_j), Z(s_k)] &= C(s_j - s_k)
\end{aligned}
\tag{13}
\]

where $C(\cdot)$, the covariance function, measures the spatial autocorrelation between sampled locations $s_j$ and $s_k$. Under the assumption of second-order stationarity for the random function, $Z$, $\mu$ is independent of location and the covariance depends only on the separation distance between $s_j$ and $s_k$ (Waller & Gotway 2004).
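As a brief worked step (a standard geostatistical identity, not stated explicitly in the source), second-order stationarity links the covariance function to the semivariogram used in the next section. Writing $h = s_j - s_k$ for the separation, and using the fact that the variance $C(0)$ is the same at every location:

```latex
\begin{aligned}
\gamma(h) &= \tfrac{1}{2}\operatorname{Var}\left[Z(s+h) - Z(s)\right] \\
          &= \tfrac{1}{2}\left\{\operatorname{Var}[Z(s+h)] + \operatorname{Var}[Z(s)]
             - 2\operatorname{Cov}[Z(s+h), Z(s)]\right\} \\
          &= C(0) - C(h)
\end{aligned}
```

Under this assumption the covariance and the variogram therefore carry the same information: as $C(h)$ decays towards zero with increasing separation, $\gamma(h)$ rises towards the process variance $C(0)$.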

2.6.3.2.5 Variogram Analysis

The variogram, defined by equation 14, is the geostatistical tool for measuring the spatial dependence between the residuals, $z_i$, at sampled locations $s_j$ and $s_j + h$. The choice of appropriate lags, $h$, is conventionally based on the mean distance between pairs of sampled locations (Figure 9) (Myers 1997). Spatial dependence in the process is evident at small spatial lags, $h$, where the residuals, $z_i$, at nearby sampled locations take similar values (Figure 10) (Zuur et al. 2007). The empirical variogram was computed with functions from the R package geoR (Ribeiro Jr & Diggle 2012), using 13 spatial lags with a lag tolerance of ±22.5°.

\[
\gamma(h) = \frac{1}{2p(h)} \sum_{\alpha=1}^{p(h)} \left\{ z_i(s_j) - z_j(s_k) \right\}^2 \tag{14}
\]

where $p(h)$ denotes the number of pairs of Pearson residuals that are separated by $h$, the spatial lag, $h$, is the distance separating any given pair of residuals, and $z_i$ and $z_j$ are the residuals for the $i$th and $j$th participants at locations $s_j$ and $s_k$, respectively.
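To make equation 14 concrete, the following is a minimal Python sketch of an omnidirectional empirical variogram. It is not the geoR routine used in the thesis: the function name `empirical_variogram`, the equal-width binning rule and the toy data are all assumptions made for illustration.

```python
import math

def empirical_variogram(coords, resid, n_lags, max_dist):
    """Return (lag midpoint, semivariance, pair count) for each distance bin.

    Assumption: equal-width bins [0, w), [w, 2w), ..., with pairs further
    apart than max_dist ignored. The thesis does not state its binning rule.
    """
    width = max_dist / n_lags
    sq_sum = [0.0] * n_lags   # running sum of squared residual differences
    count = [0] * n_lags      # p(h): number of pairs falling in each bin
    for j in range(len(coords)):
        for k in range(j + 1, len(coords)):
            d = math.dist(coords[j], coords[k])
            if d > max_dist:
                continue
            # A pair at exactly max_dist is placed in the last bin.
            b = min(int(d / width), n_lags - 1)
            sq_sum[b] += (resid[j] - resid[k]) ** 2
            count[b] += 1
    result = []
    for b in range(n_lags):
        gamma = sq_sum[b] / (2 * count[b]) if count[b] else None
        result.append(((b + 0.5) * width, gamma, count[b]))
    return result

# Toy data (made up): three locations on a line with residuals 1, 2 and 4.
toy = empirical_variogram([(0, 0), (1, 0), (3, 0)], [1.0, 2.0, 4.0],
                          n_lags=2, max_dist=4.0)
```

Calling the function with `n_lags=13` would mirror the number of lags reported in the text; the pair counts in the third element of each tuple are worth inspecting, since bins with few pairs give unstable semivariance estimates.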

Figure 9: A schematic representation of the lag distances, $d_1$ to $d_3$, and the residuals that occur within each lag (adapted from Myers 1997 with permission of the rights holder, John Wiley and Sons).

[Figure 10 image omitted; plot labels: Sill, φ, τ², σ²; horizontal axis: Spatial Lag]

Figure 10: An example of a typical variogram (adapted from Waller & Gotway 2004 with permission of the rights holder, John Wiley and Sons). The variance of the spatial process, Y, is given by the sum of the measurement error variance, $\tau^2$, and the signal variance, $\sigma^2$, whilst $\varphi$ denotes the range within which residuals are spatially correlated.

2.6.3.2.6 Directional Dependence

As discussed in section 2.1 above, the pre-intervention sites also lacked any of the operational components of transmission control. Hence, the patterns of schistosome transmission in those communities could be attributed primarily to the distribution and abundance of the intermediate aquatic molluscan hosts (Brooker 2007). Therefore, if the degree of infectivity were assumed to be constant across all the points of exposure, i.e. the water contact sites, then we could also speculate that the strength of the spatial correlation would be the same in all directions, assuming transmission was autochthonous. In such instances, the omni-directional empirical variogram could ideally be regarded as the mean variogram for all spatial directions.

However, if the degree of infectivity did vary across the different points of exposure, due to site-specific factors that influenced the ecology of the intermediate snail host species, then we could adopt the reasoning of Vounatsou et al. 2009 in arguing that the spatial dependence would be stronger in the direction of the higher transmission points. This reasoning would, however, only hold if the different points of exposure were consistently accessed by the same group of individuals and there was limited interaction in exposure patterns across the different sites (this is investigated in Part III of this thesis).

Though the latter argument seems more plausible, it also suggests a potential violation of the isotropy assumption, which would render the omni-directional empirical variogram inadequate. We, therefore, computed directional variograms to verify the isotropy assumption and correct for anisotropy where necessary.
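The idea behind a directional variogram can be sketched as follows: only pairs of locations whose connecting line lies within an angular tolerance of a chosen direction contribute to the semivariance. This is a hedged illustration rather than the thesis procedure. The function name `directional_variogram`, the convention of measuring angles counter-clockwise from the x-axis, the use of ±22.5° as the angular window and the toy data are all assumptions.

```python
import math

def directional_variogram(coords, resid, direction_deg, tol_deg, n_lags, max_dist):
    """Semivariance per distance bin, using only pairs whose orientation is
    within tol_deg of direction_deg.

    Assumption: angles are measured counter-clockwise from the x-axis and
    treated as axial, so a direction and its opposite are equivalent.
    """
    width = max_dist / n_lags
    sq_sum = [0.0] * n_lags
    count = [0] * n_lags
    target = direction_deg % 180.0
    for j in range(len(coords)):
        for k in range(j + 1, len(coords)):
            dx = coords[k][0] - coords[j][0]
            dy = coords[k][1] - coords[j][1]
            d = math.hypot(dx, dy)
            if d == 0.0 or d > max_dist:
                continue  # co-located pairs have no direction
            angle = math.degrees(math.atan2(dy, dx)) % 180.0
            diff = abs(angle - target)
            diff = min(diff, 180.0 - diff)  # axial angular distance
            if diff > tol_deg:
                continue
            b = min(int(d / width), n_lags - 1)
            sq_sum[b] += (resid[j] - resid[k]) ** 2
            count[b] += 1
    return [((b + 0.5) * width,
             sq_sum[b] / (2 * count[b]) if count[b] else None,
             count[b]) for b in range(n_lags)]

# Toy data (made up): residuals change along x but not along y.
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
values = [0.0, 2.0, 0.0, 2.0]
along_x = directional_variogram(square, values, 0.0, 22.5, 1, 2.0)
along_y = directional_variogram(square, values, 90.0, 22.5, 1, 2.0)
```

In this toy case the semivariance differs between the two directions, which is the signature of anisotropy; roughly equal directional variograms would instead support the isotropy assumption.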

2.6.3.2.7 Statistical Significance of the Spatial Dependency

The position of the isotropic empirical variogram within an envelope of random permutations, computed by randomly allocating the residuals to different sampled locations within the study region, was initially used to assess whether any observed spatial trend in the residuals may have occurred by chance (Diggle & Ribeiro 2007). A formal test for trend by Eagle & Diggle 2012, which is based on a null hypothesis of the absence of spatial autocorrelation, was then employed in computing the test statistic and p-value.
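The permutation envelope can be illustrated with a short Python sketch. It is an assumption-laden illustration, not the thesis implementation: geoR provides `variog.mc.env` for envelopes of this kind, but the source does not say which function was used. The names `semivariances` and `permutation_envelope`, the default of 99 permutations, the seed and the toy grid are hypothetical.

```python
import math
import random

def semivariances(coords, resid, n_lags, max_dist):
    """Semivariance per equal-width distance bin (None for an empty bin)."""
    width = max_dist / n_lags
    sq_sum = [0.0] * n_lags
    count = [0] * n_lags
    for j in range(len(coords)):
        for k in range(j + 1, len(coords)):
            d = math.dist(coords[j], coords[k])
            if d <= max_dist:
                b = min(int(d / width), n_lags - 1)
                sq_sum[b] += (resid[j] - resid[k]) ** 2
                count[b] += 1
    return [s / (2 * c) if c else None for s, c in zip(sq_sum, count)]

def permutation_envelope(coords, resid, n_lags, max_dist, n_perm=99, seed=0):
    """Observed variogram plus the pointwise min/max over variograms obtained
    by randomly reallocating the residuals to the sampled locations."""
    rng = random.Random(seed)
    observed = semivariances(coords, resid, n_lags, max_dist)
    lower = [math.inf] * n_lags
    upper = [-math.inf] * n_lags
    shuffled = list(resid)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any link between value and location
        sim = semivariances(coords, shuffled, n_lags, max_dist)
        for b, g in enumerate(sim):
            if g is not None:
                lower[b] = min(lower[b], g)
                upper[b] = max(upper[b], g)
    return observed, lower, upper

# Toy data (made up): a 5 x 5 grid whose values increase along x.
grid = [(x, y) for x in range(5) for y in range(5)]
trend = [float(x) for x, _ in grid]
observed, lower, upper = permutation_envelope(grid, trend, n_lags=3, max_dist=6.0)
```

An observed semivariance that falls outside the band between `lower` and `upper`, particularly below it at short lags, would suggest more similarity between nearby residuals than random allocation tends to produce. The envelope is a graphical check only and does not replace the formal test described in the text.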

Therefore, a statistically significant spatial autocorrelation was interpreted as signifying residual spatial variation due to the effect of the unobserved micro-level factors. Since ignoring such autocorrelation has the effect of overestimating the models’ precision, the next step in the analysis was to describe the spatial dependency with the appropriate correlation structure and incorporate the correlation into the model. We, however, defer any discussion of the model adaptation to Part II of this thesis.