2.11. Hypothesis verification
2.11.1. Hypothesis test
2.11.1.1. Student's t-test for paired samples
The Mycobacterium species community composition in water samples varied significantly with latitude (r = 0.2, P = 0.005), even when other bioclimatic variables were controlled for (Table 5.9). In concordance with this finding, the CCAs of differences in OTU composition demonstrated that latitude (P = 0.043) was the only significant factor explaining variation (Figure 5.12A), and this remained the case after the data were randomly resampled to 385 sequences per sample (Figure 5.12B). Differences in phylogenetic relatedness between villages illustrated a latitudinal gradient along principal component 1, which explained 21.98% of the variance (Figure 5.13A). This gradient was also observed in the unweighted PCoA of the randomly resampled data (385 sequences per sample), where it explained 17.27% of the variance (Figure 5.13C). While the weighted analysis did not demonstrate a latitudinal gradient, the southern latitude villages did cluster together (Figure 5.13B). Spatial variation was the only significant factor for differences in the Shannon diversity index for water samples, as samples from the south were more diverse (CC = -0.1, P = 0.022) (Table 5.10); however, this explained little of the variation (R2 = 0.1) (Figure 5.14A). The additional diversity estimates also displayed a linear relationship, although only weak correlations were found (Figure 5.14B&C). Latitude was consistently the only factor to explain the variation in Mycobacterium species composition and diversity in water samples, perhaps suggesting mechanisms of dispersal limitation and no apparent influence of environmental factors.
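The Mantel statistic reported above correlates two distance matrices and judges significance by permutation. The sketch below is only an illustration of that idea under stated assumptions: it uses a Pearson correlation, a one-sided permutation test with 999 permutations, and toy values (`latitudes`, `community_score`) that are hypothetical and not taken from the dataset. The software and settings actually used for Table 5.9 are not stated in this section.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Flatten the upper triangle after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, one-sided permutation P value) for two symmetric distance matrices."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    b_flat = upper_triangle(dist_b, identity)
    r_obs = pearson(upper_triangle(dist_a, identity), b_flat)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute the sample labels of one matrix only
        if pearson(upper_triangle(dist_a, perm), b_flat) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


def distance_matrix(values):
    """Absolute pairwise differences; a stand-in for real dissimilarities."""
    return [[abs(a - b) for b in values] for a in values]


# Hypothetical toy data: five sites along a south-to-north transect.
latitudes = [4.7, 6.1, 8.3, 10.2, 12.8]
community_score = [0.9, 0.7, 0.6, 0.3, 0.1]
geo = distance_matrix(latitudes)
comm = distance_matrix(community_score)
r_obs, p_value = mantel(geo, comm)
print(f"Mantel r = {r_obs:.3f}, P = {p_value:.3f}")
```

A partial Mantel test, as used when controlling for a second variable, would correlate the residuals of both matrices after regressing each on the controlling distance matrix; that step is omitted here for brevity.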
Table 5.9. Mantel and partial Mantel tests for the Mycobacterium genus water dataset and the SG water dataset, both comprising 42 water samples. Asterisks mark the relationships that were significant at the P ≤ 0.05 level. P values are Bonferroni corrected (P value multiplied by the number of tests: 5).
Environmental variables          Mycobacterium genus dataset        SG dataset
Effect of:    Controlling for:   r       P value  Corrected P value  r       P value  Corrected P value
Temperature   -                  0.051   0.274    1.370              -0.069  0.813    4.065
Temperature   elevation          0.020   0.375    1.875              -0.012  0.944    4.720
Temperature   pH                 0.046   0.280    1.400              -0.08   0.837    4.185
Temperature   longitude          0.027   0.360    1.800              -0.121  0.97     4.850
Temperature   latitude           0.049   0.260    1.300              -0.073  0.823    4.115
Elevation     -                  0.057   0.250    1.250              0.045   0.271    1.355
Elevation     temperature        0.032   0.334    1.670              0.113   0.089    0.445
Elevation     pH                 0.041   0.298    1.490              0.017   0.385    1.925
Elevation     longitude          0.020   0.393    1.965              -0.029  0.645    3.225
Elevation     latitude           0.033   0.333    1.665              0.023   0.348    1.740
pH            -                  0.120   0.143    0.715              0.207   0.021*   0.105
pH            temperature        0.118   0.132    0.660              0.211   0.020*   0.100
pH            elevation          0.113   0.142    0.710              0.203   0.020*   0.100
pH            longitude          0.115   0.145    0.725              0.201   0.026*   0.130
pH            latitude           0.094   0.173    0.865              0.186   0.037*   0.185
Longitude     -                  0.068   0.188    0.940              0.11    0.065    0.325
Longitude     temperature        0.052   0.278    1.390              0.148   0.029*   0.145
Longitude     elevation          0.041   0.244    1.220              0.104   0.038*   0.190
Longitude     pH                 0.059   0.216    1.080              0.097   0.094    0.470
Longitude     latitude           0.051   0.249    1.245              0.095   0.111    0.555
Latitude      -                  0.204   0.001*   0.005*             0.188   0.001*   0.005*
Latitude      temperature        0.204   0.003*   0.015*             0.189   0.006*   0.030*
Latitude      elevation          0.199   0.001*   0.005*             0.184   0.006*   0.030*
Latitude      pH                 0.190   0.005*   0.025*             0.163   0.009*   0.045*
Latitude      longitude          0.119   0.004*   0.020*             0.18    0.004*   0.020*
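The Bonferroni correction described in the caption of Table 5.9 is a plain multiplication of each raw P value by the number of tests (5). The sketch below reproduces that arithmetic for three raw values taken from the table. The function name `bonferroni` and the optional `cap` argument are illustrative assumptions; the table itself reports uncapped products, which is why some corrected values exceed 1.

```python
def bonferroni(p_values, n_tests, cap=False):
    """Multiply each raw P value by the number of tests.

    With cap=True the result is limited to 1.0, a common convention that
    Table 5.9 does not appear to apply (it lists values such as 1.370).
    """
    corrected = [p * n_tests for p in p_values]
    if cap:
        corrected = [min(1.0, p) for p in corrected]
    return corrected


# Raw P values from Table 5.9 (Mycobacterium genus dataset, no control variable):
# temperature, pH, latitude.
raw = [0.274, 0.143, 0.001]
corrected = bonferroni(raw, n_tests=5)
capped = bonferroni(raw, n_tests=5, cap=True)
print([round(p, 3) for p in corrected])
print([round(p, 3) for p in capped])
```

Only latitude remains below 0.05 after correction in this subset, matching the asterisks in the table.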
Figure 5.12. CCA plots of the Mycobacterium genus water dataset comprising 42 water samples. Villages are denoted by the abbreviation of the village name and a number. (A) CCA plot of all sequences; the variation explained by latitude (R2 = 0.18, P = 0.001) is shown by the arrow. (B) CCA plot of a random resample (385 sequences per sample); the variation was explained by pH (R2 = 0.19, P = 0.036) and latitude (R2 = 0.13, P = 0.039).
Figure 5.13. PCoA plots of the Mycobacterium genus water dataset comprising 42 water samples. Sample points are coloured by latitude; the gradient from red to blue represents south to north latitudes. (A) Unweighted analysis of all sequences. (B) Weighted analysis of all sequences. (C) Unweighted analysis of a random resample (385 sequences per sample).
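The random resampling to a fixed depth (385 sequences per sample) referred to in these figures can be sketched as drawing sequences from each sample's pool without replacement. This is a minimal illustration under assumptions: the function name `rarefy`, the OTU labels, and the counts are hypothetical, and whether the original analysis sampled without replacement is not stated in this section.

```python
import random
from collections import Counter


def rarefy(otu_counts, depth, seed=0):
    """Randomly subsample `depth` sequences, without replacement, from one sample.

    `otu_counts` maps an OTU label to its sequence count in the sample.
    """
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer sequences than the requested depth")
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))


# Hypothetical sample with 805 sequences spread over three OTUs.
sample = {"otu_a": 500, "otu_b": 300, "otu_c": 5}
subsample = rarefy(sample, depth=385)
print(sum(subsample.values()), dict(subsample))
```

Rare OTUs such as `otu_c` may drop out of a subsample entirely, which is one reason resampled and full-data ordinations can differ.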
Table 5.10. Univariate analysis of variables associated with the outcome variable Shannon diversity (H’) of Mycobacterium genus species in 42 water samples (Pseudo R2 = 0.13). Asterisks represent the relationships that were significant at the P <0.05 level. CC = correlation coefficient. CI = Confidence interval.
Variables          Range                 CC (95% CI)                    P value
Elevation (m)      370-3958              0.0001 (-0.0002 to 0.0003)     0.461
Temperature (°C)   11.65-39.13           -0.028 (-0.060 to 0.005)       0.092
pH                 2.95-5.61             0.210 (-0.222 to 0.642)        0.341
Latitude (°N)      4.705306-12.76961     -0.121 (-0.225 to -0.018)      0.022*
Longitude (°E)     34.263-39.87669       0.055 (-0.095 to 0.205)        0.469
Figure 5.14. Linear relationships between diversity of the Mycobacterium genus water dataset (42 samples) (A) The Shannon diversity estimate (H’) (R2 value = 0.128). (B) The fraction of OTU richness (R2 value = 0.127). (C) Phylogenetic Diversity (PD) metric (R2 = 0.012). Models were chosen based on the lowest AIC values.
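For reference, the Shannon diversity estimate (H’) used in these analyses is computed from the relative abundance of each OTU in a sample, H’ = -Σ p_i ln p_i. The sketch below assumes the natural logarithm, which is the common convention but is not confirmed in this section; the counts are hypothetical.

```python
import math


def shannon(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical samples: an even community and one dominated by a single OTU.
even_sample = [10, 10, 10, 10]
skewed_sample = [37, 1, 1, 1]
print(round(shannon(even_sample), 3), round(shannon(skewed_sample), 3))
```

For a fixed number of OTUs, an even community gives the maximum value, ln(number of OTUs), so H’ reflects evenness as well as richness.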
The SG depicted similar biogeographical trends, as latitude (P = 0.005) was the only factor to remain significant after Bonferroni correction and controlling for other factors (Table 5.9). In agreement, the CCA illustrated that latitude was the sole significant factor (P = 0.003), as samples from the northern regions Woldiya and Gonder clustered away from the southern villages (Figure 5.15A). This was consistent with the CCA of the randomly resampled data (950 sequences per sample); however, other factors such as elevation, longitude and pH became significant (Figure 5.15B). The phylogenetic dissimilarities depicted in the PCoA demonstrated a latitudinal gradient for the weighted analysis (Figure 5.16B), but this was not observed in the unweighted analysis of all sequences or of the randomly resampled data (Figure 5.16A&C). This suggests that latitude is particularly important in explaining the variation of abundant SG. Multivariate analysis revealed that both latitude and longitude remained significant after controlling for the other bioclimatic factors for differences in the Shannon diversity estimate (Table 5.11).
The diversity of the SG (R2 = 0.3, P = 0.0001) in water sources displayed a monotonic decrease with latitude, from southern to northern regions, as inferred by the Shannon estimate (H’) (Figure 5.17A). However, no strong correlation was observed between latitude and the other diversity metrics, namely the fraction of OTU richness and the PD metric (Figure 5.17B&C). In concordance with the Mycobacterium genus dataset, the diversity and community composition of SG was influenced mainly by latitude. The analysis suggests that the southern areas have a higher diversity of SG, perhaps due to environmental conditions specific to the south or to historical events causing these spatial differences.
Figure 5.15. CCA plots of the SG water dataset comprising 42 water samples. Villages are denoted by the abbreviation of the village name and a number. (A) CCA plot of all sequences; the variation explained by latitude (r2 = 0.37, P = 0.003) is shown by the arrow. (B) CCA plot of a random resample (950 sequences per sample); the variation was explained by latitude (r2 = 0.35, P = 0.001), elevation (r2 = 0.19, P = 0.006), pH (r2 = 0.21, P = 0.011) and longitude (r2 = 0.15, P = 0.048). Villages highlighted in green and red are southern and northern latitudes respectively.
Figure 5.16. PCoA plots of the SG water dataset comprising 42 water samples. Sample points are coloured by latitude; the gradient from red to blue represents south to north latitudes. (A) Unweighted analysis of all sequences. (B) Weighted analysis of all sequences.
Table 5.11. GLM for the Shannon diversity for the SG water dataset (42 samples). (A) Univariate analysis. (B) Multivariate model of associated variables, pseudo R2 = 0.385. Asterisks represent the relationships that were significant at the P < 0.05 level. CC = correlation coefficient. CI = confidence interval.
A
Variables          Range                 CC (95% CI)                      P value
Elevation (m)      370-3958              -0.0002 (-0.0003 to -0.0001)     0.000*
Temperature (°C)   11.65-39.13           0.017 (0.003 to 0.032)           0.021*
pH                 2.95-5.61             -0.115 (-0.316 to 0.087)         0.264
Latitude (°N)      4.705306-12.76961     -0.091 (-0.129 to -0.053)        0.000*
Longitude (°E)     34.263-39.87669       -0.095 (-0.154 to -0.037)        0.001*
B
Variables          Range                 CC (95% CI)                      P value
Latitude (°N)      4.705306-12.76961     -0.079 (-0.115 to -0.043)        0.000*
Longitude (°E)     34.263-39.87669       -0.055 (-0.098 to -0.012)        0.011*
Figure 5.17. Linear relationships between diversity of the SG water dataset (42 samples) (A) The Shannon diversity estimate (H’) (R2 value = 0.338). (B) The fraction of OTU richness; data points represent the number of different OTUs per sample divided by the total number of different OTUs for all samples (R2 value = 0.057). (C) Phylogenetic Diversity (PD) metric, which takes into account the fraction of total branch length for each sample (R2 = 0.045). Models were chosen based on the lowest AIC values.
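The fraction of OTU richness defined in the caption (number of different OTUs in a sample divided by the total number of different OTUs across all samples) can be sketched as follows. The function name `richness_fraction`, the site labels, and the counts are hypothetical and serve only to illustrate the definition.

```python
def richness_fraction(samples):
    """Per-sample OTU richness divided by the number of distinct OTUs over all samples.

    `samples` maps a sample name to a dict of OTU label -> sequence count.
    OTUs with a zero count are treated as absent.
    """
    all_otus = {
        otu
        for counts in samples.values()
        for otu, n in counts.items()
        if n > 0
    }
    return {
        name: sum(1 for n in counts.values() if n > 0) / len(all_otus)
        for name, counts in samples.items()
    }


# Hypothetical two-site example with five distinct OTUs in total.
samples = {
    "site_1": {"otu_a": 3, "otu_b": 0, "otu_c": 1},
    "site_2": {"otu_b": 2, "otu_d": 5, "otu_e": 1},
}
fractions = richness_fraction(samples)
print(fractions)
```

Because the denominator is shared by all samples, the fraction is a rescaled richness count and, unlike H’, ignores how evenly sequences are distributed among OTUs.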
5.10. The effect of different spatial scales on the variation in species diversity