All of the following analyses were carried out using R version 3.5.1 (R Core Team, 2018).
The plant community data from both the field and GIS sources were reshaped into a community abundance matrix by converting percent cover to relative abundance. At this point, all abundance data were at the spatial resolution of the quadrats (2 m x 2 m). As per the field sampling, each plot comprised nine quadrats. For every plot, the relative abundance of each cover class was calculated as the mean of the values from its nine quadrats. The community abundance data were then processed at both the quadrat (2 m x 2 m) and plot (6 m x 6 m) levels, with 576 and 64 units, respectively (Table 4).
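The reshaping step can be illustrated with a minimal Python sketch (the analyses themselves were run in R, and this is not the study's code). It assumes, hypothetically, that each quadrat is stored as a mapping from cover class to percent cover, that relative abundance is a class's cover divided by the quadrat's total cover, and that a plot is simply a list of its nine quadrats. The function names and the sample values are illustrative only.

```python
def relative_abundance(cover):
    """Convert one quadrat's percent cover per class to relative abundance.

    Assumption: relative abundance = class cover / total cover in the quadrat.
    """
    total = sum(cover.values())
    if total == 0:
        # An empty quadrat contributes zero abundance for every class.
        return {cls: 0.0 for cls in cover}
    return {cls: value / total for cls, value in cover.items()}


def plot_abundance(quadrats):
    """Average relative abundance over all quadrats belonging to one plot.

    A class that is missing from a quadrat counts as zero in that quadrat.
    """
    classes = sorted({cls for q in quadrats for cls in q})
    rel = [relative_abundance(q) for q in quadrats]
    n = len(quadrats)
    return {cls: sum(r.get(cls, 0.0) for r in rel) / n for cls in classes}


# Hypothetical plot: nine quadrats with made-up percent cover values.
example_plot = (
    [{"grass": 60, "sand": 40}] * 6
    + [{"grass": 20, "shrub": 30, "sand": 50}] * 3
)
plot_level = plot_abundance(example_plot)
```

Because every non-empty quadrat sums to 1 after conversion, the plot-level values also sum to 1, which gives a quick sanity check on the reshaped matrix.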
Table 4. Community data were analysed at the quadrat and plot levels, with 4 GIS community datasets (1 for each resolution of the data) being compared against 1 field community dataset at both quadrat and plot levels. Observations = the number of sites in each dataset. Number of ground cover classes = total different classes used in the community analysis. Variation in the number of classes was due to resampling methodology.

Dataset   Description                                                                      Observations   Number of ground cover classes
Veg       Raw field data at the quadrat (2 m x 2 m) level                                  576            29
Veg2      Raw field data at the plot (6 m x 6 m) level                                     64             26
Gdat      GIS-derived community data sampled at 0.3 m at the quadrat (2 m x 2 m) level     576            12
gdatII    GIS-derived community data sampled at 0.3 m at the plot (6 m x 6 m) level        64             12
Gdat2     GIS-derived community data sampled at 0.5 m at the quadrat (2 m x 2 m) level     576            11
Gdat2II   GIS-derived community data sampled at 0.5 m at the plot (6 m x 6 m) level        64             11
Gdat3     GIS-derived community data sampled at 1 m at the quadrat (2 m x 2 m) level       576            12
Gdat3II   GIS-derived community data sampled at 1 m at the plot (6 m x 6 m) level          64             12
Gdat4     GIS-derived community data sampled at 0.1 m at the quadrat (2 m x 2 m) level     576            12
Gdat4II   GIS-derived community data sampled at 0.1 m at the plot (6 m x 6 m) level        64             12
2.4.1 Clustering and Ordinations
Modified TWINSPAN
To compare the composition of the plant communities of Kaitorete Spit between the field data and the GIS data at each of the four spatial resolutions, Modified Two-way Indicator Species Analysis (TWINSPAN; Hill, 1979) was carried out using the twinspanR R package (Zeleny, Smilauer, Hennekens, & Hill, 2016). TWINSPAN is a hierarchical divisive classification technique that is widely used in community ecology (Rolecek, Tichy, Zeleny, & Chytry, 2009). The algorithm places all sites along the first axis of a correspondence analysis, then iteratively divides the sites into two groups using a discriminant function based on the association of particular species with one half or the other (Hill, Bunce, & Shaw, 1975). Modified TWINSPAN builds upon the original algorithm by adding an analysis of cluster heterogeneity prior to each division (Rolecek et al., 2009). TWINSPAN uses the concept of pseudospecies to model quantitatively the qualitative concept of differential species (i.e. species with distinct niche preferences). For each species, its range of relative abundances (as %) is split into a pre-defined set of dummy variables whose cut levels are chosen by the user (Legendre & Legendre, 2012b).
TWINSPAN pseudospecies cut levels were set to 0, 1, 25, 50 and 75 (based on % cover), and the minimum group size was set to 50 and 5 for the quadrat and plot levels, respectively. The maximum number of clusters was set at three based on the results of cluster validation with the R package clValid (Brock, Pihur, Datta, & Datta, 2008), which iteratively validates different numbers of clusters for a given clustering algorithm against a series of validation measures. Because clValid does not support TWINSPAN, a similar divisive clustering algorithm (DIANA) was used in its place. The number of clusters used for each TWINSPAN run was then determined for each dataset from the Objective Function Score of the package RankAggreg (Pihur, Datta, & Datta, 2018), calculated from the clValid results.
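The pseudospecies coding can be illustrated with a short Python sketch using the cut levels given above. It is a simplified illustration rather than the twinspanR implementation: it assumes that a pseudospecies is present when the species occurs at the site and its cover is at least the cut level, and the exact treatment of boundary values in twinspanR may differ. The function name is hypothetical.

```python
CUT_LEVELS = (0, 1, 25, 50, 75)  # percent-cover cut levels stated in the text


def pseudospecies(cover_percent, cut_levels=CUT_LEVELS):
    """Return one 0/1 dummy variable per cut level for one species at one site.

    Assumption: a pseudospecies is present when the species occurs
    (cover > 0) and its cover is at least the cut level.
    """
    return [int(cover_percent > 0 and cover_percent >= cut) for cut in cut_levels]


# Made-up cover values showing how abundance maps onto the dummy variables.
absent = pseudospecies(0)      # species not recorded at the site
sparse = pseudospecies(0.5)    # present, but below 1 % cover
moderate = pseudospecies(30)   # between the 25 % and 50 % cut levels
dominant = pseudospecies(80)   # above the highest cut level
```

Under this coding, a species with higher cover switches on more dummy variables, so abundance differences between sites are retained even though the division itself works on presence/absence data.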
nMDS Ordinations
Using the R package labdsv (Roberts, 2016), a dissimilarity index based on the Bray-Curtis distance was calculated for each of the 10 datasets. In dissimilarity measures, a value of 1 between a pair of objects indicates complete dissimilarity, while a value of 0 indicates an exact match of all descriptors (Legendre & Legendre, 2012a). An asymmetric dissimilarity measure was used, as is typical with species abundances, because of the high chance of “double-zero” occurrences (Ricotta & Podani, 2017). The absence of a given species from a site could have a number of causes, e.g., competitive exclusion by invasive species, herbivore damage or the nature of a high-disturbance environment, such as an exposed dune system like Kaitorete Spit. In addition, with some species or habitat types being confined to small, distinct areas (pers. obs.), the higher rates of local rarity would likely further increase the number of double zeros in the species matrix. Therefore, the absence of a species from a pair of sites cannot be used as a measure of similarity with any confidence, owing to the complexity of the n-dimensional niche of a given species (Borcard, Gillet, & Legendre, 2018). Nonmetric multidimensional scaling (nMDS) was used to visualise the clusters using the R package “vegan” (Oksanen et al., 2018). For the nMDS, the Bray-Curtis dissimilarity index was used with three dimensions and 200 random starts. The nMDS was used as it has several advantages: first, it can use a non-Euclidean distance matrix, such as the Bray-Curtis; second, the data are not assumed to be normally distributed; and last, there is no assumption of a linear relationship between the variables and any underlying gradients (Legendre & Legendre, 2012c). Non-symmetric Procrustes rotations were used to test the ordination similarity between the field and GIS data at both the quadrat and plot levels.
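The double-zero argument can be made concrete with a small Python sketch of the Bray-Curtis dissimilarity in its standard form (sum of absolute differences divided by the sum of all abundances). This is an illustration only, not the labdsv or vegan code, and the site vectors are made up.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("abundance vectors must have the same length")
    total = sum(x) + sum(y)
    if total == 0:
        raise ValueError("dissimilarity is undefined for two empty sites")
    return sum(abs(a - b) for a, b in zip(x, y)) / total


# Hypothetical relative abundances of three cover classes at three sites.
site_a = [0.6, 0.4, 0.0]
site_b = [0.2, 0.4, 0.4]
site_c = [0.0, 0.0, 1.0]

identical = bray_curtis(site_a, site_a)   # no difference in any class
partial = bray_curtis(site_a, site_b)     # some shared cover
disjoint = bray_curtis(site_a, site_c)    # no class in common

# Appending classes that are absent from both sites (double zeros)
# leaves the dissimilarity unchanged.
padded = bray_curtis(site_a + [0.0, 0.0], site_b + [0.0, 0.0])
```

Classes that are absent from both sites add nothing to either the numerator or the denominator, which is why joint absences do not make two sites appear more similar under this index.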
The ordination and clustering results were compared by superimposing the TWINSPAN groups onto the ordination diagrams. The GIS data were compared with the field community data using Procrustes rotation from the R package “vegan”.
Indicator Species Analysis
Indicator species analysis was carried out using the R package “labdsv”. This package provides the function “indval”, which is based on the original equations of Dufrêne and Legendre (1997), with minor changes (Roberts, 2016). The identification of species that indicate or characterise a given habitat or community is a core concept in ecology and biogeography. The strength of an indicator species comes from the degree to which it represents a single group of a typology and from the relative abundance of that species within the sites of a given group.
This duality represents the specificity and fidelity of a species within its environment (Legendre & Legendre, 2012b). The simplified calculation of the Indicator Value for a given species is as follows. For each species $j$ in each cluster of sites $k$, the specificity ($A_{kj}$), a measure of abundance, and the fidelity ($B_{kj}$), a measure of presence, are calculated.
Specificity is defined as:
$$A_{kj} = \frac{N\mathrm{individuals}_{kj}}{N\mathrm{individuals}_{+k}}$$
And fidelity is defined as:
$$B_{kj} = \frac{N\mathrm{sites}_{kj}}{N\mathrm{sites}_{k+}}$$
Therefore: