

6.2.1 How well did SDMs predict plant performance?

I was limited in my choice of SDM by the study species’ traits and the characteristics of the distribution data. Because of the disjunct distributions of the three species, I chose one of the oldest SDM algorithms, BIOCLIM (Nix 1986), a simple envelope or “boxcar” technique. This method avoided under-predicting suitable habitat, at the expense of a high likelihood of over-prediction. Although the cross-validation techniques used in Chapter 2 indicated that the SDM performed well, the IPM revealed substantial model over-prediction on Banks Peninsula. Nevertheless, BIOCLIM ranked sites well enough in accordance with lambda to achieve a satisfactory AUC (> 0.7). It therefore provided a useful but over-generous estimate of suitable climate, and I demonstrated a simple method of eliminating as much of this over-prediction as possible.
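The tendency of an envelope method to over-predict for disjunct distributions can be illustrated with a minimal sketch. This is not the BIOCLIM implementation used in this thesis: the function names, the two climate variables and every value below are hypothetical, and the envelope is assumed to be the full minimum-to-maximum range of each variable.

```python
import numpy as np

def fit_envelope(presence_climate):
    """Per-variable (lower, upper) bounds of climate at presence points."""
    presence_climate = np.asarray(presence_climate, dtype=float)
    return presence_climate.min(axis=0), presence_climate.max(axis=0)

def predict_envelope(site_climate, lower, upper):
    """A site is 'suitable' only if every variable lies inside the envelope."""
    site_climate = np.asarray(site_climate, dtype=float)
    inside = (site_climate >= lower) & (site_climate <= upper)
    return inside.all(axis=1)

# Hypothetical presences in two disjunct clusters.
# Columns: mean annual temperature (deg C), annual rainfall (mm).
presences = np.array([
    [4, 1400], [5, 1500], [6, 1600],   # cool, wet cluster
    [17, 350], [18, 400], [19, 450],   # warm, dry cluster
])
lower, upper = fit_envelope(presences)

# Hypothetical candidate sites.
sites = np.array([
    [12, 900],   # between the clusters, unlike any presence point
    [5, 1500],   # inside the cool, wet cluster
    [25, 900],   # hotter than any presence point
])
predicted = predict_envelope(sites, lower, upper)
print(predicted)
```

The first site is classed as suitable even though it resembles neither cluster, because a single rectangular envelope spans the climatic gap between them. That is the sense in which the method trades under-prediction for over-prediction.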

The relatively poor performance of BIOCLIM raises the question of whether it was, in fact, the best method for modelling the three species. I considered BIOCLIM the best option for a global model of the three species’ distributions, as the disjunct distributions would cause more complex models, e.g. MaxEnt (Phillips et al. 2006), to under-predict potential ranges. Since then, another presence-only method, range-bagging (Drake 2015), has been developed, which might have been superior. Training the SDMs and niche analyses on global data demonstrated that the species’ New Zealand distributions are climatically distinct from their distributions elsewhere, and that the extent of their naturalization in New Zealand could not have been predicted prior to their introduction. The native ranges for all three species therefore told us little about their potential distributions in New Zealand and, in the case of C. orbiculata, led to over-prediction of suitable climate because of the inclusion of frost-tolerant high-elevation populations. In retrospect, a model trained on New Zealand data only (an invasive species distribution model, or iSDM; Václavík & Meentemeyer 2009) could have been informative as a follow-up to the global model. Removing the native ranges from the training data would not sacrifice useful information, as the climatic conditions are so distinct, and would eliminate the problem of disjunct distributions. An iSDM would allow the use of more complex algorithms that are less likely to over-estimate potential distributions (e.g. boosted regression trees, MaxEnt).

As discussed in Chapter 5, it was not possible to determine how much over-prediction stemmed from BIOCLIM itself, versus other sources of error common to all SDM studies. To recap, the other potential sources of error are the quality of occurrence data, the accuracy of climate layers, and scale. Occurrence data may bias models where there are errors in species identification or confusion over taxonomy (Elith et al. 2013), where sampling is biased [though methods exist for minimizing bias, e.g. Dorazio (2014)], or where species are persisting in otherwise unsuitable locations [for example plants in cultivation, or sink populations (Warren 2012)]. Climate layers do not always reflect true site conditions, as they are interpolated from weather station data and are often inaccurate where stations are rare (Niekerk & Joubert 2011). Scale introduces error when occurrence locations, which are usually point data, are not representative of the average conditions in the grid cell. Examples of this are populations persisting in microclimates, or grid cells with high internal climatic variation [e.g. Kriticos et al. (2014)]. Without further investigation this discussion remains speculative, as it is not possible to attribute a proportion of the error to each source. I encourage the consideration of these sources of error in all modelling efforts, whether using complex or simple algorithms.

6.2.2 Improving SDM accuracy

This thesis has highlighted the importance of selecting an SDM algorithm based on species’ traits and occurrence data, as well as performing prior niche analyses to identify disjunct or non-analog distributions. I have demonstrated an effective method of testing SDM projections against plant performance data, and argue for using fundamental absences in place of pseudo or true absences when modelling non-equilibrium species, if possible. However, drawing general conclusions on the accuracy of SDMs, and the relative importance of sources of error, is difficult from this case study alone. More generality could be achieved by repeating the methods described in Chapter 5 for a selection of species, varying the quality of occurrence data, climate layers, grain size and SDM algorithm to identify the relative impact of each source of error on SDM performance. Although other studies have compared the performance of different models and other error sources (Aguirre-Gutiérrez et al. 2013; Syfert et al. 2013; Qiao et al. 2015; Stoklosa et al. 2015), none have used fundamental absences, so it is difficult to determine to what extent reportedly good models are simply overfitting the data. Using fundamental presences and absences to validate SDMs is appealing, but may be difficult to implement widely. To parameterize the population model, predicted vital rates are required as a function of climate (or other variables included in the SDM being tested, such as soil), and this is data-intensive. The cost of gathering the necessary data may be prohibitive, and is at odds with the primary appeal of SDMs, namely, that they are fast and cheap. If a species is well studied, however, it could be possible to gather sufficient information from existing literature to parameterize a simple population model. In this case, more thorough validation of the population model would be required if it was parameterized using data from other regions, but it could be a viable alternative to extensive field trials. Alternatively, other studies have used performance data for one or two vital rates alone to validate models, rather than predicting population growth (Pattison & Mack 2008; Sheppard et al. 2014). Although more cost-effective than developing full population models, performance data are reliable only if the limiting vital rate is known (e.g. germination), and it is difficult to determine where vital rates become limiting without an intuitive suitable/unsuitable binary classification, such as that provided by lambda. Furthermore, a priori assumptions of limiting processes may not always be correct. For example, based on existing literature I incorrectly expected annual internodes to be a good indicator of performance in Aeonium species.
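Validation against fundamental presences and absences can be sketched as follows. This is an illustration under stated assumptions, not the analysis from Chapter 5: the lambda and suitability values are invented, the function name is hypothetical, and a site is assumed to count as a fundamental presence when lambda is at least 1 (a stable or growing population). AUC is computed as the proportion of presence-absence site pairs in which the presence site has the higher SDM score, with ties counted as half.

```python
import numpy as np

def auc_from_scores(scores, is_presence):
    """AUC as the probability that a presence site outscores an absence site."""
    scores = np.asarray(scores, dtype=float)
    is_presence = np.asarray(is_presence, dtype=bool)
    pos = scores[is_presence]
    neg = scores[~is_presence]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one presence and one absence")
    # Compare every presence score with every absence score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (pos.size * neg.size)

# Hypothetical population growth rates (lambda) predicted at six sites.
lam = np.array([1.20, 1.05, 0.98, 0.80, 0.60, 1.10])
# Hypothetical SDM suitability scores at the same six sites.
suitability = np.array([0.9, 0.6, 0.7, 0.3, 0.2, 0.8])

# Assumption: lambda >= 1 defines a fundamental presence.
fundamental_presence = lam >= 1.0
auc = auc_from_scores(suitability, fundamental_presence)
print(round(auc, 3))
```

Because the classes come from lambda rather than from observed occurrences, the score measures how well the SDM ranks sites by demographic suitability, which is the comparison argued for above.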

In lieu of field experiments, surveys of wild populations could provide the necessary data if accidental release of the organism is too risky (e.g. the species is a restricted organism), but only under certain conditions. This approach is not valid for species that are far from equilibrium, such as those in the early stages of invasion (Thuiller et al. 2006; Wilson et al. 2007), and sampling marginal populations is desirable for robust extrapolation (Hargreaves et al. 2014). Identifying marginal populations prior to sampling is, admittedly, difficult. As a final alternative, laboratory-based studies are cheaper and more efficient than field studies, and could provide the necessary parameters for a climate-driven IPM or similar. This would be most practical for fast-growing annual species. However, responses observed in the laboratory may translate poorly to the field. In this scenario, it might be more useful to collect physiological data to parameterize a mechanistic model, though mechanistic models are not always superior to their simpler correlative counterparts due to compounding of error (Buckley et al. 2010).
