Discussion - Machine Learning for Ecology and Sustainable Natural Resource Management


8.4 Discussion

Our study set out to explain the ecological niche of the Siberian crane in summer. It is based on a set of 74 environmental predictors, which is rarely available or employed in wildlife and ecological niche studies (Mi et al. 2017 for applications; see Herrick et al. 2013 for 40 layers, and Sriram and Huettmann unpublished using 104 GIS layers). Further, this is the first large-scale predictive model of Siberian crane distribution in the nesting grounds, as well as for the Siberian crane overall, and specifically for the high Arctic. The use of ‘batteries’ as a machine learning method is also new for this subject and for conservation management overall.

We show that the model with the top univariate predictor (Fig. 8.4) widely overestimates the range, extending beyond known presence locations and into pseudo-absences, when compared with the model that best fits the data as expressed by the performance metric. Our results agree well with studies showing that parsimony fails as a basis for inference (e.g. Guthery et al. 2005; Elith et al. 2006; Arnold 2010; Mi et al. 2017).

The best possible map shows the birds to be mostly coastal, with a few locations along the rivers. The map shows a rather small remaining nesting area, which points to a conservation problem for this species; it is not a widespread species in the high Arctic. Whether this is a new situation or a response to dramatic population declines needs more study.

We show that the so-called ‘kitchen sink model’ (all 74 predictors) in the TreeNet algorithm performs best for the Siberian crane when using a large number of environmental predictors. We believe that this presents a major result and progress for “model selection” as a scientific scheme (Chamberlin 1890; Akaike 1974), which is:

Table 8.4 Visual assessment and rank of predictive performance metrics for the individual battery models

| Model name | Rank | Justification and meaning |
|---|---|---|
| Kitchen sink model | 1 | All presences well fit |
| TMax12 | 9 | Large overestimation of presences |
| BIO14 | 10 | Large overestimation of presences, misses some presences |
| TMax12BIO14 | 8 | Overpredicts and mis-predicts absences |
| Top5 | 7 | Presences predicted very tight, some islands likely overpredicted |
| Top10 | 5 | Presences predicted very tight |
| Top29 | 2 | Decent presence regions |
| Top35 | 4 | Some slight overpredictions |
| Bottom 44 | 6 | Presence points well included, but some wider overprediction areas |
| Leaving out top 3 interacting predictors | 3 | A few overpredictions |

TreeNet (machine learning, boosting), in its default settings and with ‘maximum numbers’ of predictors, performs as one of the best solutions for data mining and predictions (e.g. Elith et al. 2006). However, for learning about model structure, data structure, the data cube, and how individual predictors perform, batteries prove rather insightful. That way we were able to locate additional predictors and predictor arrangements beyond the usual set and learn about their contributions. It made the modeling more informative and robust, and consequently the inference became better for conservation. From our findings one can readily assess and design protected area locations for this species in the breeding grounds.

One of the biggest surprises of this study was probably that the 44 least relevant predictors perform almost as well as the top predictors. This is called predictor swapping, and it has major implications. It calls into question traditional model selection (Burnham and Anderson 2002), the supposed meaning of univariate predictor selection, predictor ranking, and the identification and narrow interpretation of ‘the best’ predictor. Instead of individually ranked and parsimonious predictors (as promoted by Burnham and Anderson 2002), we argue that a set of spatial predictors ‘does the job’ pretty well, if not equally well, when predictions and the inference drawn from them are the goal (as per Breiman 2001). In other words, model fitting alone cannot achieve much (as stated by McArdle 1988 and others); instead, the multivariate perspective is much more powerful, informative and less biased than a univariate, groomed, pre-made selection of predictors (as promoted in Manly et al. 2002 and Silvy 2012 for instance; but compare with McGarigal et al. 2000).
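The idea behind predictor swapping can be illustrated with a small toy example. The sketch below is not the TreeNet analysis of this chapter: the six sites, the four binary predictors and the majority-vote "model" are all made up for illustration. It only shows how several individually weaker predictors can jointly carry the same information as a single top predictor.

```python
from typing import Dict, List, Sequence

# Toy presence (1) / absence (0) labels for six hypothetical sites.
LABELS: List[int] = [0, 0, 0, 1, 1, 1]

# Made-up binary predictors: "top" matches the labels exactly, while each
# "weak" predictor is wrong at one (different) site.
PREDICTORS: Dict[str, List[int]] = {
    "top":   [0, 0, 0, 1, 1, 1],
    "weak1": [1, 0, 0, 1, 1, 1],
    "weak2": [0, 0, 1, 1, 1, 1],
    "weak3": [0, 0, 0, 1, 0, 1],
}


def majority_vote_accuracy(subset: Sequence[str]) -> float:
    """Accuracy of a majority vote over the chosen predictors (stand-in model)."""
    correct = 0
    for site, label in enumerate(LABELS):
        votes = sum(PREDICTORS[name][site] for name in subset)
        prediction = 1 if 2 * votes > len(subset) else 0
        correct += prediction == label
    return correct / len(LABELS)


if __name__ == "__main__":
    # Compare the single best predictor, one weak predictor, and the weak set.
    for subset in (["top"], ["weak1"], ["weak1", "weak2", "weak3"]):
        print(subset, round(majority_vote_accuracy(subset), 3))
```

In this toy setting each weak predictor alone misclassifies one site, yet the three together classify all sites correctly, just as the top predictor does. Real environmental layers are continuous and correlated in more complex ways, so this is an analogy for the mechanism rather than a reproduction of it.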

When it comes to predictions and the inference drawn from them (as per Breiman 2001), batteries show an improvement and increased insight for non-parsimonious solutions (see Fig. 8.4).

Fig. 8.4 Heatmap of the ‘best possible’ Siberian crane prediction, based on the ‘kitchen sink model’ of 74 environmental predictors using TreeNet. Red shows the highest predicted Relative Index of Occurrence (RIO), pink dots show compiled presence locations. For details see Methods

Despite their great promise and potential for ecological insight, battery applications in wildlife conservation are conspicuous by their absence. Here we used only ‘shaving’ from the wide list of batteries available (Fig. 8.5). Table 8.5 shows 28 types of batteries that can be run in SPM, for instance.
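For readers unfamiliar with shaving, its control flow (drop the least important predictor, re-run, repeat) can be sketched in a few lines. This is a minimal, model-agnostic sketch and not SPM's implementation: the `fit` callable, its return signature, and the toy signal strengths for the hypothetical predictors `x1` to `x4` are assumptions made for illustration. In practice `fit` would train a boosted-tree model on the given predictor subset and return its test score and variable importances.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# fit(subset) must return (score, importance-by-predictor) for a model
# trained on that subset. This signature is an assumption of the sketch.
FitFn = Callable[[Sequence[str]], Tuple[float, Dict[str, float]]]
History = List[Tuple[Tuple[str, ...], float]]


def shaving_battery(predictors: Sequence[str], fit: FitFn) -> History:
    """Refit on ever smaller predictor sets, dropping the least important one each round."""
    remaining = list(predictors)
    history: History = []
    while remaining:
        score, importance = fit(remaining)
        history.append((tuple(remaining), score))
        if len(remaining) == 1:
            break
        # Shave the predictor the current model relies on least.
        least = min(remaining, key=lambda name: importance[name])
        remaining.remove(least)
    return history


# Hypothetical stand-in for a real learner: fixed, made-up signal strengths.
TOY_SIGNAL: Dict[str, float] = {"x1": 0.5, "x2": 0.3, "x3": 0.1, "x4": 0.0}


def toy_fit(subset: Sequence[str]) -> Tuple[float, Dict[str, float]]:
    """Pretend model: importance is the fixed signal, score is their sum."""
    importance = {name: TOY_SIGNAL[name] for name in subset}
    return sum(importance.values()), importance


history = shaving_battery(list(TOY_SIGNAL), toy_fit)

if __name__ == "__main__":
    for kept, score in history:
        print(len(kept), kept, round(score, 2))
```

The returned history records the score at every subset size, which is what allows reduced models to be compared against the full predictor set.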

We think that batteries provide a powerful extension of traditional machine learning methods. They provide more insight into the data cube and the selected model. The reader will find almost no relevant wildlife conservation publications on batteries (as listed in Table 8.4); they are currently almost unused for wildlife conservation and climate applications. More study using batteries and their variations (Table 8.4) is recommended. Arguably, this will result in further scrutiny and decline of the AIC argument and of parsimony overall (Guthery et al. 2005; Arnold 2010), instead favoring models that address interactions and yield better predictions (all fully in line with Breiman 2001).

We suggest using batteries as an informative and powerful exploratory tool to learn more about the underlying model structure and predictors for inference. Here, they have provided powerful inference on nesting Siberian cranes in the Russian high Arctic. This species is widely overlooked in international conservation research and can benefit greatly from more large-scale studies for advanced conservation management.

Fig. 8.5 Heatmap of the ‘most parsimonious’ Siberian crane prediction, based on the predictor ‘TMax12’ using TreeNet. A comparison with Fig. 8.4 shows its shortfalls and overprediction. Red shows the highest predicted Relative Index of Occurrence (RIO), pink dots show compiled presence locations. For details see Methods

Table 8.5 Overview of 28 batteries available in SPM; many of them await their testing for wildlife conservation

| Battery name | Explanation (taken from SPM7) | Comment |
|---|---|---|
| AddedVar | Treenet added Var battery | |
| Additive | Moves through the list of predictors, selecting one predictor at a time | Additive models based on machine learning |
| Bootstrap | Repeat with new learn sample (draw) | An additional bootstrap to boosting and bagging (which have versions of bootstrapping implemented) |
| CV | Number of folds in cross-validation | Traditional cross-validation |
| CVBIN | Creates a number of cross-validation, with binning defined by the several discrete variables | A more specific CV |
| CVR | Repeat CV with different random seeds | A further specific CV |
| Datashift | Roll learn and test samples | Rolls through data |
| Draw | Repeat with new learn sample draw (replacement) | Another version of bootstrapping |
| Flip | Reverse roles of learn and test samples | An innovative version of re-sampling |
| Keep | Select predictors at random, run and repeat, may include some required predictors | Random draw of predictors to include |
| Learnrate | Learnrate (for Treenet only) | Learnrate can be |
| LOVO | Drop one predictor and repeat for all in keep list | ‘Mills’ through the entire dataset in a detailed fashion, one-by-one |
| MCT | Model overfitting test via Monte Carlo simulation | An often expressed concern for machine learning, but rarely a problem for bagging, for instance |
| Minchild | Size of smallest allowable terminal node | This is an essential and sensitive test for most tree-based models |
| Nodes | Maximum number of terminal nodes allowed | This allows testing for node depth and, indirectly, for interactions |
| Oneoff | One-predictor model for each predictor in the keep list | Specified univariate model predictor test run |
| Partition | Repeat with new learn, test and holdout samples drawn from the ‘main’ data | An innovative approach to subsampling |
| Pboot | Parametric bootstrap models | Follows parametric assumptions in subsampling |
| Sample | Measure effect of learn sample size on error rate | Assesses subsampling effect |
| Seed | Randomforest seed | This could matter for bagging to find ‘the best’ model |
| Shaving | Drop least important predictor, re-run and repeat | A powerful approach to assess unimportant and important predictors in a multivariate approach |
| Stepwise | Builds model by forward-stepwise selection of predictors | Classic forward step-wise approach to modeling |

(continued)
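Two of the batteries listed in Table 8.5, Oneoff and LOVO, differ only in which predictor subsets they score, and a short sketch may make that contrast concrete. This is an illustrative outline, not SPM code: the `score` callable and the toy "coverage" data (which sites each hypothetical predictor `x1` to `x3` explains) are assumptions. In practice `score` would fit a model on the given subset and return a test-set performance metric.

```python
from typing import Callable, Dict, Sequence, Set

# score(subset) returns one performance number for a model fit on that subset.
ScoreFn = Callable[[Sequence[str]], float]


def oneoff_battery(predictors: Sequence[str], score: ScoreFn) -> Dict[str, float]:
    """Oneoff: score a one-predictor model for each predictor in the list."""
    return {name: score([name]) for name in predictors}


def lovo_battery(predictors: Sequence[str], score: ScoreFn) -> Dict[str, float]:
    """LOVO: drop one predictor at a time and score the model on the rest."""
    return {
        name: score([other for other in predictors if other != name])
        for name in predictors
    }


# Hypothetical toy data: the sites (out of six) that each predictor "explains".
TOY_COVERAGE: Dict[str, Set[int]] = {
    "x1": {0, 1, 2, 3},
    "x2": {2, 3, 4},
    "x3": {5},
}
N_SITES = 6


def toy_score(subset: Sequence[str]) -> float:
    """Stand-in score: fraction of sites explained by at least one predictor."""
    covered: Set[int] = set()
    for name in subset:
        covered |= TOY_COVERAGE[name]
    return len(covered) / N_SITES


oneoff = oneoff_battery(list(TOY_COVERAGE), toy_score)
lovo = lovo_battery(list(TOY_COVERAGE), toy_score)

if __name__ == "__main__":
    print("Oneoff:", oneoff)
    print("LOVO:  ", lovo)
```

Oneoff ranks predictors in isolation, whereas LOVO shows how much the full model loses when a predictor is removed. With overlapping predictors, as in this toy, the two views can differ because the remaining predictors partly compensate for the one that was dropped.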

Acknowledgement We thank Dan Steinberg and Salford Systems Ltd. for a workshop with U.S. IALE at Snowbird, Utah, to introduce us to the power of batteries. FH acknowledges the kind and long collaboration with the Forestry University of Beijing, China, and the use of their data.

U.S. IALE and S. Linke, C. Cambu, H. Hera, H. Berrios Alvarez and the -EWHALE lab- at UAF, are thanked for their support. This is EWHALE lab publication #185.

Appendix 1: Details of 74 GIS Environmental layers Used
