Use of Machine Learning (ML)
2.7 Conclusions: Future Outlook and Topics Awaiting Research and Application for Machine Learning (ML)
ML has existed for over 30 years, yet it still awaits wide application in conservation and sustainability, even as ML as a discipline keeps developing rapidly. With the advent of cloud computing, global data availability, and increased computational speed, the use and applications of ML are likely to expand (see OpenModeler for an example: open-modeling.sourceforge.net/ or TensorFlow: https://www.tensorflow.org/). Application of ML to a wider range of problems will not only lead to ecological and management insights, but will also contribute to a better understanding of the role ML methods can play in data exploration, pattern recognition, and robust prediction (Jean et al. 2016). The use of ML to model presence-only data has already received a lot of attention, with interesting applications emerging in the study of disease outbreaks (e.g., Herrick et al. 2014 for a global Avian Influenza model prediction) and for reliably assessing conditions in under-sampled regions (Mullet et al. 2016 for soundscapes). There are many other research possibilities and directions for further development of ML, of which demographic modeling, incorporation of latent variables, ensemble forecasting, tighter integration with GIS software, and the distillation of ‘best practices’ will yield important insights.
Fig. 2.8 “Naive” GLM. While this model is comparable to the “naive” RF model in terms of the RMSE statistic, the underlying spatial structure in the response (stemming from variable “x3”) remained undetected. A Moran’s I of 0.53 for the raw residual values confirmed that considerable unexplained spatial structure remains in the model errors
An important field in ecology that has so far not benefited much from ML applications is demographic modeling. Because the analysis of survival and fecundity is at the core of understanding population trajectories, we would welcome further developments that integrate powerful ML algorithms in models examining the survival probability of animals. Random forest models are beginning to be used in survival-based analyses (Ishwaran 2008; see also www.stat.berkeley.edu/~breiman/RandomForests/), but capture-mark-recapture models with random forests do not exist yet. The survival forests that have been developed are not available for population viability analysis because they focus more on identifying important variables affecting survival than on estimating the probability of survival with a good prediction as the main goal. The discipline of demography is widely held back by software (MARK and its derivatives and practitioners) originating from the 1980s. In ecology, these random survival forests are therefore only useful for ‘known-fate’ datasets, such as satellite-tracking data where the survival state of an animal is known with certainty. We are not yet aware of any ML applications to mark-recapture data that attempt to distinguish between survival and capture probabilities in a predictive context.
Fig. 2.9 “Naive” RF. While this model is comparable to the “naive” GLM model in terms of the RMSE statistic, the underlying spatial structure in the response was well represented. A Moran’s I of 0.17 indicated, however, that some unexplained spatial structure remained
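The captions of Figs. 2.8 and 2.9 quote Moran’s I for model residuals. As a reminder of what that statistic measures, the following is a minimal Python sketch of global Moran’s I, computed as (n / S0) times the weighted sum of cross-products of deviations from the mean, divided by the sum of squared deviations. The one-dimensional transect, the binary adjacent-neighbour weights, and the residual values are illustrative assumptions and are not the data or the weighting scheme behind the figures.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = 0.0      # sum of all off-diagonal weights
    cross = 0.0   # weighted sum of cross-products of deviations
    for i in range(n):
        for j in range(n):
            if i != j:
                s0 += weights[i][j]
                cross += weights[i][j] * dev[i] * dev[j]
    total_sq = sum(d * d for d in dev)
    return (n / s0) * (cross / total_sq)


def transect_weights(n):
    """Hypothetical weights: cells on a line, neighbours are adjacent cells."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1.0
        w[i + 1][i] = 1.0
    return w


# Made-up residuals: one spatially clustered set, one alternating set.
clustered = [2.0, 1.5, 1.0, -1.0, -1.5, -2.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
w = transect_weights(6)

i_clustered = morans_i(clustered, w)      # positive: similar values are neighbours
i_alternating = morans_i(alternating, w)  # negative: neighbours are dissimilar
print(round(i_clustered, 3), round(i_alternating, 3))
```

Values near zero suggest little spatial structure in the residuals, while clearly positive values, as reported for the “naive” GLM, suggest that neighbouring errors resemble each other.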
A similarly worthy application field for ML is distance sampling and spatially explicit mark-recapture models, where the model fit (detection curve and function) and the spatial predictions are currently not obtained with ML and ensemble methods. Based on our experience, the use of ML will change, and usually improve, many of the currently used estimates and the related wildlife management (Huettmann et al. 2011 and Fox et al. 2017 for pelagic seabirds).
Further, ecological latent variable models are another field that may benefit greatly from the incorporation of ML algorithms. These models recognize that detection of wildlife is always imperfect, and provide a framework to estimate detection probability either from ancillary data (e.g. distance sampling: www.ruwpa.st-and.ac.uk/distance/) or from repeated visits (e.g. occupancy analysis with the Presence software: www.mbr-pwrc.usgs.gov/software/presence.html). Approaches to incorporate boosted regression trees into these latent variable models have been developed (Hutchinson et al. 2011), but we would welcome further research to make ML techniques more widely accessible for models that address the problem of imperfect detection.
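To make the imperfect-detection problem concrete, here is a small Python sketch of the standard single-season occupancy logic for repeated visits: a site is occupied with probability psi, and an occupied site is detected on each visit with probability p. The function names and the parameter values (psi = 0.6, p = 0.3, three visits) are hypothetical and chosen only for illustration; the sketch is not the implementation used by the Presence software or by Hutchinson et al. (2011).

```python
def prob_detected(p, visits):
    """Probability of at least one detection at an occupied site."""
    return 1.0 - (1.0 - p) ** visits


def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history (list of 0/1 per visit)."""
    pattern = 1.0
    for y in history:
        pattern *= p if y == 1 else (1.0 - p)
    if any(history):
        # At least one detection: the site must be occupied.
        return psi * pattern
    # Never detected: occupied but missed every time, or truly unoccupied.
    return psi * pattern + (1.0 - psi)


psi, p, visits = 0.6, 0.3, 3   # assumed values, for illustration only

# Expected share of sites with a detection, i.e. the "naive" occupancy
# estimate that ignores imperfect detection.
naive_occupancy = psi * prob_detected(p, visits)

lik_detected = site_likelihood([1, 0, 1], psi, p)
lik_never = site_likelihood([0, 0, 0], psi, p)
print(round(naive_occupancy, 4), round(lik_detected, 4), round(lik_never, 4))
```

Under these assumed values the naive estimate falls well below the true occupancy of 0.6, which is the bias that latent variable models are designed to correct.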
In an earlier review, Clemen (1989) pointed out the strengths of combining predictions from different models, particularly when different forecasting models capture different aspects of the information available for prediction. For this reason, ensemble forecasting is an appealing approach to predict ecological responses in the future or at unsurveyed locations. Recent ecological studies (Araujo and New 2007; Buisson et al. 2009; Jones-Farrand et al. 2011; Hardy et al. 2011; Oppel et al. 2012; Kandel et al. 2015) have usually reaffirmed the performance gains associated with ensemble prediction, and ensembles are becoming easier to implement using R packages such as ‘ssdm’ (Schmitt et al. 2017). As discussed throughout this paper, ML methods should be at the center of any ensemble forecast by virtue of their flexibility in modeling non-linear relationships, as well as their ability to evaluate large, potentially messy data sets. Ensemble models speak to the simple paradigm in ML that “many weak learners make for a strong learner”. The typical hurdles facing frequentist techniques, e.g., having to specify a model structure a priori, continually determining whether model assumptions are met, and evaluating which of a set of candidate models makes “the most sense”, are largely avoided with ML. While a blend of methods could be a reasonable way to proceed (see e.g., Hothorn et al. 2006, 2011), we caution that this may also limit the results that can be obtained (as noted by Breiman 2001a), e.g. when poor performers like Linear Models (LMs) are included in the ensemble (as in the default settings of the BIOMOD R package). At the very least, further research is needed in this area.
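The idea of combining model predictions, and the concern about poor performers diluting an ensemble, can be sketched in a few lines of Python. The example below weights each member by the inverse of its RMSE on validation data, so that a weak model contributes less. This weighting rule, the model labels ("rf", "brt", "glm"), and all numbers are assumptions made for illustration; they do not reproduce the ‘ssdm’ or BIOMOD procedures.

```python
import math


def rmse(pred, obs):
    """Root mean squared error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def ensemble_weights(validation_preds, observed):
    """Weights proportional to 1/RMSE on validation data, summing to 1."""
    # The small floor avoids division by zero for a perfect validation fit.
    inv = {m: 1.0 / max(rmse(p, observed), 1e-12)
           for m, p in validation_preds.items()}
    total = sum(inv.values())
    return {m: v / total for m, v in inv.items()}


def ensemble_predict(new_preds, weights):
    """Weighted average of member predictions at each new location."""
    n = len(next(iter(new_preds.values())))
    return [sum(weights[m] * new_preds[m][i] for m in weights)
            for i in range(n)]


# Made-up validation data and member predictions.
observed = [1.0, 2.0, 3.0, 4.0]
validation_preds = {
    "rf": [1.1, 1.9, 3.2, 3.9],
    "brt": [0.8, 2.3, 2.7, 4.4],
    "glm": [2.0, 2.0, 2.0, 2.0],   # deliberately poor performer
}
weights = ensemble_weights(validation_preds, observed)

# Predictions of each member at three unsurveyed locations (also made up).
new_preds = {
    "rf": [2.4, 3.6, 1.2],
    "brt": [2.6, 3.3, 1.0],
    "glm": [2.0, 2.1, 1.9],
}
combined = ensemble_predict(new_preds, weights)
print(weights, combined)
```

With an unweighted average the poor member would pull every prediction toward its own values; performance-based weights are one simple way to limit that effect, though dropping such members entirely is another option.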
Despite the impressive advantages of ML methods, they remain underutilized in GIS, modeling, and ecological policy arenas (Huettmann 2007b). In the first two cases, the lack of easy and convenient linkages with GIS software may still be obstructing wider adoption of these tools. Maxent does have a GIS interface (www.cs.princeton.edu/~schapire/maxent/; the same is true for BIOMAPPER). However, easily implementable and available code with direct and adjustable links between ArcGIS and ML (Humphries et al. unpublished for Maxent) remains elusive for random forests (Humphries and Huettmann in prep.). Even if the actual model interface exists, obtaining high-quality GIS layers with high spatial resolution still remains a problem (but see Oppel and Huettmann 2010, and Worldclim and similar public data sets in Herrick et al. 2014 made available worldwide). Regarding the third issue of limited application in the policy arena, we argue that ML techniques may suffer from the same difficulties as the more conventional techniques, i.e., a perception that they are too complicated to understand or based on unreasonable assumptions. Buy-in for mathematical tools is generally driven not only by researchers but also by the public at large, by students and, even more so, by whether industry, its attorneys, and the courts are trained and fluent in such methods and actually use them.
While future research will help to clarify how research studies can be designed, executed and promoted to capitalize on ML strengths (see for instance Magness et al. 2008; Lawler et al. 2011), the growing number of impressive ML case studies will help to raise awareness of the usefulness of ML techniques and reduce any prejudices that may exist. Future research synthesizing ML ‘best practices’, as well as relevant ethical issues (Daly 1997; Naess 1997; Czech 2000; Ott 2005), open-minded statistics (Hilborn and Mangel 1997; Strobl et al. 2007; Kelling et al. 2009; Schaub and Kery 2012; Azoulay et al. 2015), data sharing (Bluhm et al. 2010; Huettmann 2011; Zuckerberg et al. 2011) and Open Source, education and outreach, will help to communicate the ways in which ML empirically identifies the key signals in data while simultaneously making no a priori assumptions about the data structure.
Ironically, the main ‘shortcoming and failure’ of ML so far is that it is so widely underutilized and largely unexplored in the natural sciences. It can be argued that this is partially a result of the way the philosophy of science, statistics, and their relationship to quantitative modeling and to ‘nature’ are taught and rewarded in universities and society. As Breiman (2001a) so eloquently expressed, this inevitably influences all sorts of data-based decision making, ranging from publication policies, peer review, and management policy, to courts of law. We encourage high schools, universities and policy-makers to incorporate ML into their standard lecture and lab material for undergraduate and graduate courses to facilitate a better understanding and more rapid adoption of these powerful approaches, as well as a knowledge of best practices (Hochachka et al. 2012; Cushman and Huettmann 2010; Drew et al. 2011).
In summary, we foresee a significant role and relevance for ML in guiding the sustainable science-based management of global biodiversity. It can become the computational standard and benchmark against which other analytical methods are measured. As stated in Huettmann (2007a, b) and elsewhere, now is the time to make the best use of these efficient tools and to share the associated expertise, methods, and general philosophy globally. We believe that a wider use of these methods will necessarily improve wildlife management frameworks, both locally and globally.
Acknowledgements This is a shared MS summarizing work efforts from over 2 decades on international projects. FH is grateful to all individuals who were open-minded enough to develop and try machine learning algorithms and to support them. The late R. O’Connor and A.W. Diamond are thanked for introducing us to CARTs early on. J. Liu kindly helped to start a co-authored model session at IALE-U.S. in 2007 on such subjects, published with Springer. Salford Systems Ltd., D. Steinberg and his great team, are specifically thanked for the long collaboration, for ideas and for helpful support using their thoughts and their software in many ways. Most EWHALE students heroically supported machine learning projects, either helping to evaluate the paradigms of statistics, or putting themselves out there for the debate and advancement of conservation science and management with machine learning; finding new knowledge and information. FH is further grateful to S. Linke, L. Strecker, and to the ArcOD project (B. Bluhm), Alaska GAP project (T. Gotthard), SNAP (N. Fresco et al.), Antarctic Biogeography Atlas project (B. Danis, C. Broyer, Philiippi et al.), Red Panda project (G. Regmi, K. Kamal, MS et al), the Chinese Crane and Bustard projects (G. Yumin and students like H. Juang, M. Chunrong, P. Guopanlian), J. Morton, S. Cushman, J. Evans, T. Hegel, J. Ritter, D. Watts, A. Drew, Y. Wiersma, W. Thogmartin, T. Gottschalk, B. Raymond, B. Walther, I. Presse and H. Berrios for general support, publications, replies, and advice regarding machine learning implementations and applications. This is EWHALE publication # 125.
References
Anderson D, Burnham K (2002) Avoiding pitfalls when using information-theoretic methods. J Wildl Manag 66:912–918
Anderson D, Burnham K, Thompson W (2000) Null hypothesis testing: problems, prevalence, and an alternative. J Wildl Manag 64:912–923
Anderson DR, Link WA, Johnson D, Burnham KP (2001) Suggestions for presenting the results of data analysis. USGS Northern Prairie Wildlife Research Center. Paper 227. https://digitalcommons.unl.edu/usgsnpwrc/227
Archer KJ, Kimes RV (2008) Empirical characterization of random forest variable importance measures. Comput Stat Data Anal 52:2249–2260
Araujo M, New B (2007) Ensemble forecasting of species distributions. Trends Ecol Evol 22:42–47
Arnold TW (2010) Uninformative parameters and model selection using Akaike’s information criterion. J Wildl Manag 74:1175–1178
Azoulay P, Fons-Rosen C, Zivin JSG (2015) Does science advance one funeral at a time? National Bureau of Economic Research Working Paper Series. No. 21788. http://www.nber.org/papers/w21788
Baldwin RA (2009) Use of maximum entropy modeling in wildlife research. Entropy 11:854–866. https://doi.org/10.3390/e11040854
Betts MG, Ganio L, Huso M, Som N, Huettmann F, Bowman J, Wintle BW (2009) Comment on “Methods to account for spatial autocorrelation in the analysis of species distributional data: a review”. Ecography 32:374–378
Bluhm B, Watts D, Huettmann F (2010) Free database availability, metadata and the internet: an example of two high latitude components of the census of marine life. In: Cushman SA, Huettmann F (eds) Spatial complexity, informatics and wildlife conservation. Springer, Tokyo, pp 233–244
Bolker BM, Brooks ME, Clark CJ, Geange SW, Poulsen J, Stevens MHH, White J-SS (2009) Generalized linear mixed models: a practical guide for ecology and evolution. Trends Ecol Evol 24:127–135
Booms T, Huettmann F, Schempf P (2009) Gyrfalcon nest distribution in Alaska based on a predictive GIS model. Pol Biol 33:1602–1612
Booms T, Lindgren M, Huettmann F (2011) Linking Alaska's predicted climate, Gyrfalcon, and ptarmigan distributions in space and time: a unique 200-year perspective. In: Watson RT, Cade TJ, Fuller M, Hunt G, Potapov E (eds) Gyrfalcons and ptarmigan in a changing world, vol I. The Peregrine Fund, Boise, pp 177–190
Boyce MS, Vernier PR, Nielsen SE, Schmiegelow FKA (2002) Evaluating resource selection functions. Ecol Model 157:281–300
Braun CE (ed) (2005) Techniques for wildlife investigations and management. The Wildlife Society (TWS), Bethesda
Breiman L (2001a) Statistical modeling: the two cultures (with comments and a rejoinder by the author). Stat Sci 16:199–231
Breiman L (2001b) Random forests. Mach Learn J 45:5–32
Brewer MJ, Butler A, Cooksley SL (2016) The relative performance of AIC, AICC and BIC in the presence of unobserved heterogeneity. Meth Ecol Evol 7:679–692
Bruijning M, Visser MD, Hallmann CA, Jongejans E (2018) Trackdem: automated particle tracking to obtain population counts and size distributions from videos in R. Meth Ecol Evol 9:965–973. https://doi.org/10.1111/2041-210X.12975
Buechley ER, Şekercioğlu ÇH (2016) The avian scavenger crisis: looming extinctions, trophic cascades, and loss of critical ecosystem functions. Biol Conserv 198:220–228
Buisson L, Thuiller W, Casajus N, Sovan L, Grenouillet G (2009) Uncertainty in ensemble forecasting of species distribution. Glob Chang Biol 16:1145–1157. https://doi.org/10.1111/j.1365-2486.2009.02000.x
Bureau A, Dupuis J, Falls K, Lunetta KL, Hayward B, Keith TP, Van Eerdewegh P (2005) Identifying SNPs predictive of phenotype using random forests. Genet Epidemiol 28:171–182
Burnham K, Anderson D (2002) Model selection and multimodel inference: a practical information-theoretic approach. Springer, New York
Buchanan GM, Lachmann L, Tegetmeyer C, Oppel S, Nelson A, Flade M (2011) Identifying the potential wintering sites of the globally threatened Aquatic Warbler Acrocephalus paludicola using remote sensing. Ostrich 82:81–85. https://doi.org/10.2989/00306525.2011.603461
Buston PM, Elith J (2011) Determinants of reproductive success in dominant pairs of clownfish: a boosted regression tree analysis. J Anim Ecol 80:528–538
Clemen RT (1989) Combining forecasts: a review and annotated bibliography. Int J Forecast 5:559–583
Craig E, Huettmann F (2008) Using “blackbox” algorithms such as TreeNet and Random Forests for data-mining and for finding meaningful patterns, relationships and outliers in complex ecological data: an overview, an example using golden eagle satellite data and an outlook for a promising future. In: Wang H-f (ed) Intelligent data analysis: developing new methodologies through pattern discovery and recovery. IGI Global, Hershey, pp 65–84
Cooper GF, Aliferis CF, Ambrosino R, Aronis J, Buchanan BG, Caruana R, Fine MJ, Glymour C, Gordon G, Hanusa BH et al (1997) An evaluation of machine-learning methods for predicting pneumonia mortality. Artif Intell Med 9:107–138
Crookston NL, Finley AO (2008) yaImpute: an R package for kNN imputation. J Stat Softw 23:1–14
Cushman S, Huettmann F (eds) (2010) Spatial complexity, informatics and wildlife conservation. Springer, Tokyo
Cutler DR, Edwards TC Jr, Beard KH, Cutler A, Hess KT, Gibson J, Lawler JJ (2007) Random forests for classification in ecology. Ecology 88:2783–2792
Czech B (2000) Shoveling fuel for a runaway train: errant economists, shameful spenders, and a plan to stop them all. University of California Press, Berkeley
Daly H (1997) Beyond growth: the economics of sustainable development. Beacon Press, Boston
Dhar V (1998) Data mining in finance: using counterfactuals to generate knowledge from organizational information systems. Inf Syst 23:423–437
De’ath G, Fabricius K (2000) Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 81:3178–3192. https://doi.org/10.1890/0012-9658(2000)081[3178:CARTAP]2.0.CO;2
De’ath G (2002) Multivariate regression trees: a new technique for modeling species–environment relationships. Ecology 83:1105–1117. https://doi.org/10.1890/0012-9658(2002)083[1105:MRTANT]2.0.CO;2
De’ath G (2007) Boosted trees for ecological modeling and prediction. Ecology 88:243–251
Di Minin E, Fink C, Tenkanen H, Hiippala T (2018) Machine learning for tracking illegal wildlife trade on social media. Nat Ecol Evol 2:406–407. https://doi.org/10.1038/s41559-018-0466-x
Dormann CF, McPherson JM, Araújo MB, Bivand R, Bolliger J, Carl G, Davies RG, Hirzel A, Jetz W, Kissling WD (2007) Methods to account for spatial autocorrelation in the analysis of species distributional data: a review. Ecography 30:609–628
Drew CA, Yo W, Huettmann F (eds) (2011) Predictive modeling in landscape ecology. Springer, New York
Edrén SMC, Wisz MS, Teilmann J, Dietz R, Söderkvist J (2010) Modelling spatial patterns in harbour porpoise satellite telemetry data using maximum entropy. Ecography 33:698–708
Elith J, Graham C, NCEAS working group (2006) Novel methods improve prediction of species’ distributions from occurrence data. Ecography 29:129–151
Elith J, Ferrier S, Huettmann F, Leathwick J (2005) The evaluation strip: a new and robust method for plotting predicted responses from species distribution models. Ecol Model 186:280–289
Elith J, Leathwick JR, Hastie T (2008) A working guide to boosted regression trees. J Anim Ecol 77:802–813. https://doi.org/10.1111/j.1365-2656.2008.01390.x
Elith J, Leathwick JR (2009) Species distribution models: ecological explanation and prediction across space and time. Ann Rev Ecol Evol Syst 40:677–697
Elith J, Phillips SJ, Hastie T, Dudík M, En Chee Y, Yates CCJ (2011) A statistical explanation of MaxEnt for ecologists. Div Distrib 17:43–57
Ellis N, Smith SJ, Pitcher JR (2012) Gradient forests: calculating importance gradients on physical predictors. Ecology 93(1):156–168. http://www.esajournals.org/doi/abs/10.1890/0012-9658(2002)083%5B1105:MRTANT%5D2.0.CO%3B2
Evans J, Murphy M, Cushman S, Holden Z (2011) Modeling tree distribution and change using random forests. In: Drew CA, Wiersma Y, Huettmann F (eds) Predictive wildlife and habitat modeling in landscape ecology. Springer Publishers, New York
Fox CH, Huettmann F, Harvey GKA, Morgan KH, Robinson J, Williams R, Paquet PC (2017) Predictions from machine learning ensembles: marine bird distribution and density on Canada’s Pacific coast. Mar Ecol Prog Ser 566:199–216
Jones-Farrand DT, Fearer TM, Thogmartin WE, Thompson FR 3rd, Nelson MD, Tirpak JM (2011) Comparison of statistical and theoretical habitat models for conservation planning: the benefit of ensemble prediction. Ecol Appl 21:2269–2282
Fielding AH, Bell JF (1997) A review of methods for the assessment of prediction errors in conservation presence/absence models. Environ Conserv 24:38–49
Fernandez-Delgado M, Cernades E, Barro S, Amorim D (2014) Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res 15:3133–3181
Fielding AH (1999) Machine learning methods for ecological applications. Springer, New York
Fink D, Hochachka WM, Zuckerberg B, Winkle DW, Shaby B, Munson MA, Hooker G, Riedewald G, Sheldon D, Kelling S (2010) Spatiotemporal exploratory models for broad-scale survey data. Ecol Appl 20:2131–2147
Fortin M-J, Dale MRT, Bertazzon S (2010) Spatial analysis of wildlife distribution and disease spread. In: Huettmann F, Cushman S (eds) Spatial complexity, informatics, and wildlife conservation. Springer, Tokyo, pp 255–273
Friedman JH (2002) Stochastic gradient boosting. Comp Stat Data Anal 38:367–378
Galindo J, Tamayo P (2000) Credit risk assessment using statistical and machine learning: basic methodology and risk modeling applications. Comput Econ 15:107–143
Galipaud M, Gillingham MAF, David M, Dechaume-Moncharmont F-X (2014) Ecologists overestimate the importance of predictor variables in model averaging: a plea for cautious interpretations. Methods Ecol Evol 5:983–991
Garton EO, Ratti JR, Giudice JH (2005) Research and experimental design. In: Braun CE (ed) Techniques for wildlife investigations and management. The Wildlife Society, Bethesda, pp 43–71
Gillies CS, Hebblewhite M, Nielsen SE, Krawchuk M, Aldridge CL, Frair JL, Saher DJ, Stevens CE, Jerde CL (2006) Application of random effects to the study of resource selection by animals. J Anim Ecol 75:887–898
Goldberg DE, Holland JH (1988) Genetic algorithms and machine learning. Mach Learn 3:95–99
Guilford T, Meade J, Willis J, Phillips RA, Boyle D, Roberts S, Collett M, Freeman R, Perrins C (2009) Migration and stopover in a small pelagic seabird, the Manx shearwater Puffinus puffinus: insights from machine learning. Proc R Soc Lond B Biol Sci: rspb 2008.1577
Guthery FS (2008) Statistical ritual versus knowledge accrual in wildlife science. J Wildl Manag 72:1872–1875
Guthery FS, Lusk JJ, Peterson MJ (2001) The fall of the null hypothesis: liabilities and opportunities. J Wildl Manag 65:379–384
Guthery FS, Brennan LA, Peterson MJ, Lusk LL (2005) Information theory in wildlife science: critique and viewpoint. J Wildl Manag 69:457–465
Han X, Huettmann F, Guo Y, Mi C, Wen L (2018) Conservation prioritization with machine learning predictions for the black-necked crane Grus nigricollis, a flagship species on the Tibetan Plateau for 2070. Glob Environ Chang. https://doi.org/10.1007/s10113-018-1336-4
Hardy SM, Lindgren M, Konakanchi H, Huettmann F (2011) Predicting the distribution and ecological niche of unexploited snow crab (Chionoecetes opilio) populations in Alaskan waters: a first open-access ensemble model. Integr Comp Biol 51:608–622. https://doi.org/10.1093/icb/icr102
Harrell FE Jr (2001) Regression modeling strategies: with applications to linear models, logistic regression, and survival analysis. Springer, New York
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction, 2nd edn. Springer, New York
Hastie T, Fithian W (2013) Inference from presence-only data; the ongoing controversy. Ecography 36:864–867
Hegel T, Cushman SA, Evans J, Huettmann F (2010) Chapter 16: Current state of the art for statistical modelling of species distributions. In: Cushman S, Huettmann F (eds) Spatial complexity, informatics and wildlife conservation. Springer, Tokyo, pp 273–312
Hernandez PA, Graham CH, Master LL, Albert D (2006) The effect of sample size and species characteristics on performance of different species distribution modeling methods. Ecography 29:773–785
Herrick KA, Huettmann F, Lindgren MA (2014) A global model of avian influenza prediction in wild birds: the importance of northern regions. Vet Res 44:42. https://doi.org/10.1186/1297-9716-44-42
Hervías S, Henriques A, Oliveira N, Pipa T, Cowen H, Ramos JA, Nogales M, Geraldes P, Silva C, de Ruiz Ybáñez R, Oppel S (2013) Studying the effects of multiple invasive mammals on Cory’s shearwater nest survival. Biol Invasions 15:143–155
Hijmans RJ (2012) Cross-validation of species distribution models: removing spatial sorting bias and calibration with a null model. Ecology 93:679–688
Hilborn R, Mangel M (1997) The ecological detective: confronting models with data. Princeton University Press, Princeton, p 330