From Data Mining with Machine Learning to Inference in Diverse and Highly Complex
4.8 Conclusion
Carlson D (2013) Reading and thinking about international polar years: five recent books. Polar Res 32:1–7. https://doi.org/10.3402/polar.v32i0.20789
CODATA
Chamberlin T (1890) The method of multiple working hypotheses. Reprinted 1965. Science 148:754–759
Chia KS (1997) “Significant-itis”—an obsession with the P-value. Scand J Work Environ Health 23:152–154
Concato J, Hartigan JA (2016) P values: from suggestion to superstition. J Investig Med 64:1166–1171
Conner CD (2005) A people’s history of science: miners, midwives and “low Mechanicks”. Nation Books, New York
Cushman S, Huettmann F (2010) Spatial complexity, informatics and wildlife conservation. Springer, Tokyo, 448 p
Cutler DR, Edwards TC Jr, Beard KH, Cutler A, Hess KT, Gibson J, Lawler JJ (2007) Random forests for classification in ecology. Ecology 88:2783–2792
Czech B (2002) Shoveling fuel for a runaway train: errant economists, shameful spenders, and a plan to stop them all. University of California Press, Berkeley
Daly H, Farley J (2010) Ecological economics. Principles and applications, 2nd edn. Island Press, New York
De’ath G, Fabricius K (2000) Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 81:3178–3192. https://doi.org/10.1890/0012-9658(2000)081[3178:CARTAP]2.0.CO;2
Dodds DG (2001) Philosophy and practice of wildlife management, 3rd edn. Krieger Publications, New York
Drew CA, Wiersma Y, Huettmann F (eds) (2011) Predictive species and habitat modeling in landscape ecology. Springer, New York
Elith J, Graham C, NCEAS working group (2006) Novel methods improve prediction of species’ distributions from occurrence data. Ecography 29:129–151
Elder JF (2003) The generalization paradox of ensembles. J Comput Graph Stat 12:853–864
Fernandez-Delgado M, Cernadas E, Barro S, Amorim D (2014) Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res 15:3133–3181
Fidler F, Loftus GR (2009) Why figures with error bars should replace p values: some conceptual arguments and empirical demonstrations. J Psychol 217:27–37. https://faculty.washington.edu/gloftus/Downloads/Fidler.Loftus.pdf
Filliben JJ (1975) The probability plot correlation coefficient test for normality. Technometrics. Am Soc Qual 17:111–117
Fox CH, Huettmann F, Harvey GKA, Morgan KH, Robinson J, Williams R, Paquet PC (2017) Predictions from machine learning ensembles: marine bird distribution and density on Canada’s Pacific coast. Mar Ecol Prog Ser 566:199–216
Freund Y, Schapire RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55:119–139
Friedman JH (2002) Stochastic gradient boosting. Comput Stat Data Anal 38:367–378
Fryxell JM, Caughley G (2014) Wildlife ecology, conservation, and management. Wiley Blackwell, Brisbane
Gardner MJ, Altman DG (1986) Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med J 292:746–750
Gelman A (2008) Objections to Bayesian statistics. Bayesian Anal 3:445–450
Georgescu-Roegen N (1971) The entropy law and the economic process. Harvard University Press, Cambridge, MA
Gigerenzer G (2004) Mindless statistics. J Socio Econ 33:587–606
Greenland S (2012) Transparency and disclosure, neutrality and balance: shared values or just shared words? J Epidemiol Community Health 66:967–970
Greenland S, Senn SK, Rothman KJ, Carlin JB, Poole C, Goodman SN, Altman DG (2016) Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol 31:337–350
Guthery FS (2008) Statistical ritual versus knowledge accrual in wildlife science. J Wildl Manag 72:1872–1875
Guthery FS, Lusk JJ, Peterson MJ (2001) The fall of the null hypothesis: liabilities and opportunities. J Wildl Manag 65:379–384
Guthery FS, Brennan LA, Peterson MJ, Lusk LL (2005) Information theory in wildlife science: critique and viewpoint. J Wildl Manag 69:457–465
Han X, Huettmann F, Guo Y, Mi C, Wen L (2018) Conservation prioritization with machine learning predictions for the black-necked crane Grus nigricollis, a flagship species on the Tibetan plateau for 2070. Glob Environ Chang. https://doi.org/10.1007/s10113-018-1336-4
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction, 2nd edn. Springer, New York
Hilborn R, Mangel M (1997) The ecological detective: confronting models with data. Princeton University Press, Princeton, p 330
Hobbs NT, Hooten M (2015) Bayesian models: a statistical primer for ecologists. Princeton University Press, Princeton
Hochachka W, Caruana R, Fink D, Munson A, Riedewald M, Sorokina D, Kelling S (2007) Data mining for discovery of pattern and process in ecological systems. J Wildl Manag 71:2427–2437
Huettmann F (2005) Databases and science-based management in the context of wildlife and habitat: towards a certified ISO standard for objective decision-making for the global community by using the internet. J Wildl Manag 69:466–472
Huettmann F (2007) Modern adaptive management: adding digital opportunities towards a sustainable world with new values. Forum Public Policy Clim Chang Sustain Dev 3:337–342
Huettmann F (2009) The global need for, and appreciation of, high-quality metadata in biodiversity work. In: Spehn E, Koerner C (eds) Data mining for global trends in mountain biodiversity. CRC Press, Taylor & Francis, pp 25–28
Huettmann F, Artukhin Y, Gilg O, Humphries G (2011) Predictions of 27 Arctic pelagic seabird distributions using public environmental variables, assessed with colony data: a first digital IPY and GBIF open access synthesis platform. Mar Biodivers 41:141–179. https://doi.org/10.1007/s12526-011-0083-2
Johnson CJ, Seip DR (2008) Relationship between resource selection, distribution, and abundance: a test with implications to theory and conservation. Popul Ecol 50:145–157
Kandel K, Huettmann F, Suwal MK, Regmi GR, Nijman V, Nekaris KAI, Lama ST, Thapa A, Sharma HP, Subedi TR (2015) Rapid multi-nation distribution assessment of a charismatic conservation species using open access ensemble model GIS predictions: red panda (Ailurus fulgens) in the Hindu-Kush Himalaya region. Biol Conserv 181:150–161
Lambdin C (2012) Significance tests as sorcery: science is empirical—significance tests are not. Theory Psychol 22:67–90
Loftus GR (1996) Psychology will be a much better science when we change the way we analyze data. Curr Dir Psychol 5:161–171
Magness DR, Huettmann F, Morton JM (2008) Using random forests to provide predicted species distribution maps as a metric for ecological inventory & monitoring programs. pp 209–229. In: Smolinski TG, Milanova MG, Hassanien A-E (eds) Applications of computational intelligence in biology: current trends and open problems. Studies in computational intelligence, vol 122. Springer-Verlag, Berlin/Heidelberg, p 428
Magness DR, Morton JM, Huettmann F, Chapin FS III, McGuire AD (2011) A climate-change adaptation framework to reduce continental-scale vulnerability across conservation reserves. Ecosphere 2:art112. https://doi.org/10.1890/ES11-00200.1
Manly FJ, McDonald LL, Thomas DL, McDonald TL, Erickson WP (2002) Resource selection by animals: statistical design and analysis for field studies, 2nd edn. Kluwer Academic Publishers, Dordrecht
McArdle (1988) The structural relationship: regression in biology. Can J Zool 66:2329–2339
McCullough BC, Wilson B (1999) On the accuracy of statistical procedures in Microsoft Excel 97. Comput Stat Data Anal 31:27–37
McGarigal K, Cushman S, Stafford S (2000) Multivariate statistics for wildlife and ecology research. Springer, New York
Mi C, Huettmann F, Guo Y, Han X, Wen L (2017) Why to choose random forest to predict rare species distribution with few samples in large undersampled areas? Three Asian crane species models provide supporting evidence. PeerJ. https://doi.org/10.7717/peerj.2849
Mueller JP, Massaron L (2016) Machine learning for dummies. For Dummies Publisher, 435 p
Næss A (1989) Ecology, community and lifestyle: outline of an ecosophy (trans: Rothenberg D). Cambridge University Press, Cambridge
O’Connor R (2000) Why ecology lags behind biology. The Scientist 14:35–36
Oppel S, Strobl C, Huettmann F (2009) Alternative methods to quantify variable importance in ecology. Technical report number 65, Department of Statistics, University of Munich
Oppel S, Meirinho A, Ramírez I, Gardner B, O’Connell A, Miller PI, Louzao M (2012) Comparison of five modelling techniques to predict the spatial distribution and abundance of seabirds. Biol Conserv 156:94–104
Pearce J, Ferrier S (2000) Evaluating the predictive performance of habitat models developed using logistic regression. Ecol Model 133:225–245
Perneger TV (1998) What’s wrong with Bonferroni adjustments. Br Med J 316:1236–1238
Popper K (1945) The open society and its enemies. Princeton University Press, Princeton/Oxford
Quinn G, Keough M (2004) Experimental design and data analysis for biologists. Cambridge University Press, Cambridge
Razali N, Wah YB (2011) Power comparisons of Shapiro–Wilk, Kolmogorov–Smirnov, Lilliefors and Anderson–Darling tests. J Stat Model Anal 2:21–33
Reinhart A (2015) Statistics done wrong: the woefully complete guide. No Starch Press, San Francisco
Rexstad EA, Miller D, Flather C, Anderson A, Hupp J, Anderson JR (1988) Questionable multivariate statistical inference in wildlife and community studies. J Wildl Manag 52:794–798
Romesburg HC (1981) Wildlife science: gaining reliable knowledge. J Wildl Manag 45:293–313
Romesburg HC (1991) On improving natural resources and environmental sciences. J Wildl Manag 55:744–756
Rosales J (2008) Economic growth, climate change, biodiversity loss: distributive justice for the global north and south. Conserv Biol 22:1409–1417
Rothman KJ (1990) No adjustments are needed for multiple comparisons. Epidemiology 1:43–46
Salsburg DS (1985) The religion of statistics as practiced in medical journals. Am Stat 39:220–223
Salsburg D (2001) The lady tasting tea: how statistics revolutionized science in the twentieth century. W. H. Freeman and Company, New York
Savalei V, Dunn E (2015) Is the call to abandon p-values the red herring of the replicability crisis? Front Psychol. https://doi.org/10.3389/fpsyg.2015.00245
Sawitzki G (1994a) Testing numerical reliability of data analysis systems. Comput Statist Data Anal 18:269–286
Sawitzki G (1994b) Report on the reliability of data analysis systems. Comput Statist Data Anal (SSN) 18:289–301
Schapire RE (1990) The strength of weak learnability. Machine learning, vol 5. Kluwer Academic Publishers, Boston, MA, pp 197–227. https://doi.org/10.1007/bf00116037
Schapire RE (1992) The design and analysis of efficient learning algorithms. MIT Press, Cambridge, MA
Schapire RE, Singer Y (1999) Improved boosting algorithms using confidence-rated predictors. Mach Learn 37:297–336
Schmidt FL, Hunter JE (2014) Methods of meta-analysis: correcting error and bias in research findings, 3rd edn. Sage Publisher, Thousand Oaks
Silvy NJ (2012) The wildlife techniques manual: research & management, vol 2, 7th edn. The Johns Hopkins University Press
Stang A, Poole C, Kuss O (2010) The ongoing tyranny of statistical significance testing in biomedical research. Eur J Epidemiol 25:225–230
Stanton-Geddes J, Gomes De Freitas C, de Sales Dambros C (2014) In defense of P values: comment on the statistical methods actually used by ecologists. Ecology 95:637–642
Stephens PA, Buskirk SW, Hayward GW, Martinez Del Rio C (2007) A call for statistical pluralism answered. J Appl Ecol 44:461–463. https://doi.org/10.1111/j.1365-2664.2007.01302.x
Strobl C, Boulesteix A-L, Zeileis A, Hothorn T (2007) Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics 8:25. https://doi.org/10.1186/1471-2105-8-25
Swihart R, Slade N (1985) Testing for independence in observations of animal movements. Ecology 66:1176–1184
Thompson B (2004) The “significance” crisis in psychology and education. J Socio Econ 33:607–613
Venables WN, Ripley BD (2002) Modern applied statistical analysis, 4th edn. Springer, New York
Whittingham MJ, Stephens PA, Bradbury RB, Freckleton RP (2006) Why do we still use stepwise modelling in ecology and behaviour? J Anim Ecol 75:1182–1189
Yackulic CB, Chandler R, Zipkin EF, Royle JA, Nichols JD, Campbell Grant EH, Veran S (2012) Presence-only modeling using MAXENT: when can we trust the inferences? Meth Ecol Evol 4:236–243
Yoccoz NG (1991) Use, overuse, and misuse of significance tests in evolutionary biology and ecology. Bull Ecol Soc Am 72:106–111
Zar JH (2010) Biostatistical analysis, 5th edn. Prentice Hall, Upper Saddle River
Ziliak ST, McCloskey DN (2009) The cult of statistical significance. Section on statistical education, pp 2302–2319
Zuckerberg B, Huettmann F, Friar J (2011) Proper data management as a scientific foundation for reliable species distribution modeling. In: Drew CA, Wiersma Y, Huettmann F (eds) Predictive species and habitat modeling in landscape ecology. Springer, New York, pp 45–70
© Springer Nature Switzerland AG 2018
G. R. W. Humphries et al. (eds.), Machine Learning for Ecology and Sustainable Natural Resource Management, https://doi.org/10.1007/978-3-319-96978-7_5