Conclusion - From Data Mining with Machine Learning to Inference in Diverse and Highly Complex

4.8 Conclusion

Carlson D (2013) Reading and thinking about international polar years: five recent books. Polar Res 32:1–7. https://doi.org/10.3402/polar.v32i0.20789

Chamberlin T (1890) The method of multiple working hypotheses. Reprinted 1965. Science 148:754–759

Chia KS (1997) “Significant-itis”—an obsession with the P-value. Scand J Work Environ Health 23:152–154

Concato J, Hartigan JA (2016) P values: from suggestion to superstition. J Investig Med 64:1166–1171

Conner CD (2005) A People’s history of science: miners, midwives and “low Mechanicks”. Nation Books, New York

Cushman S, Huettmann F (2010) Spatial complexity, informatics and wildlife conservation. Springer, Tokyo, p 448

Cutler DR, Edwards TC Jr, Beard KH, Cutler A, Hess KT, Gibson J, Lawler JJ (2007) Random forests for classification in ecology. Ecology 88:2783–2792

Czech B (2002) Shoveling fuel for a runaway train: errant economists, shameful spenders, and a plan to stop them all. University of California Press, Berkeley

Daly H, Farley J (2010) Ecological economics. Principles and applications, 2nd edn. Island Press, New York

De’ath G, Fabricius K (2000) Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 81:3178–3192. https://doi.org/10.1890/0012-9658(2000)081[3178:CARTAP]2.0.CO;2

Dodds DG (2001) Philosophy and practice of wildlife management, 3rd edn. Krieger Publications, New York

Drew CA, Wiersma YF, Huettmann F (eds) (2011) Predictive species and habitat modeling in landscape ecology. Springer, New York

Elith J, Graham C, NCEAS working group (2006) Novel methods improve prediction of species’ distributions from occurrence data. Ecography 29:129–151

Elder JF (2003) The generalization paradox of ensembles. J Comput Graph Stat 12:853–864

Fernandez-Delgado M, Cernadas E, Barro S, Dinan A (2014) Do we need hundreds of classifiers to solve real world classification problems. J Mach Learn 15:3133–3181

Fidler F, Loftus GR (2009) Why figures with error bars should replace p values: some conceptual arguments and empirical demonstrations. J Psychol 217:27–37. https://faculty.washington.edu/gloftus/Downloads/Fidler.Loftus.pdf

Filliben JJ (1975) The probability plot correlation coefficient test for normality. Technometrics 17:111–117

Fox CH, Huettmann F, Harvey GKA, Morgan KH, Robinson J, Williams R, Paquet PC (2017) Predictions from machine learning ensembles: marine bird distribution and density on Canada’s Pacific coast. Mar Ecol Prog Ser 566:199–216

Freund Y, Schapire RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55:119–139

Friedman JH (2002) Stochastic gradient boosting. Comput Stat Data Anal 38:367–378

Fryxell JM, Caughley G (2014) Wildlife ecology, conservation, and management. Wiley Blackwell, Brisbane

Gardner MA, Altman DG (1986) Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med J 292:746–750

Gelman A (2008) Objections to Bayesian statistics. Bayesian Anal 3:445–450

Georgescu-Roegen N (1971) The entropy law and the economic process. Harvard University Press, Cambridge, MA

Gigerenzer G (2004) Mindless statistics. J Socio Econ 33:587–606

Greenland S (2012) Transparency and disclosure, neutrality and balance: shared values or just shared words? J Epidemiol Community Health 66:967–970

Greenland S, Senn SK, Rothman KJ, Carlin JB, Poole C, Goodman SN, Altman DG (2016) Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol 31:337–350

Guthery FS (2008) Statistical ritual versus knowledge accrual in wildlife science. J Wildl Manag 72:1872–1875

Guthery FS, Lusk JJ, Peterson MJ (2001) The fall of the null hypothesis: liabilities and opportunities. J Wildl Manag 65:379–384

Guthery FS, Brennan LA, Peterson MJ, Lusk LL (2005) Information theory in wildlife science: critique and viewpoint. J Wildl Manag 69:457–465

Han X, Huettmann F, Guo Y, Mi C, Wen L (2018) Conservation prioritization with machine learning predictions for the black-necked crane Grus nigricollis, a flagship species on the Tibetan plateau for 2070. Glob Environ Chang. https://doi.org/10.1007/s10113-018-1336-4

Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction, 2nd edn. Springer, New York

Hilborn R, Mangel M (1997) The ecological detective: confronting models with data. Princeton University Press, Princeton, p 330

Hobbs NT, Hooten M (2015) Bayesian models: a statistical primer for ecologists. Princeton University Press, Princeton

Hochachka W, Caruana R, Fink D, Munson A, Riedewald M, Sorokina D, Kelling S (2007) Data mining for discovery of pattern and process in ecological systems. J Wildl Manag 71:2427–2437

Huettmann F (2005) Databases and science-based management in the context of wildlife and habitat: towards a certified ISO standard for objective decision-making for the global community by using the internet. J Wildl Manag 69:466–472

Huettmann F (2007) Modern adaptive management: adding digital opportunities towards a sustainable world with new values. Forum Public Policy Clim Chang Sustain Dev 3:337–342

Huettmann F (2009) The global need for, and appreciation of, high-quality metadata in biodiversity work. In: Spehn E, Koerner C (eds) Data mining for global trends in mountain biodiversity. CRC Press, Taylor & Francis, pp 25–28

Huettmann F, Artukhin Y, Gilg O, Humphries G (2011) Predictions of 27 Arctic pelagic seabird distributions using public environmental variables, assessed with colony data: a first digital IPY and GBIF open access synthesis platform. Mar Biodivers 41:141–179. https://doi.org/10.1007/s12526-011-0083-2

Johnson CJ, Seip DR (2008) Relationship between resource selection, distribution, and abundance: a test with implications to theory and conservation. Popul Ecol 50:145–157

Kandel K, Huettmann F, Suwal MK, Regmi GR, Nijman V, Nekaris KAI, Lama ST, Thapa A, Sharma HP, Subedi TR (2015) Rapid multi-nation distribution assessment of a charismatic conservation species using open access ensemble model GIS predictions: red panda (Ailurus fulgens) in the Hindu-Kush Himalaya region. Biol Conserv 181:150–161

Lambdin C (2012) Significance tests as sorcery: science is empirical—significance tests are not. Theory Psychol 22:67–90

Loftus GR (1996) Psychology will be a much better science when we change the way we analyze data. Curr Dir Psychol 5:161–171

Magness DR, Huettmann F, Morton JM (2008) Using random forests to provide predicted species distribution maps as a metric for ecological inventory & monitoring programs. pp 209–229. In: Smolinski TG, Milanova MG, Hassanien A-E (eds) Applications of computational intelligence in biology: current trends and open problems. Studies in computational intelligence, vol 122. Springer-Verlag, Berlin/Heidelberg, p 428

Magness DR, Morton JM, Huettmann F, Chapin FS III, McGuire AD (2011) A climate-change adaptation framework to reduce continental-scale vulnerability across conservation reserves. Ecosphere 2:art112. https://doi.org/10.1890/ES11-00200.1

Manly FJ, McDonald LL, Thomas DL, McDonald TL, Erickson WP (2002) Resource selection by animals: statistical design and analysis for field studies, 2nd edn. Kluwer Academic Publishers, Dordrecht

McArdle (1988) The structural relationship: regression in biology. Can J Zool 66:2329–2339

McCullough BC, Wilson B (1999) On the accuracy of statistical procedures in Microsoft Excel 97. Comput Stat Data Anal 31:27–37

McGarigal K, Cushman S, Stafford S (2000) Multivariate statistics for wildlife and ecology research. Springer, New York

Mi C, Huettmann F, Guo Y, Han X, Wen L (2017) Why to choose random forest to predict rare species distribution with few samples in large undersampled areas? Three Asian crane species models provide supporting evidence. PeerJ. https://doi.org/10.7717/peerj.2849

Mueller JP, Massaron L (2016) Machine learning for dummies. For Dummies Publisher, 435 p

Næss A (1989) Ecology, community and lifestyle: outline of an ecosophy (trans: Rothenberg D). Cambridge University Press, Cambridge

O’Connor R (2000) Why ecology lags behind biology. The Scientist 14:35–36

Oppel S, Strobl C, Huettmann F (2009) Alternative methods to quantify variable importance in ecology. Technical report number 65, Department of Statistics. University of Munich

Oppel S, Meirinho A, Ramírez I, Gardner B, O’Connell A, Miller PI, Louzao M (2012) Comparison of five modelling techniques to predict the spatial distribution and abundance of seabirds. Biol Conserv 156:94–104

Pearce J, Ferrier S (2000) Evaluating the predictive performance of habitat models developed using logistic regression. Ecol Model 133:225–245

Perneger TV (1998) What’s wrong with Bonferroni adjustments. Br Med J 316:1236–1238

Popper K (1945) The open society and its enemies. Princeton University Press, Princeton/Oxford

Quinn G, Keough M (2004) Experimental design and data analysis for biologists. Cambridge University Press, Cambridge

Razali N, Wah YB (2011) Power comparisons of Shapiro–Wilk, Kolmogorov–Smirnov, Lilliefors and Anderson–Darling tests. J Stat Model Anal 2:21–33

Reinhart A (2015) Statistics done wrong: the woefully complete guide. No Starch Press, San Francisco

Rexstad EA, Miller D, Flather C, Anderson A, Hupp J, Anderson JR (1988) Questionable multivariate statistical inference in wildlife and community studies. J Wildl Manag 52:794–798

Romesburg HC (1981) Wildlife science: gaining reliable knowledge. J Wildl Manag 45:293–313

Romesburg HC (1991) On improving natural resources and environmental sciences. J Wildl Manag 55:744–756

Rosales J (2008) Economic growth, climate change, biodiversity loss: distributive justice for the global north and south. Conserv Biol 22:1409–1417

Rothman KJ (1990) No adjustments are needed for multiple comparisons. Epidemiology 1:43–46

Salsburg DS (1985) The religion of statistics as practiced in medical journals. Am Stat 39:220–223

Salsburg D (2001) The lady tasting tea: how statistics revolutionized science in the twentieth century. W. H. Freeman and Company, New York

Savalei V, Dunn E (2015) Is the call to abandon p-values the red herring of the replicability crisis? Front Psychol. https://doi.org/10.3389/fpsyg.2015.00245

Sawitzki G (1994a) Testing numerical reliability of data analysis systems. Comput Statist Data Anal 18:269–286

Sawitzki G (1994b) Report on the reliability of data analysis systems. Comput Statist Data Anal (SSN) 18:289–301

Schapire RE (1990) The strength of weak learnability. Machine learning, vol 5. Kluwer Academic Publishers, Boston, MA, pp 197–227. https://doi.org/10.1007/bf00116037

Schapire RE (1992) The design and analysis of efficient learning algorithms. MIT Press, Cambridge, MA

Schapire RE, Singer Y (1999) Improved boosting algorithms using confidence-rated predictors. Mach Learn 37:297–336

Schmidt FL, Hunter JE (2014) Methods of meta-analysis: correcting error and bias in research findings, 3rd edn. Sage Publisher, Thousand Oaks

Silvy NJ (2012) The wildlife techniques manual: research & management, vol 2, 7th edn. The Johns Hopkins University Press

Stang A, Poole C, Kuss O (2010) The ongoing tyranny of statistical significance testing in biomedical research. Eur J Epidemiol 25:225–230

Stanton-Geddes J, Gomes De Freitas C, de Sales Dambros C (2014) In defense of P values: comment on the statistical methods actually used by ecologists. Ecology 95:637–642

Stephens PA, Buskirk SW, Hayward GW, Martinez Del Rio C (2007) A call for statistical pluralism answered. J Appl Ecol 44:461–463. https://doi.org/10.1111/j.1365-2664.2007.01302.x

Strobl C, Boulesteix A-L, Zeileis A, Hothorn T (2007) Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics 8:25. https://doi.org/10.1186/1471-2105-8-25

Swihart R, Slade N (1985) Testing for independence in observations of animal movements. Ecology 66:1176–1184

Thompson B (2004) The “significance” crisis in psychology and education. J Socio Econ 33:607–613

Venables WN, Ripley BD (2002) Modern applied statistical analysis, 4th edn. Springer, New York

Whittingham MJ, Stephens PA, Bradbury RB, Freckleton RP (2006) Why do we still use stepwise modelling in ecology and behaviour? J Anim Ecol 75:1182–1189

Yackulic CB, Chandler R, Zipkin EF, Royle JA, Nichols JD, Campbell Grant EH, Veran S (2012) Presence-only modeling using MAXENT: when can we trust the inferences? Meth Ecol Evol 4:236–243

Yoccoz NG (1991) Use, overuse, and misuse of significance tests in evolutionary biology and ecology. Bull Ecol Soc Am 72:106–111

Zar JH (2010) Biostatistical analysis, 5th edn. Prentice Hall, Upper Saddle River

Ziliak ST, McCloskey DN (2009) The cult of statistical significance. Section on statistical education, pp 2302–2319

Zuckerberg B, Huettmann F, Friar J (2011) Proper data management as a scientific foundation for reliable species distribution modeling. In: Drew CA, Wiersma Y, Huettmann F (eds) Predictive species and habitat modeling in landscape ecology. Springer, New York, pp 45–70

G. R. W. Humphries et al. (eds.), Machine Learning for Ecology and Sustainable Natural Resource Management, https://doi.org/10.1007/978-3-319-96978-7_5
