We have presented the first application and evaluation of Monte Carlo cross-validation as a method for producing combinations of univariate time series forecasts within a general framework of cross-validation aggregation (crogging). This general framework for combining autoregressive forecasts draws inspiration from bagging. Bagging, which is based on bootstrap resampling, is used for forecast combination as a relatively easy way to improve the accuracy of an existing model and has grown in popularity over the last two decades. Several cross-validation sampling schemes are evaluated for combining forecasts, including 2-fold, 10-fold, leave-one-out and Monte Carlo cross-validation.
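As an illustration of the idea, the following is a minimal Python sketch of Monte Carlo crogging. It is not the implementation used in this study: a linear autoregressive model fitted by least squares stands in for the neural networks, the held-out observations are used only to score each model rather than for early stopping, and the forecasts are combined with an unweighted mean. The function names, the lag order p, the 20% validation fraction and the synthetic series are all assumptions made for the example.

```python
import numpy as np


def make_lagged(y, p):
    """Build an AR design matrix: row j holds y[j..j+p-1], the target is y[j+p]."""
    X = np.column_stack([y[i:len(y) - p + i] for i in range(p)])
    return X, y[p:]


def fit_ar(X, t):
    """Least-squares AR fit with intercept (assumed stand-in for a neural network)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, t, rcond=None)
    return coef


def predict_ar(coef, X):
    """Apply the fitted AR coefficients to one lag vector or a matrix of lag vectors."""
    return coef[0] + X @ coef[1:]


def monte_carlo_crogging(y, p=3, n_models=50, val_fraction=0.2, seed=0):
    """Average one-step-ahead forecasts from models fitted on random train/validation splits."""
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    X, t = make_lagged(y, p)
    n = len(t)
    n_val = max(1, int(round(val_fraction * n)))
    x_last = y[-p:]  # most recent p observations, used to forecast the next value

    forecasts, val_mse = [], []
    for _ in range(n_models):
        # Monte Carlo split: draw the validation set at random, without replacement.
        perm = rng.permutation(n)
        val_idx, train_idx = perm[:n_val], perm[n_val:]
        coef = fit_ar(X[train_idx], t[train_idx])
        val_mse.append(np.mean((t[val_idx] - predict_ar(coef, X[val_idx])) ** 2))
        forecasts.append(predict_ar(coef, x_last))

    forecasts = np.array(forecasts)
    # Combination step: unweighted mean of the individual forecasts.
    return float(forecasts.mean()), forecasts, np.array(val_mse)


# Synthetic monthly-style series (hypothetical data, for illustration only).
rng = np.random.default_rng(1)
y = np.sin(np.arange(120) * 2 * np.pi / 12) + 0.1 * rng.standard_normal(120)
combined, members, val_mse = monte_carlo_crogging(y)
print("combined forecast:", combined)
print("spread of individual forecasts:", members.std())
```

In this sketch the number of combined forecasts (n_models) and the size of each validation set (val_fraction) are set independently of one another.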

Beyond Monte Carlo cross-validation, this study provides evidence that crogging, which is relatively simple to implement, is more effective at reducing the variance of the final forecast and improving accuracy than either bagging with a fixed validation set or neural network model averaging. Where bagging is implemented using the ‘out-of-bag’ observations as the validation set, it performs just as well as Monte Carlo crogging. This is first shown theoretically through a decomposition of the mean squared error into its bias and variance components, which are both estimated through Monte and BagMoob
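For reference, the decomposition referred to here has the following textbook form. The notation is an assumption made for this note and need not match the derivation in the body of the paper: the observation is $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, $\hat{f}(x)$ is a forecast, and the second identity assumes $M$ forecasts with a common variance $v$ and a common pairwise correlation $\rho$.

```latex
\[
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \sigma^2
\]
\[
\operatorname{Var}\Big(\frac{1}{M}\sum_{m=1}^{M}\hat{f}_m(x)\Big)
  = \rho\, v + \frac{1-\rho}{M}\, v
\]
```

The second identity shows why averaging helps: the term $(1-\rho)v/M$ shrinks as more forecasts are combined, while $\rho v$ remains, so the variance reduction is largest when the individual forecasts are weakly correlated.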

Secondly, using competition data consisting of monthly real business time series, it was found that Monte Carlo crogging and BagMoob outperformed k-fold crogging and other benchmark methods, making Monte Carlo crogging a valid and attractive alternative for time series forecasting with autoregressive models.

With the exception of BagFoob, the number of hidden nodes used was found to have no statistically significant impact on forecast accuracy. However, for short series and large combination sizes, k-fold crogging and BagFoob were outperformed by both Monte Carlo crogging and BagMoob. In contrast, for long time series little statistical difference was noted among the four principal methods evaluated. Our results show evidence of better in-sample performance on the validation set by Monte Carlo crogging than by BagMoob; however, this resulted in no improvement in the selection of sample size and hidden node parameters, even when using the validation set. This can be explained by the relative invariance of both methods to both parameters. While Monte Carlo crogging and BagMoob are most effective at larger sampling sizes, k-fold crogging performed poorly, which is attributable to the trade-off between the number of folds and the number of time series observations available for training. As the number of folds increases, and likewise the number of forecasts combined, the size of the validation dataset decreases. This suggests that with fewer observations available for early stopping, model estimation becomes poorer. This appears to be the biggest advantage that the Monte Carlo and bagging with ‘out-of-bag’ methods afford, in that they decouple the number of samples available for forecast model estimation from the size of the training/validation datasets.
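The coupling described above can be made concrete with a little arithmetic. The short Python sketch below compares the validation-set size under k-fold and Monte Carlo sampling; the series length of 60 observations and the 20% validation fraction are hypothetical values chosen for illustration, not settings taken from the experiments.

```python
def kfold_validation_size(n_obs, k):
    """Approximate number of observations in each validation fold of k-fold CV."""
    return n_obs // k


def monte_carlo_validation_size(n_obs, val_fraction):
    """Validation-set size under Monte Carlo CV; it does not depend on the number of splits."""
    return max(1, round(val_fraction * n_obs))


n_obs = 60  # hypothetical short monthly series

# k-fold: combining more forecasts (larger k) leaves fewer validation observations per model.
for k in (2, 10, n_obs):  # k = n_obs corresponds to leave-one-out
    print(f"k-fold, {k:>3} forecasts combined: {kfold_validation_size(n_obs, k)} validation obs")

# Monte Carlo: the number of forecasts combined can grow while the validation size stays fixed.
for n_models in (10, 50, 100):
    size = monte_carlo_validation_size(n_obs, 0.2)
    print(f"Monte Carlo, {n_models:>3} forecasts combined: {size} validation obs")
```

With these assumed values, going from 2 folds to leave-one-out shrinks each validation set from 30 observations to a single one, whereas the Monte Carlo validation set stays at 12 observations however many forecasts are combined.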

While this study has shown that crogging can be used to improve forecasting accuracy when applied to neural networks, further evidence is required using other forecasting methods, e.g. regression, to which crogging can be easily applied. Additionally, an obvious next step would be to investigate the application of crogging beyond autoregressive models to moving average processes, as proposed by Bergmeir and Hyndman (2014) for bagging exponential smoothing methods.

References

Aksu, C., and S. I. Gunter. 1992. An empirical analysis of the accuracy of SA, OLS, ERLS and NRLS combination forecasts. International Journal of Forecasting 8 (1):27-43.

Altman, N. S. 1990. Kernel smoothing of data with correlated errors. Journal of the American Statistical Association 85 (411):749-759.

Arlot, Sylvain, and Alain Celisse. 2010. A survey of cross-validation procedures for model selection. Statistics Surveys 4 (0):40-79.

Assaad, M., R. Bone, and H. Cardot. 2008. A new boosting algorithm for improved time-series forecasting with recurrent neural networks. Information Fusion 9 (1):41-55.

Bates, J. M., and C. W. J. Granger. 1969. Combination of Forecasts. Operational Research Quarterly 20 (4):451-&.

Ben Taieb, Souhaib, and Rob Hyndman. 2014. Boosting multi-step autoregressive forecasts. Paper read at Proceedings of The 31st International Conference on Machine Learning.

Bengio, Yoshua, and Yves Grandvalet. 2004. No unbiased estimator of the variance of k-fold cross-validation. The Journal of Machine Learning Research 5:1089-1105.

Berardi, V. L., and G. P. Zhang. 2003. An empirical investigation of bias and variance in time series forecasting: Modeling considerations and error evaluation. IEEE Transactions on Neural Networks 14 (3):668-679.

Bergmeir, Christoph, and Rob J Hyndman. 2014. Bagging Exponential Smoothing Methods using STL Decomposition and Box-Cox Transformation. Monash University, Department of Econometrics and Business Statistics.

Birgé, Lucien, and Pascal Massart. 2007. Minimal penalties for Gaussian model selection. Probability Theory and Related Fields 138 (1-2):33-73.

Breiman, L. 1984. Classification and regression trees: Wadsworth International Group.

Breiman, L. 1996a. Bagging predictors. Machine Learning 24 (2):123-140.

Breiman, L. 1996b. Out-of-bag estimation. Technical report, Statistics Department, University of California Berkeley, Berkeley CA 94708.

Breiman, L. 2001. Random forests. Machine Learning 45 (1):5-32.

Brenning, A., J. Andrey, and B. Mills. 2011. Indirect modeling of hourly meteorological time series for winter road maintenance. Environmetrics 22 (3):398-408.

Brown, Gavin, Jeremy Wyatt, Rachel Harris, and Xin Yao. 2005. Diversity creation methods: a survey and categorisation. Information Fusion 6 (1):5-20.

Burman, Prabir. 1989. A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods. Biometrika 76 (3):503-514.

Burman, Prabir, Edmond Chow, and Deborah Nolan. 1994. A cross-validatory method for dependent data. Biometrika 81 (2):351-358.

Burman, Prabir, and Deborah Nolan. 1992. Data-dependent estimation of prediction functions. Journal of Time Series Analysis 13 (3):189-207.

Chen, Z., and Y. Yang. 2004. Assessing forecast accuracy measures. In Working Paper.

Clarida, R. H., L. Sarno, M. P. Taylor, and G. Valente. 2003. The out-of-sample success of term structure models as exchange rate predictors: a step beyond. Journal of International Economics 60 (1):61-83.

Clemen, R. T., and R. L. Winkler. 1986. Combining economic forecasts. Journal of Business & Economic Statistics 4 (1):39-46.

Clements, Michael P., and David F. Hendry. 2007. An Overview of Economic Forecasting. In A Companion to Economic Forecasting, edited by Michael P. Clements and David F. Hendry.

Cleveland, Robert B, William S Cleveland, Jean E McRae, and Irma Terpenning. 1990. STL: A seasonal-trend decomposition procedure based on loess. Journal of Official Statistics 6 (1):3-73.

Crone, Sven F., Michèle Hibon, and Konstantinos Nikolopoulos. 2011. Advances in forecasting with neural networks? Empirical evidence from the NN3 competition on time series prediction. International Journal of Forecasting 27 (3):635-660.

de Menezes, L. M., D. W. Bunn, and J. W. Taylor. 2000. Review of guidelines for the use of combined forecasts. European Journal of Operational Research 120 (1):190-204.

Deng, Y. F., X. Jin, and Y. X. Zhong. 2005. Ensemble SVR for prediction of time series. In Proceedings of 2005 International Conference on Machine Learning and Cybernetics, Vols 1-9. New York: IEEE.

Dietterich, Thomas G. 1998. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation 10 (7):1895-1923.

Dietterich, Thomas G. 2000. Ensemble Methods in Machine Learning. In Multiple Classifier Systems: Springer Berlin Heidelberg.

Donate, J. P., P. Cortez, G. G. Sanchez, and A. S. de Miguel. 2013. Time series forecasting using a weighted cross-validation evolutionary artificial neural network ensemble. Neurocomputing 109:27-32.

Efron, Bradley. 1983. Estimating the Error Rate of a Prediction Rule - Improvement on Cross-Validation. Journal of the American Statistical Association 78 (382):316-331.

Efron, Bradley. 1979. 1977 Rietz Lecture - Bootstrap Methods - Another Look at the Jackknife. Annals of Statistics 7 (1):1-26.

Efron, Bradley, and Robert Tibshirani. 1986. Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Statistical Science:54-75.

Efron, Bradley, and Robert Tibshirani. 1993. An Introduction to the bootstrap. London: Chapman and Hall.

Geman, S., E. Bienenstock, and R. Doursat. 1992. Neural networks and the bias variance dilemma. Neural Computation 4 (1):1-58.

Hagan, M.T., H.B. Demuth, and M.H. Beale. 1996. Neural Network Design: Pws Pub.

Hansen, Bruce E., and Jeffrey S. Racine. 2012. Jackknife model averaging. Journal of Econometrics 167 (1):38-46.

Hansen, L. K., and P. Salamon. 1990. Neural Network Ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (10):993-1001.

Hardle, Wolfgang, and James Stephen Marron. 1985. Optimal bandwidth selection in nonparametric regression function estimation. The Annals of Statistics:1465-1481.

Hart, Jeffrey D. 1991. Kernel regression estimation with time series errors. Journal of the Royal Statistical Society. Series B (Methodological):173-187.

Hart, Jeffrey D, and Thomas E Wehrly. 1986. Kernel regression estimation using repeated measurements data. Journal of the American Statistical Association 81 (396):1080-1088.

Hillebrand, E., and M. C. Medeiros. 2010. The Benefits of Bagging for Forecast Models of Realized Volatility. Econometric Reviews 29 (5-6):571-593.

Hornik, K. 1991. Approximation capabilities of multilayer feedforward networks. Neural Networks 4 (2):251-257.

Hornik, K., M. Stinchcombe, and H. White. 1989. Multilayer Feedforward Networks Are Universal Approximators. Neural Networks 2 (5):359-366.

Hu, M. Y., G. Q. Zhang, C. Z. Jiang, and B. E. Patuwo. 1999. A cross-validation analysis of neural network out-of-sample performance in exchange rate forecasting. Decision Sciences 30 (1):197-216.

Hyndman, Rob J., and Anne B. Koehler. 2006. Another look at measures of forecast accuracy. International Journal of Forecasting 22 (4):679-688.

Inoue, A., and L. Kilian. 2008. How useful is bagging in forecasting economic time series? A case study of US consumer price inflation. Journal of the American Statistical Association 103 (482):511-522.

Jose, V. R. R., and R. L. Winkler. 2008. Simple robust averages of forecasts: Some empirical results. International Journal of Forecasting 24 (1):163-169.

Kohavi, Ron. 1995. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2. Montreal, Quebec, Canada: Morgan Kaufmann Publishers Inc.

Kourentzes, Nikolaos, Devon K. Barrow, and Sven F. Crone. 2014. Neural network ensemble operators for time series forecasting. Expert Systems with Applications 41 (9):4235-4244.

Krogh, Anders, and Jesper Vedelsby. 1995. Neural network ensembles, cross validation, and active learning. Advances in neural information processing systems:231-238.

Kunsch, H. R. 1989. The Jackknife and the Bootstrap for General Stationary Observations. Annals of Statistics 17 (3):1217-1241.

Macdonald, R., and I. W. Marsh. 1994. Combining exchange-rate forecasts - what is the optimal consensus measure. Journal of Forecasting 13 (3):313-332.

Makridakis, S., A. Andersen, R. Carbone, R. Fildes, M. Hibon, R. Lewandowski, J. Newton, E. Parzen, and R. Winkler. 1982. The accuracy of extrapolation (time series) methods: Results of a forecasting competition. Journal of Forecasting 1 (2):111-153.

Makridakis, S., and M. Hibon. 2000. The M3-Competition: results, conclusions and implications. International Journal of Forecasting 16 (4):451-476.

Michaelsen, J. 1987. Cross-Validation in Statistical Climate Forecast Models. Journal of Climate and Applied Meteorology 26 (11):1589-1600.

Naftaly, U., N. Intrator, and D. Horn. 1997. Optimal ensemble averaging of neural networks. Network-Computation in Neural Systems 8 (3):283-296.

Opsomer, Jean, Yuedong Wang, and Yuhong Yang. 2001. Nonparametric regression with correlated errors. Statistical Science:134-153.

Perrone, Michael P., and Leon N. Cooper, eds. 1992. When networks disagree: ensemble methods for hybrid neural networks. Edited by R. J. Mammone, Neural Networks for Speech and Image processing: Chapman-Hall.

Picard, Richard R, and R Dennis Cook. 1984. Cross-validation of regression models. Journal of the American Statistical Association 79 (387):575-583.

Shao, J. 1997. An asymptotic theory for linear model selection. Statistica Sinica 7 (2):221-242.

Shao, J. 1993. Linear model selection via cross-validation. Journal of the American Statistical Association 88 (422):486-494.

Sorić, Petar, and Ivana Lolić. 2013. A note on forecasting euro area inflation: leave-h-out cross validation combination as an alternative to model selection. Central European Journal of Operations Research:1-10.

Stock, James H., and Mark W. Watson. 2004. Combination forecasts of output growth in a seven-country data set. Journal of Forecasting 23 (6):405-430.

Stone, M. 1974. Cross-Validatory Choice and Assessment of Statistical Predictions. Journal of the Royal Statistical Society. Series B (Methodological) 36 (2):111-147.

Stone, M. 1977. An Asymptotic Equivalence of Choice of Model by Cross-Validation and Akaike's Criterion. Journal of the Royal Statistical Society. Series B (Methodological) 39 (1):44-47.

Taieb, S. B., Bontempi, G., Atiya, A. F., & Sorjamaa, A. 2012. A review and comparison of strategies for multi-step ahead time series forecasting based on the NN5 forecasting competition. Expert systems with applications, 39(8): 7067-7083.

Taieb, S. B., & Atiya, A. F. 2015. A Bias and Variance Analysis for Multistep-Ahead Time Series Forecasting. IEEE transactions on neural networks and learning systems.

Tashman, J. 2000. Out-of-sample tests of forecasting accuracy: an analysis and review. International Journal of Forecasting 16 (4):437-450.

Terasvirta, T., and H. M. Anderson. 1992. Characterizing nonlinearities in business cycles using smooth transition autoregressive models. Journal of Applied Econometrics 7 (S1):S119-S136.

Timmermann, A. 2006. Forecast combinations. In Handbook of Economic Forecasting, edited by G. Elliott, C. W. J. Granger and A. Timmermann. Amsterdam: Elsevier.

Stock, James H., and Mark W. Watson. 2005. An empirical comparison of methods for forecasting using many predictors. Princeton University.

Wolff, C. C. P. 1987. Time-Varying Parameters and the out-of-Sample Forecasting Performance of Structural Exchange-Rate Models. Journal of Business & Economic Statistics 5 (1):87-97.

Zhang, G. P., and V. L. Berardi. 2001. Time series forecasting with neural network ensembles: an application for exchange rate prediction. Journal of the Operational Research Society 52 (6):652-664.

Zhang, G. Q., B. E. Patuwo, and M. Y. Hu. 1998. Forecasting with artificial neural networks: The state of the art. International Journal of Forecasting 14 (1):35-62.

Zhang, Ping. 1993. Model selection via multifold cross validation. The Annals of Statistics:299-313.

Zhou, Zhi-Hua, Jianxin Wu, and Wei Tang. 2002. Ensembling neural networks: Many could be better than all. Artificial Intelligence 137 (1-2):239-263.
