
As a logical extension of the weighted average of forecasts, Granger and Ramanathan (1984) proposed using ordinary least squares (OLS) regression with the actual variable as the dependent variable and the forecasts as the predictor variables. They argued that constraining the weights to sum to one was unnecessarily restrictive and suggested that OLS regression would ensure a better fit, which would presumably result in better forecasting performance.
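The regression-based combination described above can be sketched in a few lines. This is a minimal illustration, not the procedure used in the cited work: the function names (`solve_linear_system`, `ols_combine`, `combined_forecast`) and the toy data are assumptions introduced here, and the coefficients are obtained by solving the normal equations directly, which is adequate only for small, well-conditioned problems.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the inputs are left unmodified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: forecasts may be perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def ols_combine(actual, forecasts):
    """Regress actual on the forecasts; return [intercept, w_1, ..., w_k].

    forecasts is a list of k series, each as long as actual.
    The weights are unconstrained: they need not sum to one.
    """
    rows = [[1.0] + [f[t] for f in forecasts] for t in range(len(actual))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, actual)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def combined_forecast(coefs, new_forecasts):
    """Apply fitted coefficients to one new set of individual forecasts."""
    return coefs[0] + sum(w * f for w, f in zip(coefs[1:], new_forecasts))


# Toy data (hypothetical): the actual series is an exact linear
# combination of two forecasts whose weights do not sum to one.
f1 = [10.0, 12.0, 15.0, 11.0, 14.0, 18.0]
f2 = [11.0, 11.0, 16.0, 13.0, 13.0, 17.0]
actual = [1.0 + 0.6 * a + 0.3 * b for a, b in zip(f1, f2)]

coefs = ols_combine(actual, [f1, f2])
```

On this toy data the fitted weights sum to roughly 0.9 rather than one, which is the kind of solution that a constrained weighted average would rule out.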

OLS regression with stepping routines to extract the optimal subset of predictor variables is an effective and efficient technique for achieving model parsimony, a typical feature of models with superior out-of-sample forecasts (Diebold, 1998). Stepping routines serve the important function of eliminating non-significant predictors and a non-significant intercept from the final OLS regression model, which is essential for deriving the optimal linear combination of the forecasts to be consolidated (Clemen, 1986).

Forecasts that are excluded from the final consolidation model are said to have been encompassed by the forecasts that are included in it (Newbold & Harvey, 2004).

It is important to point out that OLS cannot be used to combine the forecasts when the variable being predicted is not quantitative. When the dependent variable is ordinal and polychotomous, logit regression has been recommended as a suitable consolidation technique (Kamstra, Kennedy & Suan, 2001).

Another scenario where OLS cannot be used as a consolidation technique is when a set of dependent variables is being predicted. For this scenario the recommended technique is canonical correlation (Barnston, 1994; Chu & He, 1994; Barnston & Smith, 1996; Shabbar & Barnston, 1996; Landman & Klopper, 1998).

The issues that need to be addressed when using OLS regression for the consolidation of forecasts are discussed in the following paragraphs.

Given that the forecasts to be consolidated will typically be at least moderately correlated, the problems associated with multicollinearity require attention. The major problems are an increased likelihood of rounding errors in the calculation of the regression coefficients, which can result in very inaccurate coefficients, and confusing or misleading regression coefficients, e.g. a non-significant regression coefficient for a forecast that is actually strongly correlated with the dependent variable (Kleinbaum, Kupper & Muller, 1988; Mendenhall & Sincich, 2003). Although multicollinearity is a bigger issue when regression is conducted for the purpose of explanation than when the purpose is prediction, which is the case when consolidating forecasts, addressing the issue may nevertheless be a worthwhile exercise to improve the accuracy of the consolidated forecasts. Recommended solutions for the problem that are relevant for the task at hand include stepwise regression, ridge regression, partial least squares regression and transforming the data to reduce the correlations between the predictor variables, e.g. subtracting the mean of the forecasts for an observed record from the value of the actual variable and from the value of each of the relevant forecasts.
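The last of these transformations, centering each record on its mean forecast, can be illustrated as follows. This is a sketch under stated assumptions: the function name `center_on_forecast_mean` and the example numbers are hypothetical, and the code shows only the transformation itself, not the subsequent regression.

```python
def center_on_forecast_mean(actual, forecasts):
    """Subtract each record's mean forecast from the actual value
    and from every individual forecast for that record.

    forecasts is a list of k series, each as long as actual.
    """
    centered_actual = []
    centered_forecasts = [[] for _ in forecasts]
    for t, y in enumerate(actual):
        # Mean of the k forecasts issued for record t.
        record_mean = sum(f[t] for f in forecasts) / len(forecasts)
        centered_actual.append(y - record_mean)
        for j, f in enumerate(forecasts):
            centered_forecasts[j].append(f[t] - record_mean)
    return centered_actual, centered_forecasts


# Hypothetical example: two records, three forecasts per record.
actual = [100.0, 110.0]
forecasts = [[98.0, 108.0], [102.0, 115.0], [100.0, 107.0]]
centered_actual, centered_forecasts = center_on_forecast_mean(actual, forecasts)
```

By construction the centered forecasts sum to zero within each record, so a regression that includes all of them would be exactly collinear; in practice one forecast would presumably have to be dropped or the problem handled by the stepping routine.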

Whilst ridge regression has been shown to address the problem of multicollinearity effectively when consolidating forecasts (Peña & Van den Dool, 2008), a simpler strategy, which addresses heteroscedasticity in addition to multicollinearity, is to use backward and forward stepping in the regression analysis. Stepping routines also play an indispensable role in the quest for model parsimony, an important principle of sound model building (McLeod, 1993).

Other questions that need to be answered when conducting regression analysis with historical data, which is the case when consolidating forecasts, are what cut-off date to use for the data included in the analysis and whether more weight should be attached to more recent data. Weighted least squares regression is a possible solution for these problems (Mendenhall & Sincich, 2003) if older records are assigned progressively smaller weights than more recent records.
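One way to picture such a weighting scheme is sketched below. The geometric decay, the `decay` parameter and both function names are assumptions made for illustration; the source does not prescribe a particular weighting function. For brevity the fit uses a single forecast plus an intercept, for which weighted least squares has a simple closed form.

```python
def decay_weights(n, decay=0.9):
    """Return n weights, oldest record first.

    The most recent record gets weight 1.0 and each step back in
    time multiplies the weight by decay (assumed 0 < decay <= 1).
    """
    return [decay ** (n - 1 - t) for t in range(n)]


def weighted_simple_regression(x, y, w):
    """Return (intercept, slope) minimising sum_t w_t * (y_t - a - b * x_t)**2."""
    total = sum(w)
    # Weighted means of the forecast (x) and the actual variable (y).
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: one forecast series and the matching actuals,
# ordered from oldest to most recent.
forecast = [1.0, 2.0, 3.0, 4.0, 5.0]
actual = [2.0 + 3.0 * f for f in forecast]
weights = decay_weights(len(forecast), decay=0.5)
intercept, slope = weighted_simple_regression(forecast, actual, weights)
```

Setting `decay` to 1.0 reproduces ordinary least squares, while smaller values shorten the effective history, which is a soft alternative to choosing a hard cut-off date.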

Data displaying significant temporal growth will probably introduce the issue of heteroscedasticity (Mendenhall & Sincich, 2003); an increase over time will result in increased forecast errors, assuming that the relative accuracy levels of the forecasts remain the same. The proposed remedy for heteroscedasticity is usually to transform the data with a suitable function such as $\sqrt{y}$ for Poisson data, $\sin^{-1}\sqrt{y}$ for binomial proportions and $\ln(y)$ for data from multiplicative models (Mendenhall & Sincich, 2003). Another solution that is appropriate for time series data is to apply the inverse function of the inherent growth curve to make the time series stationary. However, the removal of growth only partially addresses the issue of heteroscedasticity. What remains is fluctuation in the data caused by the effect of various time-periods, e.g. the effect of month of the year, day of the week and hour of the day. These fluctuations can be damped by applying seasonal correction factors, which are calculated as either the difference $(\bar{y} - \bar{y}_i)$ or the ratio $\bar{y}/\bar{y}_i$, where $\bar{y}$ is the overall mean for the series and $\bar{y}_i$ the overall mean for time-period $i$. When a time series is subject to multiple time-period effects, it seems more likely that their combined effect will be multiplicative, in which case the seasonal correction factors should be based on the ratios. For example, if the effects of month of the year, day of the week and hour of the day are to be removed, then the seasonal correction factor for a month $i$, day $j$ and hour $k$ observation is given by

$$F_{ijk} = F^{\text{month}}_{i} \times F^{\text{day}}_{j} \times F^{\text{hour}}_{k}$$

where the $F$'s are calculated as the ratio between the mean for the overall time series and the overall mean for the relevant time-period.
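The ratio-based correction can be sketched as below. The function names `ratio_factors` and `deseasonalise` are hypothetical, and two points are assumptions rather than statements from the source: that the correction is applied by multiplying each observation by the product of its per-period factors, and that every time-period mean is non-zero.

```python
from collections import defaultdict


def ratio_factors(values, labels):
    """Return {label: overall mean / mean of the values carrying that label}."""
    overall_mean = sum(values) / len(values)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, label in zip(values, labels):
        totals[label] += value
        counts[label] += 1
    return {label: overall_mean / (totals[label] / counts[label]) for label in totals}


def deseasonalise(values, *label_series):
    """Multiply each value by the product of its ratio factors.

    Each entry of label_series assigns one time-period label per
    observation, e.g. one series for month, one for weekday, one for hour.
    """
    factor_tables = [ratio_factors(values, labels) for labels in label_series]
    corrected = []
    for t, value in enumerate(values):
        combined = 1.0
        for table, labels in zip(factor_tables, label_series):
            combined *= table[labels[t]]
        corrected.append(value * combined)
    return corrected


# Hypothetical series with a single two-level time-period effect.
values = [10.0, 20.0, 10.0, 20.0]
period = ["a", "b", "a", "b"]
factors = ratio_factors(values, period)
corrected = deseasonalise(values, period)
```

In this toy case the period effect is removed completely, leaving every corrected observation at the overall mean; with real data and several overlapping effects the correction would only dampen the fluctuations.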

2.4 Loss Functions as Measures of Forecasts Consolidation Tools’ Performance
