The concept of ANNs is an imitation of the structure and operation of the human brain by means of mathematical models. The ANN concept is used in forecasting, by considering historical data to be the input to a black box, which contains hidden layers of neurons. These neurons compare and structure the inputs and known outputs by
Chapter 2 Literature Review 50
non-linear weightings, which are determined by a continuous learning process (back-propagation). The learning process continues until forecast outputs are reasonably close to known actual outputs. The structure of the black box is then used for forecasting actual future outputs.
ANNs have a powerful pattern recognition capability. They learn from experience through a process of back-propagation and have been used as a forecasting technique (Sharda 1994). ANNs are data-driven, self-adaptive methods that capture the functional relationships within the data (Zhang, Patuwo and Hu 1998) and can be described as multivariate, non-linear and non-parametric (White 1989, Ripley 1993, Cheng and Titterington 1994). Because it learns from experience, the ANN approach is very powerful in solving practical problems if large amounts of data are available.
One type of ANN is the Multi-layer Perceptron (MLP). It has several levels of nodes, each node being called a neuron and each level being referred to as a layer. A typical MLP would have an input layer, an output layer and one or more hidden layers in between the input and the output layers. Figure 2.1 shows a neural network with an input layer with two inputs x1 and x2, one hidden layer with three hidden nodes and one output Y in the output layer.
Each node has inputs and outputs. Nodes receive a weighted sum of inputs from connected units, and each node applies a function that converts this sum into an appropriate output. This function could generate a 1 or a 0 depending on whether the weighted sum reached a threshold. Alternatively, a node can be programmed to apply a sigmoid, hyperbolic tangent, or other linear or non-linear function.
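The node functions described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names (`weighted_sum`, `threshold_node`, `sigmoid_node`, `tanh_node`) are hypothetical and do not come from the cited literature, and bias terms are left out for brevity.

```python
import math

def weighted_sum(weights, inputs):
    """Weighted sum of the inputs arriving at one node."""
    return sum(w * x for w, x in zip(weights, inputs))

def threshold_node(weights, inputs, threshold):
    """Output 1 if the weighted sum reaches the threshold, otherwise 0."""
    return 1 if weighted_sum(weights, inputs) >= threshold else 0

def sigmoid_node(weights, inputs):
    """Squash the weighted sum into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-weighted_sum(weights, inputs)))

def tanh_node(weights, inputs):
    """Squash the weighted sum into the open interval (-1, 1)."""
    return math.tanh(weighted_sum(weights, inputs))
```

The threshold node gives a hard yes/no response, whereas the sigmoid and hyperbolic tangent nodes give a smooth response.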
Figure 2.1 Basic Structure of an Artificial Neural Network
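[Figure 2.1: an input layer with inputs x1 and x2; a hidden layer with nodes p1, p2 and p3, connected to the inputs by weights a10 to a32; an output layer with output Y, connected to the hidden nodes by weights b0 to b3.]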
Warner and Misra (1996) express the output $y_i$ of neuron $i$, at a threshold of $\mu_i$, as

$$y_i = \begin{cases} 1 & \text{if } \sum_j a_{ij} x_j - \mu_i \ge 0, \\ 0 & \text{if } \sum_j a_{ij} x_j - \mu_i < 0, \end{cases}$$

where $a_{ij}$ are the weights from neuron $j$ to neuron $i$ and $x_j$ are the inputs from neuron $j$. Klimasauskas (1991) presents a hyperbolic function for the neurons of Figure 2.1 as follows, where $p_i$ are the outputs, $x_j$ the inputs and $a_{ij}$ the weights:
$$p_1 = \tanh(a_{10} + a_{11} x_1 + a_{12} x_2),$$

$$p_2 = \tanh(a_{20} + a_{21} x_1 + a_{22} x_2),$$

$$p_3 = \tanh(a_{30} + a_{31} x_1 + a_{32} x_2),$$
and a sigmoid function as follows for the output $Y$, where $p_i$ are the inputs and $b_i$ are the weights:

$$Y = \frac{1}{1 + e^{-(b_0 + b_1 p_1 + b_2 p_2 + b_3 p_3)}}.$$
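To make the hyperbolic tangent and sigmoid equations concrete, the following is a minimal sketch of one forward pass through the network of Figure 2.1. The function name `forward` and the layout of the weight lists are assumptions made for illustration; only the symbols x1, x2, aij, pi, bi and Y come from the text.

```python
import math

def forward(x1, x2, a, b):
    """One forward pass through the 2-3-1 network of Figure 2.1.

    a: three rows [ai0, ai1, ai2], one per hidden node (assumed layout).
    b: [b0, b1, b2, b3], the output-layer weights (assumed layout).
    Returns the hidden outputs [p1, p2, p3] and the network output Y.
    """
    # Hidden layer: pi = tanh(ai0 + ai1*x1 + ai2*x2)
    p = [math.tanh(ai0 + ai1 * x1 + ai2 * x2) for ai0, ai1, ai2 in a]
    # Output layer: Y = 1 / (1 + exp(-(b0 + b1*p1 + b2*p2 + b3*p3)))
    z = b[0] + sum(bi * pi for bi, pi in zip(b[1:], p))
    y = 1.0 / (1.0 + math.exp(-z))
    return p, y
```

With all weights set to zero, every hidden output is 0 and the network output is 0.5, the midpoint of the sigmoid.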
Most authors use only one hidden layer (Hornik, Stinchcombe and White 1989) and a large number of hidden nodes. Some use two hidden layers (Srinivasan, Liew and Chang 1994) to make the training process more efficient, but this requires additional processing power.
For time series forecasting the inputs are the past observations of the data series and the output is the future value. The connectionist method presented by Gallant (1988) and Kasabov (1996a) is the most appropriate for time series forecasting where past observations are used to forecast future values. The network in Figure 2.2 illustrates how time series data y(t) are used in a univariate connectionist method.
Figure 2.2 MLP Neural Network for Univariate Forecasting
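[Figure 2.2: the input layer receives the past observations y(t), ..., y(t-k); these pass through the hidden layers to an output layer that produces the forecast y(t+m).]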
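The univariate arrangement in Figure 2.2 amounts to turning one series into input/target pairs. The sketch below shows one way this could be done. It assumes that k is the number of lags beyond y(t) and m is the forecast horizon, as the figure labels suggest; the function name `make_univariate_samples` is hypothetical.

```python
def make_univariate_samples(series, k, m):
    """Build (inputs, target) pairs from a single time series.

    inputs: [y(t), y(t-1), ..., y(t-k)]
    target: y(t+m)
    """
    samples = []
    # t must leave room for k lags behind it and m steps ahead of it.
    for t in range(k, len(series) - m):
        inputs = [series[t - lag] for lag in range(k + 1)]
        samples.append((inputs, series[t + m]))
    return samples
```

For example, a series of six observations with k = 2 and m = 1 yields three training pairs, each with three inputs and one target.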
Figure 2.3 illustrates the use of ANNs for multivariate time series forecasting, where y(t) is the primary series and x(t) is a secondary series such as an economic indicator.
Figure 2.3 MLP Neural Network for Multivariate Forecasting
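[Figure 2.3: the input layer receives the past observations y(t), ..., y(t-k) of the primary series and x(t), ..., x(t-k) of the secondary series; these pass through the hidden layers to an output layer that produces the forecast y(t+m).]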
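In the multivariate case of Figure 2.3, each input vector combines lags of the primary series with lags of the secondary series. The following sketch assumes both series are aligned in time and have equal length, and that the same number of lags k is used for each; the function name `make_multivariate_samples` is hypothetical.

```python
def make_multivariate_samples(y, x, k, m):
    """Build (inputs, target) pairs from a primary and a secondary series.

    inputs: [y(t), ..., y(t-k), x(t), ..., x(t-k)]
    target: y(t+m)
    """
    if len(y) != len(x):
        raise ValueError("primary and secondary series must have equal length")
    samples = []
    for t in range(k, len(y) - m):
        y_lags = [y[t - lag] for lag in range(k + 1)]
        x_lags = [x[t - lag] for lag in range(k + 1)]
        # Only the primary series is forecast; x(t) serves as extra input.
        samples.append((y_lags + x_lags, y[t + m]))
    return samples
```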
The concept of ANNs dates back to 1962 (Warner and Misra 1996). However, because no training algorithm for multi-layer networks was available at that time, ANNs did not develop as a forecasting tool (Rumelhart 1986). By 1986 the back-propagation method had been developed, giving ANNs a boost as a useful forecasting technique. Werbos (1988) reported that ANNs with back-propagation outperformed regression and Box-Jenkins methods. A further advantage of ANNs is that they do not limit the model to linearity. Lapedes and Farber (1987) concluded that ANNs can be used in forecasting non-linear time series. The traditional Box-Jenkins method assumes that the time series it models are generated from linear processes (Box and Jenkins 1976, Pankratz 1983). The importance of non-linearity is recognised in the ARCH model (Engle 1982), but here too, a specific non-linear mathematical function
has to be assumed at the outset without knowing whether it fits the data. ANNs, on the other hand, select a non-linear form through a learning process: the data pass through the neurons and the errors are back-propagated until a non-linear function is found that fits the data. The superiority of ANNs is therefore noteworthy as they “have more general and flexible functional forms than traditional statistical methods” (Zhang, Patuwo and Hu 1998). Zhang, Patuwo and Hu (1998) provide a comprehensive review of the ANN literature.
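The learning process described above can be illustrated with a small back-propagation loop for a network with two inputs, three hyperbolic tangent hidden nodes and one sigmoid output. This is a sketch under stated assumptions rather than the procedure of any cited study: the squared-error criterion, the learning rate, the stopping tolerance, the random initialisation and all function names are illustrative choices.

```python
import math
import random

def forward(a, b, x):
    """Hidden outputs p (tanh) and network output y (sigmoid)."""
    p = [math.tanh(row[0] + row[1] * x[0] + row[2] * x[1]) for row in a]
    z = b[0] + sum(bi * pi for bi, pi in zip(b[1:], p))
    return p, 1.0 / (1.0 + math.exp(-z))

def mean_squared_error(a, b, data):
    """Average squared gap between network outputs and known outputs."""
    return sum((forward(a, b, x)[1] - t) ** 2 for x, t in data) / len(data)

def init_weights(seed=0):
    """Small random starting weights (an assumed initialisation)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]
    b = [rng.uniform(-0.5, 0.5) for _ in range(4)]
    return a, b

def train(a, b, data, epochs=2000, rate=0.5, tolerance=1e-3):
    """Adjust a and b in place until outputs are close to known outputs."""
    for _ in range(epochs):
        if mean_squared_error(a, b, data) < tolerance:
            break  # outputs are reasonably close to the known values
        for x, t in data:
            p, y = forward(a, b, x)
            # Error signal at the output node (derivative of the sigmoid).
            delta_out = (y - t) * y * (1.0 - y)
            # Error propagated back to each hidden node (derivative of tanh),
            # computed before the output weights are changed.
            delta_hid = [delta_out * b[i + 1] * (1.0 - p[i] ** 2)
                         for i in range(3)]
            b[0] -= rate * delta_out
            for i in range(3):
                b[i + 1] -= rate * delta_out * p[i]
                a[i][0] -= rate * delta_hid[i]
                a[i][1] -= rate * delta_hid[i] * x[0]
                a[i][2] -= rate * delta_hid[i] * x[1]
```

No functional form is specified in advance: the weights alone determine the non-linear function that the network ends up representing.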
Several comparisons have been made of statistical and ANN methods (Hruschka 1993). ANNs can be used for modeling and forecasting non-linear time series with very high accuracy (Lapedes and Farber 1987). ANNs have been used in many financial forecasting applications, including bankruptcy and business failure (Odom and Sharda 1990, Coleman, Graettinger and Lawrence 1991, Salchenberger, Cinar and Lash 1992, Tam and Kiang 1992, Fletcher and Goss 1993, Wilson and Sharda 1994), foreign exchange rates (Weigend, Huberman and Rumelhart 1992, Refenes 1993, Borisov and Pavlov 1995, Kuan and Liu 1995, Wu 1995, Hann and Steurer 1996), stock prices (White 1988, Kimoto, Asakawa, Yoda and Takeoka 1990, Schoneburg 1990, Bergerson and Wunsch 1991, Yoon and Swales 1991, Grudnitski and Osburn 1993) and others (Dutta and Shekhar 1988, Sen, Oliver and Sen 1992, Wong, Wang, Goh and Quek 1992, Kryzanowski, Galler and Wright 1993, Chen 1994, Refenes, Zapranis and Francis 1994, Kaastra and Boyd 1995, Wong and Long 1995, Chiang, Urban and Baldridge 1996). Scott (2000) demonstrates that ANNs can enhance the predictive capabilities of the moving average cross-over technique employed by technical analysts when deciding on a long or short trading strategy.
Other forecasting applications of ANNs include commodity prices (Kohzadi, Boyd, Kermanshahi and Kaastra 1996), environmental temperature (Balestrino, Bini Verona and Santanche 1994), international airline passenger traffic (Nam and Schaefer 1995), macroeconomic indices (Maasoumi, Khotanzad and Abaye 1994), personnel inventory (Huntley 1991), rainfall (Chang, Rapiraju, Whiteside and Hwang 1991), student grade point averages (Gorr, Nagin and Szczypula 1994) and total industrial production (Aiken, Krosp, Vanjani and Govindarajulu 1995).
ANNs have been used in the field of tourism to classify tourist markets (Mazanec 1992), to forecast visitor behaviour (Pattie and Snyder 1996) and to forecast Japanese demand for travel to Hong Kong (Law and Au 1999). Fernando, Turner and Reznik (1999a) used ANNs successfully to forecast tourist flows to Japan from the USA. Uysal and El Roubi (1999) compared ANNs with regression analysis in tourism demand modelling. Law (2000) concluded that back-propagation ANNs outperformed regression models and time series models in predicting Taiwanese demand for travel to Hong Kong. Burger et al. (2001) compared neural networks with several time series techniques to predict tourism demand from the US to Durban and concluded that the neural network method performed best. They also found that the 12-months-ahead forecast performed better than the 3- and 6-months-ahead forecasts due to seasonal bias. Cho (2003) found neural network models better than ARIMA and exponential smoothing in forecasting visitor arrivals to Hong Kong from the USA, Japan, Taiwan, Korea, the UK and Singapore.
2.4.1 Periodic and Non-Periodic Models
In a periodic model the data of a particular season are isolated from the data of other seasons to build a model for that season and to forecast for that season only. While periodic models have less data available for modeling and testing (one quarter of the data for quarterly series and one twelfth of the data for monthly series), there is a case for periodic forecasting, as seasonal patterns can be isolated by a periodic model. Fernando, Reznik and Turner (1998) successfully used a periodic neuro-fuzzy model to forecast tourist arrivals to Australia. The Turner, Kulendran and Fernando (1997a) results show that the AR model with periodic data produced better forecasts than the ARIMA model with non-periodic seasonal data.
However, when models other than the AR and ARIMA were considered, Turner, Kulendran and Fernando (1997a) concluded that periodic models do not increase the accuracy of forecasts. That study compared the Holt-Winters, ARIMA and basic structural models. It may well be that seasonal flows are not independent of the season. Consequently, this research does not use periodic models in general. However, periodic data have been used with the neural network model to test whether periodic data make a difference to the accuracy of neural network forecasts.
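The isolation of seasonal data that defines a periodic model can be sketched as a simple split of the series. The example assumes that the first observation belongs to the first season and that observations are evenly spaced; the function name `split_by_season` is hypothetical.

```python
def split_by_season(series, periods_per_year):
    """Return one sub-series per season.

    periods_per_year is 4 for quarterly data and 12 for monthly data.
    Sub-series s holds every observation that falls in season s.
    """
    return [series[s::periods_per_year] for s in range(periods_per_year)]
```

Splitting three years of quarterly data gives four sub-series of three observations each, which shows how a periodic model works with one quarter of the data.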