An ANN is a mathematical algorithm capable of relating input and output variables (Table 5.3) and of learning from such examples through iteration, without requiring prior knowledge of the relationships between process variables (Torrecilla et al., 2004, 2005); the ANN learns its internal representation from the input/output data of its environment and response. Data representative of the process are gathered and fed into the training algorithms, which automatically learn the structure of the data (i.e., neural networks learn by example). The network is defined by connections in parallel and in sequence between neurons. Previous publications on heat transfer and thermal process predictions have employed neural networks to predict parameters characterizing thermal inactivation, such as process time, process lethality, associated quality factors, and surface heat transfer coefficients (Sablani et al., 1995, 1997; Sreekanth et al., 1999; Afaghi et al., 2001; Torrecilla et al., 2005; Chen, 2006; Chen and Ramaswamy, 2006).
In high-pressure processing, neural networks can also be employed to characterize the temperature and pressure history at specified points. For example, they can predict the maximum temperature reached in the sample after pressurization. In this case, the advantage is that thermophysical properties are not needed to perform the prediction (see examples in Chapter 6).
Neural networks can be developed either by using a specific computer language to implement their principles, or by using commercial neural network software packages. Development of neural network codes involves turning the theory of a particular network model into a computer simulation implementation (Chen and Ramaswamy, 2006). Commercially available neural network software packages have been developed, for example, by NeuroDimension Inc. (Gainesville, Florida) and The MathWorks Inc.
5.4.5.1 Neural Network Architecture
Neural networks consist of a set of neurons, also called processing units, which are arranged in several parallel layers. The most commonly used neural network architecture (as explained later) is the multilayered feed-forward network that uses back-propagation of error as its learning mechanism.
The structure of this model is based on the use of three or more layers. There are two layers for input and output data and a number of hidden layers for processing and iterating. The input layer receives information from an external source and passes this information on to the network for processing. The hidden layer (or layers) receives and processes information from the input layer. The number of hidden layers can range from one to three, depending on the problem at hand. The output layer receives processed information from the network and sends the result to an external receptor. When the input layer receives the information from an external source, it is activated, emitting signals to all neurons in the first hidden layer, which will in turn transfer the signal to the next layer. Depending on the strength (weight) of the interconnections, the signals can excite or inhibit the nodes to ultimately reach the output layer and provide the prediction.
A neural network can be viewed as a “black box” into which a specific input to each node (or neuron) in the input layer is sent. The network is defined by connections in parallel and in sequence between hidden nodes through which information is processed. Finally, the network gives an output from the nodes to the output layer. For example, in a high-pressure process, the input layer may include the applied pressure, pressure rate, high-pressure vessel temperature, and fluctuations in ambient temperature, whereas the output layer provides predicted variables such as the maximum temperature reached after pressurization as an average temperature or specific vessel point temperature, depending on the type of data used during the training process.
5.4.5.2 Artificial Neural Network Development
The ANN can be developed according to the following steps (Torrecilla et al., 2004, 2005; Chen and Ramaswamy, 2006):
5.4.5.2.1 Selection of the Number of Layers
It has been shown that selecting one hidden layer is sufficient to approximate any continuous nonlinear function for network training purposes (Torrecilla et al., 2004). However, more hidden layers could be used in special applications.
5.4.5.2.2 Selection of the Transfer Function between Neurons
In general, neurons can be connected to each other by weighted links, $w_{ij}$, over which signals can pass. Each neuron receives multiple inputs proportional to their connection weights, generating a single output that may be propagated to several other neurons (Sreekanth et al., 1999; Torrecilla et al., 2005; Chen and Ramaswamy, 2006).
An interactive function between neurons is shown in the scheme presented in Figure 5.9. The inputs ($y_i$) from each incoming node $i$ are multiplied by their corresponding connection weights ($w_{ij}$) and added together to yield $x_j$:

$$x_j = \sum_{i} w_{ij} \cdot y_i \qquad (5.42)$$
This sum is then transformed by means of an input activation function, producing a single output $y_j$, where $j$ represents a neuron in the hidden layer, which may be passed on to other neurons. The input activation function can be any continuous function and is typically a monotonic nondecreasing nonlinear function (e.g., hyperbolic, linear threshold, Gaussian function, or sigmoid function).
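The weighted sum of Equation 5.42 followed by an activation function can be sketched in a few lines of Python. This is only an illustration: the sigmoid is one of the activation functions named above, chosen here as an assumption, and the function names and numerical values are hypothetical rather than taken from the cited studies.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, used here as an assumed activation function."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, activation=sigmoid):
    """Return the output y_j of one neuron j.

    inputs  : signals y_i arriving from the previous layer
    weights : connection weights w_ij for this neuron
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Eq. 5.42: weighted sum of the incoming signals
    x_j = sum(w * y for w, y in zip(weights, inputs))
    # The activation function turns the sum into a single output
    return activation(x_j)

# Made-up values, for illustration only
y_i = [0.5, -1.0, 2.0]
w_ij = [0.4, 0.3, -0.1]
y_j = neuron_output(y_i, w_ij)  # weighted sum is -0.3 before activation
```

Passing a different callable as `activation` would swap in another transfer function without changing the summation step.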
5.4.5.2.3 ANN Training and Learning Step
For learning purposes, the training data set consists of pairs of input and desired output data. The input data are fed into the network, and the estimated output is compared with the real output; the difference between them is the error signal. The training of the network is based on adjusting the connection weights ($w$) so that the difference between the estimated output $y_k$ of each neuron $k$ at the output layer and the real output data ($r_k$) is minimized. The prediction error $E_k$ can be used as a comparative parameter between the ANN response and the real output:

$$E_k = \frac{1}{2} \sum_{k} (r_k - y_k)^2 \qquad (5.43)$$

The learning rule is a method to adjust the weight factors based on trial and error. Chapter 6 provides two examples (Torrecilla et al., 2004, 2005) that use “error-correction learning” as the learning rule and the back-propagation algorithm to automatically adjust the weights ($w$) so as to minimize the estimation error $E_k$. In these examples, $E_k$ is distributed backward to the previous layers across the network until the minimum error is obtained.
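The training loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the cited examples: it assumes one hidden layer, sigmoid activations, pattern-by-pattern gradient-descent updates with a learning coefficient `mu`, and a toy logical-AND data set whose third input is fixed at 1.0 to act as a bias. All function and variable names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(y_in, w_ij, w_jk):
    """Return (hidden outputs y_j, network outputs y_k) for one input pattern."""
    y_hid = [sigmoid(sum(w_ij[i][j] * y_in[i] for i in range(len(y_in))))
             for j in range(len(w_jk))]
    y_out = [sigmoid(sum(w_jk[j][k] * y_hid[j] for j in range(len(w_jk))))
             for k in range(len(w_jk[0]))]
    return y_hid, y_out

def train(samples, n_hidden=3, mu=0.5, epochs=2000, seed=0):
    """Adjust the weights by back-propagating the output error."""
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    w_ij = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    w_jk = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_hidden)]
    history = []  # summed error (Eq. 5.43) per pass over the training set
    for _ in range(epochs):
        total = 0.0
        for y_in, r in samples:
            y_hid, y_out = forward(y_in, w_ij, w_jk)
            total += 0.5 * sum((rk - yk) ** 2 for rk, yk in zip(r, y_out))
            # Error signal at the output layer (includes the sigmoid derivative)
            d_out = [(rk - yk) * yk * (1.0 - yk) for rk, yk in zip(r, y_out)]
            # Error signal distributed back to the hidden layer
            d_hid = [y_hid[j] * (1.0 - y_hid[j])
                     * sum(w_jk[j][k] * d_out[k] for k in range(n_out))
                     for j in range(n_hidden)]
            # Weight corrections scaled by the learning coefficient mu
            for j in range(n_hidden):
                for k in range(n_out):
                    w_jk[j][k] += mu * d_out[k] * y_hid[j]
            for i in range(n_in):
                for j in range(n_hidden):
                    w_ij[i][j] += mu * d_hid[j] * y_in[i]
        history.append(total)
    return w_ij, w_jk, history

# Toy data (logical AND); the constant third input serves as a bias term
samples = [([0.0, 0.0, 1.0], [0.0]), ([0.0, 1.0, 1.0], [0.0]),
           ([1.0, 0.0, 1.0], [0.0]), ([1.0, 1.0, 1.0], [1.0])]
w_ij, w_jk, history = train(samples)
```

The `history` list records how the error evolves over the learning runs, which is the quantity the optimization of topology and learning coefficient relies on.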
There are two parameters that can be used to optimize the ANN at the training step by minimizing the prediction error: the number of neurons in the hidden layer and the learning coefficient μ (see example in Chapter 6). The number of neurons in the hidden layer is related to the converging performance of the output error function during the training process of the network. It defines the “topology” of the system. For example, a topology “5, 3, 2” defines a system with five nodes in the input layer, three neurons in the hidden layer, and two neurons in the output layer.
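To make the topology notation concrete, the short sketch below builds the weight matrices implied by a topology such as “5, 3, 2”. The helper name `build_network` and the random initialization range are assumptions made for illustration only.

```python
import random

def build_network(topology, seed=0):
    """Return one weight matrix per pair of consecutive layers.

    topology : tuple of layer sizes, e.g. (5, 3, 2)
    """
    rng = random.Random(seed)
    return [[[rng.uniform(-0.5, 0.5) for _ in range(n_next)]
             for _ in range(n_prev)]
            for n_prev, n_next in zip(topology, topology[1:])]

weights = build_network((5, 3, 2))
# Each matrix has (nodes in previous layer) x (nodes in next layer) entries
shapes = [(len(w), len(w[0])) for w in weights]
n_weights = sum(rows * cols for rows, cols in shapes)
print(shapes, n_weights)  # [(5, 3), (3, 2)] 21
```

Changing the middle entry of the tuple changes the number of hidden neurons, and with it the number of adjustable weights.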
[Figure 5.9: schematic of an input layer (signals $y_i$), a hidden layer (sums $x_j$, weights $w_{ij}$), and an output layer (sums $x_k$, outputs $y_k$, weights $w_{jk}$).]

FIGURE 5.9 Structure of neural network model: (a) a typical multilayer neural network with one hidden layer, and input and output variables representing a schematic of data transfer between neurons. The inputs (y) into a neuron are multiplied by their corresponding connection weights (w) and summed together. This sum is then transformed through a selected function to produce a single output to be passed between neurons in other layers. (Adapted from Torrecilla, J.S., Otero, L., Sanz, P.D., J. Food Eng., 69, 299, 2005.)
Several parameters can be used as performance indices (Torrecilla et al., 2004): (a) initial slope or rate of reduction of initial error in the learning process; (b) final error or error at the end of the learning process; or (c) number of iterations or learning runs needed to end the learning process.
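The three indices can be computed from a recorded learning curve, as in the hedged sketch below. The exact definitions are assumptions: the initial slope is taken as the average change in error over the first `window` iterations, and the learning process is considered finished at the first iteration whose error falls at or below a chosen `tolerance`. The error values are made up for illustration.

```python
def performance_indices(error_history, tolerance, window=5):
    """Summarize a learning curve with three indices.

    Returns (initial_slope, final_error, iterations), where iterations is
    None if the error never reaches the tolerance.
    """
    if len(error_history) < 2:
        raise ValueError("need at least two error values")
    w = min(window, len(error_history) - 1)
    # (a) average rate of error reduction over the first w iterations
    initial_slope = (error_history[w] - error_history[0]) / w
    # (b) error at the end of the learning process
    final_error = error_history[-1]
    # (c) first iteration (1-based) at which the error meets the tolerance
    iterations = next((n for n, e in enumerate(error_history, start=1)
                       if e <= tolerance), None)
    return initial_slope, final_error, iterations

# Made-up learning curve, for illustration only
history = [0.50, 0.30, 0.20, 0.14, 0.10, 0.08, 0.07]
initial_slope, final_error, iterations = performance_indices(
    history, tolerance=0.1, window=2)
```

A steeper (more negative) initial slope, a lower final error, and fewer iterations all indicate better learning behavior for a given topology or learning coefficient.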
The first optimization step during training is to determine the topology (or number of nodes in the hidden layer) for a fixed learning coefficient, which provides minimal estimated error with a minimal number of iterations through the system. Once the optimal topology is found, the learning coefficient μ can also be optimized through a trial-and-error method. In this case, the learning error and iteration number must also be minimized. Once optimization is performed, the values and distribution of output data from the model can be compared with real values using averages, standard deviation, and variance.
5.4.5.2.4 Recall, Generalization, and Validation
After the training step, the network is subjected to a wide array of the input patterns used in training, and adjustments are introduced to make the system more reliable and robust (recall step). In the generalization step, a set of data from an independent test is run through the network using the selected topology and the optimized learning coefficient with the previously adjusted weights. The validation step evaluates the competence of the trained network. Statistical comparisons as well as correlation coefficients can be determined to evaluate the model performance or to validate it with known data sets.
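A statistical comparison of this kind might look like the sketch below, which computes the mean, standard deviation, and variance of each series together with the Pearson correlation coefficient between predicted and observed values. The choice of the Pearson coefficient and of sample (n - 1) statistics is an assumption, and the numbers are placeholders rather than measured data.

```python
import math
import statistics

def summarize(values):
    """Mean, sample standard deviation, and sample variance of a series."""
    return {"mean": statistics.fmean(values),
            "std": statistics.stdev(values),
            "var": statistics.variance(values)}

def pearson_r(predicted, observed):
    """Pearson correlation coefficient between two equal-length series.

    Raises ZeroDivisionError if either series is constant.
    """
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("series must have equal length of at least two")
    mp, mo = statistics.fmean(predicted), statistics.fmean(observed)
    cov = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    return cov / (sp * so)

# Placeholder values standing in for an independent test set
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

obs_stats = summarize(observed)
pred_stats = summarize(predicted)
r = pearson_r(predicted, observed)
```

Similar means and variances combined with a correlation coefficient close to 1 would suggest that the trained network reproduces the independent data well.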
If a high-pressure system is designed such that it is proven uniform in terms of temperature distribution inside the vessel, then an ANN can be applied to provide feedback to the system controls. This means that the entire volume of temperature data logged during production runs can be used to train the system in parallel. Once the neural network is well trained (i.e., producing outcomes within an acceptable error range), it can be directly applied to autocorrect the high-pressure system if any deviation from the target temperature occurs (e.g., by stopping an underprocessed run). Furthermore, the ANN will still be capable of learning continuously, thus improving the estimation of the autocorrect function.