2.4.1 Moving Principal Component Analysis (MPCA)
Moving principal component analysis (MPCA) (Lanata & Posenato, 2007; Lanata et al., 2007; Posenato et al., 2008) is based on classical principal component analysis (PCA) (Jolliffe, 2011), which is described in Section 2.3.4.
MPCA is designed to characterise a time series of measurements. A record covering a fixed period, called the initialization phase, is taken while the structure is assumed to be in a healthy condition, and anomalous behaviour is subsequently identified by comparison with this phase. The length of this period is also called the window size. The covariance matrix of the data inside the active window is calculated, and the window is then moved forward in time; more details can be found in Posenato et al. (2010) and Laory et al. (2011). Cavadas et al. (2013) examined the performance of MPCA and found that it can give early detection of anomalous behaviour.
With a moving window, the computational cost of each step is lower and new situations are detected sooner, because old measurements do not dampen the results. The window should be large enough to expose the periodic variability, i.e. the seasonal temperature cycles, yet small enough to keep the computation fast. In theory, the window size should therefore be a multiple of the period of this variability. In the following numerical simulation, a one-year window is chosen to keep the computational cost low,
Chapter 2 Literature review
instead of the two-year window used by Posenato et al. (2008), because complete and continuous measurements can be obtained. After the window size is selected, the first principal component, i.e. the eigenvector associated with the largest eigenvalue, is analysed at each step. The standard deviation of the eigenvectors obtained from the first set of data within the fixed window is recorded as 𝜎 and is subsequently used to define the threshold. Following previous research (Posenato et al., 2008, 2010), the confidence interval is defined as 3𝜎 either side of the eigenvectors of the initial data.
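The moving-window procedure described above can be illustrated with a minimal Python sketch. It is not the implementation used in the cited studies. It assumes that 𝜎 is taken component-wise over the first principal components of the first `n_init` window positions, and that the sign of each eigenvector is fixed by making its largest entry positive. The names `mpca`, `first_principal_component` and `n_init`, and the synthetic data, are hypothetical.

```python
import numpy as np


def first_principal_component(window):
    """Unit eigenvector of the window's covariance matrix with the largest eigenvalue."""
    cov = np.cov(window, rowvar=False)        # rows are time steps, columns are sensors
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns eigenvalues in ascending order
    v = eigvecs[:, -1]
    # An eigenvector is only defined up to sign; make its largest entry positive
    # so that successive windows can be compared (an assumption of this sketch).
    return v if v[np.argmax(np.abs(v))] >= 0 else -v


def mpca(data, window_size, n_init):
    """Slide a window over `data` and flag the steps whose first principal
    component leaves the 3-sigma band set during the initialization phase.

    data        : array of shape (n_samples, n_sensors)
    window_size : number of samples per window (e.g. one year of readings)
    n_init      : number of initial window positions treated as healthy
    """
    n_steps = data.shape[0] - window_size + 1
    components = np.array([
        first_principal_component(data[i:i + window_size]) for i in range(n_steps)
    ])
    reference = components[:n_init]
    centre = reference.mean(axis=0)
    sigma = reference.std(axis=0)
    outside = np.abs(components - centre) > 3.0 * sigma
    return components, outside.any(axis=1)


# Synthetic example: three sensors follow a "seasonal" cycle of 100 samples;
# the sensitivity of the second sensor changes at sample 1000 (simulated anomaly).
rng = np.random.default_rng(0)
t = np.arange(1500)
season = np.sin(2 * np.pi * t / 100)
gains = np.array([1.0, 0.5, -0.8])
data = season[:, None] * gains + 0.01 * rng.standard_normal((1500, 3))
data[1000:, 1] += season[1000:]               # gain of sensor 2 goes from 0.5 to 1.5
components, flags = mpca(data, window_size=100, n_init=500)
print("flag rate in initialization phase:", flags[:500].mean())
print("last window flagged:", bool(flags[-1]))
```

In this example the window length equals one period of the simulated seasonal cycle, mirroring the recommendation that the window size be a multiple of the periodic variability.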
2.4.2 Robust Regression Analysis (RRA)
Outlier detection has attracted considerable interest in various fields. The core idea of Robust Regression Analysis (RRA) is to investigate the correlation among all sensors during a reference period, on which the thresholds of the confidence intervals are also based. In the subsequent monitoring phase, attention is focused on anomalous behaviour among the correlated pairs of sensors, in other words on behaviour that exceeds the thresholds established from the reference period.
One limitation has to be highlighted: the success of robust regression methods depends on the selection of suitable threshold parameters, and this choice unfortunately relies on experience (Yuen & Ortiz, 2017).
The performance of RRA has been investigated in several papers (Posenato et al., 2008; Laory et al., 2011; Cavadas et al., 2013; Dervilis et al., 2015).
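As an illustration of the idea for a single pair of sensors that is already known to be correlated, the following Python sketch fits a robust line over a reference period and flags later readings whose residual exceeds a threshold. It is a sketch under stated assumptions, not the procedure of the cited papers: the robust estimator (Huber weights via iteratively reweighted least squares), the tuning constant `k=1.345`, the MAD-based scale and the `n_sigma=3.0` threshold are all choices made here for illustration, and every function name is hypothetical.

```python
import numpy as np


def robust_scale(residuals):
    """Median absolute deviation, rescaled to match the standard deviation of Gaussian noise."""
    return 1.4826 * np.median(np.abs(residuals - np.median(residuals)))


def huber_line_fit(x, y, k=1.345, n_iter=50):
    """Fit y = a*x + b by iteratively reweighted least squares with Huber weights."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([x, np.ones_like(x)])
    coef = np.linalg.lstsq(A, y, rcond=None)[0]             # ordinary least-squares start
    for _ in range(n_iter):
        r = y - A @ coef
        scale = robust_scale(r) + 1e-12                      # guard against a zero scale
        w = 1.0 / np.maximum(np.abs(r) / (k * scale), 1.0)   # weight 1 inside the band, smaller outside
        sw = np.sqrt(w)
        coef = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
    return coef[0], coef[1]


def fit_reference(x_ref, y_ref, n_sigma=3.0):
    """Robust line and residual threshold estimated from the reference period."""
    a, b = huber_line_fit(x_ref, y_ref)
    threshold = n_sigma * robust_scale(y_ref - (a * x_ref + b))
    return a, b, threshold


def flag_anomalies(x, y, a, b, threshold):
    """True where a new reading departs from the reference relationship."""
    return np.abs(y - (a * x + b)) > threshold


# Synthetic example: sensor y tracks sensor x; the reference period contains gross outliers.
rng = np.random.default_rng(1)
x_ref = rng.standard_normal(300)
y_ref = 2.0 * x_ref + 1.0 + 0.05 * rng.standard_normal(300)
y_ref[:10] += 20.0                       # outliers that an ordinary fit would follow
a, b, threshold = fit_reference(x_ref, y_ref)

x_new = rng.standard_normal(50)
y_new = 2.0 * x_new + 1.0 + 0.05 * rng.standard_normal(50)
y_new[-10:] += 1.0                       # simulated change in the sensor relationship
flags = flag_anomalies(x_new, y_new, a, b, threshold)
print("slope, intercept, threshold:", a, b, threshold)
print("flagged readings:", np.flatnonzero(flags))
```

The `n_sigma` argument plays the role of the threshold parameter whose selection, as noted above, relies on experience.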
2.4.3 One-Class Support Vector Machine (OCSVM)
In contrast to traditional SVMs, one-class SVMs attempt to learn a decision boundary that achieves the maximum separation between the points and the origin (Amer et al., 2013).
The one-class SVM was first proposed by Schölkopf et al. (2001).
General SVM classification can address multi-class classification problems. The one-class SVM (OCSVM) can also be viewed as a traditional two-class problem in which the training dataset contains only normal data, or normal data together with a few points from the other class, with the normal data far outnumbering the rest. The OCSVM therefore seeks the hyperplane, or decision boundary, that separates the training data from the origin with the maximum margin.
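For reference, the separation from the origin described above is commonly written as the following optimisation problem, in the standard notation associated with Schölkopf et al. (2001). The symbols are not defined in the text above and are introduced here only for illustration: $x_1,\dots,x_n$ are the training points, $\Phi$ is the feature map induced by the kernel, $w$ and $\rho$ define the hyperplane, $\xi_i$ are slack variables and $\nu \in (0, 1]$ is a user-chosen parameter.

```latex
\min_{w,\ \xi,\ \rho} \quad \frac{1}{2}\lVert w \rVert^{2}
  + \frac{1}{\nu n}\sum_{i=1}^{n} \xi_i - \rho
\qquad \text{subject to} \qquad
\langle w, \Phi(x_i) \rangle \ge \rho - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, n

f(x) = \operatorname{sgn}\bigl( \langle w, \Phi(x) \rangle - \rho \bigr)
```

A new point $x$ is treated as normal when $f(x) = +1$ and as an outlier when $f(x) = -1$. The parameter $\nu$ is an upper bound on the fraction of training points left on the origin side of the boundary, which is how a small number of points from the other class can be tolerated in the training set.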
2.4.4 Artificial Neural Network (ANN)
Artificial neural networks (ANNs) are the most common learning machines used for pattern recognition and novelty detection (Markou & Singh, 2003; Hernandez-Garcia & Sanchez-Silva, 2007). In engineering, an ANN model can learn a latent relation between the excitation signal (i.e. the input) and the structural response (i.e. the output), even when the data are fuzzy or incomplete (Nazarko & Ziemiański, 2011). Several network types are popular in damage detection: the Multi-Layer Perceptron (MLP) (Moya et al., 1993), self-organising maps (Kohonen, 1998), Radial Basis Function (RBF) networks (Lange et al., 1997), Hopfield networks (Chandola et al., 2009) and oscillatory networks (Tuong Vinh Ho & Rouat, 1998). Reviews of these networks can be found in Markou & Singh (2003) and Pimentel et al. (2014).
The transfer function, also known as the activation function, maps the activation of a node to an output signal (Sibi et al., 2013), as shown in Figure 2.3. The weighted sum of the inputs, computed with the initially assigned weight coefficients and bias, is passed to the activation function. The transfer function sits inside the hidden layers and is defined by the ANN architecture, which is also used to define the confidence interval (Hernandez-Garcia & Sanchez-Silva, 2007). The choice of activation function affects the accuracy and performance of the neural network (Chiba et al., 2018) and has an evident effect on the convergence of back-propagation (BP) learning algorithms (Chandra & Singh, 2004). According to Sibi et al. (2013) and Chiba et al. (2018), common activation functions include the linear (identity) function, the sigmoid (logistic) function, the binary step, the symmetric sigmoid, the stepwise sigmoid and the symmetric Gaussian. The BP network, a multi-layer perceptron network, is the most widely used model (Huang, 2010; Nazarko & Ziemiański, 2011; Nazarko & Ziemianski, 2016). The feed-forward BP network is adopted in this research.
Figure 2.3. The transfer function in the neural network (diagram labels: Input, Weight, Transfer function, Output)
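The path from inputs through weights and transfer function to output can be sketched in a few lines of Python. This is an illustrative sketch only, not the network used in this research. It assumes that the symmetric sigmoid is the hyperbolic tangent and that the binary step returns 1 at zero; the function names and the example weights are hypothetical, and no training (back-propagation) step is shown.

```python
import numpy as np


def linear(z):
    """Identity transfer function."""
    return z


def sigmoid(z):
    """Logistic transfer function, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def binary_step(z):
    """Step transfer function: 1 for z >= 0, otherwise 0 (convention assumed here)."""
    return np.where(z >= 0.0, 1.0, 0.0)


def sigmoid_symmetric(z):
    """Symmetric sigmoid, taken here to be tanh, output in (-1, 1)."""
    return np.tanh(z)


def neuron(inputs, weights, bias, transfer=sigmoid):
    """Weighted sum of the inputs plus bias, passed through the transfer function."""
    z = np.dot(weights, inputs) + bias
    return transfer(z)


def feed_forward(x, layers):
    """Propagate x through a list of (weight matrix, bias vector, transfer function) layers."""
    for W, b, transfer in layers:
        x = transfer(W @ x + b)
    return x


# Example: two inputs, one hidden sigmoid node, one linear output node.
x = np.array([1.0, 2.0])
layers = [
    (np.array([[0.5, -0.25]]), np.array([0.0]), sigmoid),   # hidden layer
    (np.array([[2.0]]), np.array([1.0]), linear),           # output layer
]
print(neuron(x, np.array([0.5, -0.25]), 0.0))
print(feed_forward(x, layers))
```

Swapping the `transfer` argument of a layer is enough to compare the effect of different activation functions on the same weights.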