7.5 Preliminary Feature Selection

Because the features are collected via simulation, they are limited to phase angle, active and reactive power, frequency, voltage and current on each bus. These are also the features most widely used to describe power systems. More importantly, the same set of features is most likely to be stored in the data logs that may be used to train predictors. Nevertheless, other features may also be collected in a real system, and they might be more indicative for the prediction of sags or other disturbances. These include the rate of change of frequency as well as raw waveforms sampled with a PMU. Moreover, additional features may be derived, for example the difference between two consecutive raw waveform samples or the peaks of voltage samples.
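The derived features mentioned above can be sketched in a few lines of Python. This is only an illustration and not the procedure used in this work: the function names, the sampling interval `dt` and the sample values are assumptions made for the example.

```python
from typing import List


def first_difference(samples: List[float]) -> List[float]:
    """Difference between each pair of consecutive samples."""
    return [b - a for a, b in zip(samples, samples[1:])]


def rate_of_change(samples: List[float], dt: float) -> List[float]:
    """Finite-difference estimate of the rate of change (e.g. of frequency)."""
    return [d / dt for d in first_difference(samples)]


def local_peaks(samples: List[float]) -> List[float]:
    """Values of the samples that are strictly greater than both neighbours."""
    return [
        samples[i]
        for i in range(1, len(samples) - 1)
        if samples[i - 1] < samples[i] > samples[i + 1]
    ]


# Hypothetical voltage samples (per unit) and frequency samples (Hz).
voltage = [0.0, 0.7, 1.0, 0.7, 0.0, -0.7, -1.0, -0.7, 0.0, 0.6, 0.9, 0.6]
frequency = [50.0, 50.0, 49.5, 49.0]

voltage_diff = first_difference(voltage)    # consecutive-sample differences
voltage_peaks = local_peaks(voltage)        # positive peaks of the waveform
rocof = rate_of_change(frequency, dt=0.5)   # Hz/s, assuming 0.5 s between samples
```

In a real system the peak detection would probably need filtering against measurement noise, which this sketch leaves out.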

As frequency is regulated at the transmission level and does not depend on events in the distribution grid, it has been excluded from the list of features. Current values were also not considered, as current may be derived from voltage and power. Preliminary feature selection is then performed separately for each of the selected algorithms. For this purpose, we selected five of the most widely known and used machine-learning classification algorithms.
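The redundancy of current can be illustrated with the standard phasor relation between complex power, voltage and current. The sketch below assumes single-phase per-unit quantities; the function name and the operating point are hypothetical and are not taken from the simulated system.

```python
import cmath
import math


def bus_current(voltage_mag: float, angle_deg: float,
                active_power: float, reactive_power: float) -> complex:
    """Current phasor from the voltage phasor and the complex power.

    Uses complex_power = V * conj(I), hence I = conj(complex_power / V).
    All quantities are assumed to be in per unit.
    """
    voltage = cmath.rect(voltage_mag, math.radians(angle_deg))
    complex_power = complex(active_power, reactive_power)
    return (complex_power / voltage).conjugate()


# Hypothetical operating point: 1.0 pu voltage at 0 degrees,
# 0.8 pu active power and 0.6 pu reactive power.
current = bus_current(1.0, 0.0, 0.8, 0.6)
current_magnitude = abs(current)
```

Since voltage magnitude, phase angle, active power and reactive power fully determine the current in this relation, adding current as a feature would bring no new information to the predictor.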

A brief description of the selected algorithms is given in Table 7.6. The same table includes the time complexity of training each algorithm in Big O notation (m stands for the number of instances and n for the number of features). The versions of the algorithms implemented in Weka 3.8 were considered when estimating the complexity. More details on these and other algorithms may be found in [154].

Filtering methods were not practical for the preliminary selection: the features are stored in separate tables, and finding correlations between individual features would require more effort (with no better results) than using a wrapper method for each algorithm. It is also important to observe that, from the algorithm’s point of view, there are more than 56 features (14 buses with 4 features per bus), because in this view a feature is a combination of a system parameter (voltage, phase angle, etc.) and a time before the event.
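The difference between the 56 base features and the inputs seen by an algorithm can be made concrete with a short enumeration. The feature type letters follow the notation of this section (V, P, S, T); the sample offsets are hypothetical, since the actual values depend on the lead time, window size and sampling period.

```python
from itertools import product

BUSES = range(1, 15)                   # 14 buses
FEATURE_TYPES = ("V", "P", "S", "T")   # voltage, active power, reactive power, phase angle

# Base features: one per (bus, type) pair, e.g. "V6" or "T2".
base_features = [f"{ftype}{bus}" for bus, ftype in product(BUSES, FEATURE_TYPES)]

# Hypothetical offsets (in samples) before the event.
sample_offsets = (1, 2, 3)

# From the algorithm's point of view, each (base feature, offset) pair
# is a separate input.
expanded_features = list(product(base_features, sample_offsets))
```

With three offsets, the 56 base features already expand to 168 inputs, which is why the feature count grows beyond 56.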

The performance of each algorithm is evaluated with the F-measure using ten-fold cross-validation on every combination of lead time, prediction window size and sampling period for each of the 56 features. The F-measure averaged over all combinations is then used as the rank of the feature.
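The ranking procedure can be sketched as follows. This is a simplified illustration under several assumptions: a trivial threshold classifier stands in for the five algorithms, folds are formed by interleaving rather than by stratified random sampling, predictions are pooled over the folds before the F-measure is computed, and the two data sets are invented. None of the names below come from Weka.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature value, label); label 1 = sag, 0 = no sag


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F-measure (harmonic mean of precision and recall) of the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def train_threshold(train: Sequence[Sample]) -> Callable[[float], int]:
    """Stand-in classifier: threshold halfway between the two class means.

    Assumes that both classes are present in the training data.
    """
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    threshold = (mean(pos) + mean(neg)) / 2
    sag_is_low = mean(pos) < mean(neg)
    return lambda x: int((x < threshold) == sag_is_low)


def cross_validated_f(samples: Sequence[Sample], folds: int = 10) -> float:
    """Pool the predictions of all folds, then compute a single F-measure."""
    y_true: List[int] = []
    y_pred: List[int] = []
    for k in range(folds):
        test = samples[k::folds]  # every folds-th sample, starting at index k
        train = [s for i, s in enumerate(samples) if i % folds != k]
        classify = train_threshold(train)
        y_true.extend(y for _, y in test)
        y_pred.extend(classify(x) for x, _ in test)
    return f_measure(y_true, y_pred)


def feature_rank(datasets: Sequence[Sequence[Sample]], folds: int = 10) -> float:
    """Average the cross-validated F-measure over all parameter combinations."""
    return mean(cross_validated_f(d, folds) for d in datasets)


# Two invented data sets for one feature, standing in for two combinations
# of lead time, prediction window size and sampling period.
separable = [(0.6, 1), (1.0, 0)] * 10
noisy = [(0.6, 1), (1.0, 0), (1.0, 1), (0.6, 0)] * 5

rank = feature_rank([separable, noisy])
```

Calling `feature_rank` once per feature, with one data set per parameter combination, would produce a table of average F-measures of the kind shown in Table 7.7.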

As an example, Table 7.7 gives the feature ranks obtained when the logistic regression algorithm is used to predict sags on Bus 6 for a fault occurring on Bus 2. Note that a feature is uniquely identified by a bus number (1 to 14) and a feature type (V, P, S, T).

Table 7.6. Brief description of the machine-learning algorithms used.

| Name | Description | Training-time complexity |
| --- | --- | --- |
| Naïve Bayes | Uses Bayes’ theorem to calculate a probability that an instance belongs to a class, assuming that features are fully independent. A threshold (typically 0.5) is applied to make the classification decision. | O(m) [166] |
| Logistic Regression | Creates a function where weights are associated with features. The function is used to calculate a probability that an instance belongs to a class. A threshold (typically 0.5) is applied to make the classification decision. | O(mn²) [167] |
| SVM (Support Vector Machine) | Creates an optimal hyperplane in the feature space to split between two classes. See [168] for more details. | O(m³) [169] |
| IBk (k-NN) | Uses k nearest neighbors in the feature space to classify a new instance. | O(mn) [170] |
| J48 (C4.5) | Creates a decision tree using information entropy. See [171] for more details. | O(mn²) [172] |
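The thresholded decision described for logistic regression in Table 7.6 can be sketched as follows. The weights, bias and feature values are hypothetical; in practice the weights are learned from the training data.

```python
import math
from typing import Sequence


def logistic_probability(weights: Sequence[float], bias: float,
                         features: Sequence[float]) -> float:
    """Probability of the positive class from a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability: float, threshold: float = 0.5) -> int:
    """Return 1 (sag) if the probability reaches the threshold, else 0."""
    return int(probability >= threshold)


# Hypothetical weights for two features and one instance to classify.
probability = logistic_probability([-4.0, 2.0], 3.0, [0.9, 0.1])
label = classify(probability)
```

The same thresholding step applies to the class probability produced by Naïve Bayes; only the way the probability is computed differs.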

Following the ranking procedure, the ten most indicative features are identified for every algorithm. Tables 7.8a and 7.8b present the results for sag prediction on Bus 6 when faults are injected on all other buses. The case of injecting a fault and predicting a sag on the same bus has not been considered because, in this case, the fault has an immediate effect and no prediction is possible. Features are represented by a feature type symbol as in Table 7.7 (V - Voltage, P - Active power, S - Reactive power, and T - Phase angle) followed by the bus number. They are ranked with integers from one to ten, with one being the highest rank.
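Turning average F-measures into integer ranks is a simple sort. The sketch below uses the Table 7.7 values for Buses 1 to 3 only; the function name and the tie-breaking rule (alphabetical by feature name) are assumptions made for this example.

```python
from typing import Dict, List, Tuple


def top_ranked(scores: Dict[str, float], k: int = 10) -> List[Tuple[int, str]]:
    """Return (rank, feature) pairs for the k features with the highest score.

    Rank 1 is the highest rank. Ties are broken by feature name, which is
    an arbitrary choice made for this sketch.
    """
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, name) for rank, (name, _) in enumerate(ordered[:k], start=1)]


# Average F-measures for Buses 1 to 3, taken from Table 7.7.
scores = {
    "V1": 0.636, "P1": 0.819, "S1": 0.736, "T1": 0.808,
    "V2": 0.663, "P2": 0.702, "S2": 0.826, "T2": 0.824,
    "V3": 0.675, "P3": 0.787, "S3": 0.690, "T3": 0.808,
}

ranking = top_ranked(scores, k=5)
```

Many features in Table 7.7 share the same average F-measure, so the tie-breaking rule can change which features appear in the top ten.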

It may be observed that features that are closer to the fault-injection bus have a higher rank. In fact, the results presented in the two tables allow us to derive more specific rules:

Table 7.7. Feature ranking with F-measure for logistic regression for the case of sag prediction on Bus 6 when faults are injected on Bus 2.

| Bus | Voltage (V) | Active power (P) | Reactive power (S) | Phase angle (T) |
| --- | --- | --- | --- | --- |
| 1 | 0.636 | 0.819 | 0.736 | 0.808 |
| 2 | 0.663 | 0.702 | 0.826 | 0.824 |
| 3 | 0.675 | 0.787 | 0.690 | 0.808 |
| 4 | 0.683 | 0.799 | 0.800 | 0.784 |
| 5 | 0.700 | 0.803 | 0.803 | 0.798 |
| 6 | 0.724 | 0.725 | 0.757 | 0.800 |
| 7 | 0.702 | 0.518 | 0.517 | 0.791 |
| 8 | 0.656 | 0.626 | 0.730 | 0.790 |
| 9 | 0.715 | 0.806 | 0.806 | 0.793 |
| 10 | 0.727 | 0.806 | 0.806 | 0.795 |
| 11 | 0.738 | 0.807 | 0.807 | 0.798 |
| 12 | 0.741 | 0.806 | 0.806 | 0.800 |
| 13 | 0.753 | 0.806 | 0.806 | 0.801 |
| 14 | 0.740 | 0.805 | 0.805 | 0.797 |

• Phase angle, active power and reactive power values on the bus where the fault is injected are typically among the four most indicative features;

• Phase angles on buses with large loads that are connected or close to the fault-injection bus are, in most cases, among the five most indicative features;

• Active powers and phase angles of the buses with renewable generators (windmills in particular) are among the most indicative features if the bus is electrically close to the one where a fault has been injected;

• Active and reactive power values, as well as phase angles, of buses that are electrically close to the bus where a sag is being predicted are among the ten most indicative features.

In addition, we may observe that for the IBk and J48 algorithms the most indicative features mostly include phase angles.

It is also worth pointing out that, when the objective is to create a general sag predictor for a selected bus that does not take the position of the fault into account, the active and reactive power values and the phase angles of the buses that are electrically close to the bus where a sag is being predicted should be used. However, the prediction performance in this case will not be the maximum achievable, as these features typically rank lower than sixth.

Table 7.8a. The most indicative features for different machine-learning algorithms when sags are predicted on Bus 6 for faults injected on Buses 1 to 7.

Table 7.8b. The most indicative features for different machine-learning algorithms when sags are predicted on Bus 6 for faults injected on Buses 8 to 14.