Evaluation of the ACVF and CCVF of pavement sections (IECA)


As stated above, a GA can be defined as a population-based, algorithmic search heuristic that mimics the natural process of evolution [197], [200]. Table 6-1 compares GA terminology with the terminology of human genetics [201].

The values of each chromosome are evaluated using a special function, commonly referred to as the fitness function or objective function. In other words, the fitness function returns a numerical value for each chromosome, and these values are used to rank the chromosomes in the population. Five issues therefore have to be considered in the GA: encoding the chromosome, initializing the population, evaluating the fitness value, selection (genetic operators), and the criteria for stopping the GA, as shown in figure 6-6.

In the GA, the chromosomes are bit strings, because the GA works in a binary search space. To begin with, the initial population is created randomly and evaluated using the fitness function. Binary chromosomes have been used in this research: a gene value of “1” means that the feature at that index has been selected; otherwise, the feature is not selected for chromosome evaluation.

Table 6-1: Comparative terminology to human genetics [201].

SN   Human Genetics   GA Terminology
1    Chromosomes      Bit strings
2    Genes            Features
3    Allele           Feature value
4    Locus            Bit position
5    Genotype         Encoded string
6    Phenotype        Decoded genotype

Figure 6-5: GA based feature selection.

After the features marked “1” have been used for evaluation, the chromosomes are ranked and placed in the ranking index, and the fittest children are selected to survive into the next generation. The fitness evaluation is carried out with the algorithm in figure 6-7 below. After the elite individuals have been pushed automatically to the next generation, the remaining individuals in the population pass through the crossover and mutation operations in order to create new individuals. As stated above, crossover combines two chromosomes (individuals) to create a new chromosome, while mutation perturbs each gene in the chromosome by flipping bits according to the mutation probability, as shown in figure 6-5. The configuration of the GA for this

The following steps have been considered for feature selection using genetic algorithm:

A. Initial Population Generation

In this research, the initial population is a two-dimensional matrix containing only binary digits, whose dimensions are the population size and the chromosome length. The chromosome length (Genomelength) is the number of bits in each chromosome, and the population size is the number of chromosomes in the population. It has been recommended to make the population size equal to the chromosome length in order to span the search space [202]. The pseudocode for generating the initial population is given in the POPFUNCTION procedure below.

Table 6-2: GA parameter values.

Parameter of GA         Value
Population size         100
Genomelength            100
Population type         Bitstrings
Fitness function        KNN-based classification error
Number of generations   300
Crossover               Arithmetic crossover
Crossover probability   0.8
Mutation                Uniform mutation
Mutation probability    0.1
Selection scheme        Tournament of size 2
Elitecount              2

Procedure POPFUNCTION()
Pop ← Binary matrix (population size * Genomelength)
Return Pop
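The POPFUNCTION procedure can be illustrated with a minimal Python sketch. It is not the MATLAB code used in the research: the function name `pop_function`, the list-of-lists representation and the `seed` argument are assumptions made for illustration, while the sizes are taken from table 6-2.

```python
import random


def pop_function(population_size, genome_length, seed=None):
    """Return a random binary population (one list of bits per chromosome)."""
    rng = random.Random(seed)  # the seed is only there to make runs repeatable
    return [
        [rng.randint(0, 1) for _ in range(genome_length)]
        for _ in range(population_size)
    ]


# Sizes from table 6-2: 100 chromosomes of 100 bits each.
population = pop_function(population_size=100, genome_length=100, seed=0)
```

Each row is one chromosome, so `population[i][j] == 1` would mean that chromosome `i` selects feature `j`.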

B. Fitness Function

The most important part of GA-based feature selection is the fitness function, which has to be defined so that it evaluates the discriminative capability of each feature subset. In this work, the chromosomes are evaluated with a KNN (K-Nearest Neighbour)-based fitness function. The KNN algorithm solves classification problems by looking for the shortest distance between the training and the test data in the feature search space, based on the Euclidean distance $d(x, y)$, as expressed in the equation below:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \qquad (6\text{-}1)$$

where $x$ and $y$ are a training sample and a test sample, and $n$ is the number of features.

Using 3 nearest neighbours, the KNN counts the occurrences of each category in the class information (accumulated as count(xm)); it then reports the classification results and the classification error based on the expression below:

[expression lost in extraction] (6-2)

Subject to: ∑ [constraint lost in extraction] (6-3)

A gene whose value is “1” marks the position of a selected feature, that is, the particular feature index. Otherwise, if the gene value “id” is 0, the feature is not selected for the chromosome evaluation. The current population is thus evaluated and ranked based on the KNN classification error, and the individuals with the lower fitness values have a chance to survive into the next generation. Meanwhile, the GA iterations progressively reduce the error rate, because the chromosome with the lowest error rate is the one that the GA keeps at the end. The expression for the fitness function is given below:

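[expression lost in extraction; only the term exp(·) survives] (6-4)

where the two symbols of the expression (also lost in extraction) denote, respectively, the KNN-based classification error and the number of selected features.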

The above expression has been used to drive the GA so that it minimizes the error rate and reduces the number of features. Figure 6-7 shows the pseudocode for the above expression in the GA.

C. Individual Generation for New Population

In this step, the new population is created by using elitism and the genetic operators (crossover and mutation). In the MATLAB toolbox, the GA produces three types of individuals (children): elite children, crossover children and mutation children.

Figure 6-7: GA fitness function based on the KNN.

Procedure fit()
Featindex ← indices of the 1's in the binary chromosome
Newdataset ← dataset indexed by Featindex
Numfeat ← number of elements in Featindex
KNN ← 3 (number of neighbours)
KNNerror ← classifier KNN (Newdataset, class information, number of neighbours KNN)
Return KNNerror
End procedure
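The fit() procedure can be sketched in plain Python as follows. This is an illustrative sketch rather than the MATLAB implementation used in the research: the helper names, the explicit train/test split, the toy data and the penalty of 1.0 for a chromosome that selects no feature are all assumptions. Only the 3 nearest neighbours, the Euclidean distance and the use of the classification error as the returned value come from the text.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def select_columns(rows, feat_index):
    """Keep only the features whose indices are listed in feat_index."""
    return [[row[i] for i in feat_index] for row in rows]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    order = sorted(range(len(train_x)), key=lambda i: euclidean(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def fit(chromosome, train_x, train_y, test_x, test_y, k=3):
    """KNN classification error for the feature subset encoded by the chromosome."""
    feat_index = [i for i, bit in enumerate(chromosome) if bit == 1]
    if not feat_index:
        return 1.0  # assumption: an empty subset gets the worst possible error
    new_train = select_columns(train_x, feat_index)
    new_test = select_columns(test_x, feat_index)
    wrong = sum(
        knn_predict(new_train, train_y, query, k) != label
        for query, label in zip(new_test, test_y)
    )
    return wrong / len(test_y)


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
train_x = [[0.0, 5.0], [0.1, -3.0], [0.2, 9.0], [1.0, -4.0], [1.1, 8.0], [0.9, 0.0]]
train_y = ["healthy", "healthy", "healthy", "faulty", "faulty", "faulty"]
test_x = [[0.05, 7.0], [1.05, 6.0]]
test_y = ["healthy", "faulty"]

error_informative = fit([1, 0], train_x, train_y, test_x, test_y)
error_noise = fit([0, 1], train_x, train_y, test_x, test_y)
```

On this toy data the chromosome that keeps only the informative feature scores a lower error than the one that keeps only the noise feature, which is the ranking behaviour the GA relies on.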

a) Elite Children

These children are pushed automatically to the next generation. In the MATLAB GA toolbox, elitism is controlled by “Elitecount”, whose default value is 2, as shown in table 6-2, and which is bounded by the population size. Thus, based on “Elitecount”, the GA picks the best two chromosomes (those with the lowest fitness value) and pushes them to the next generation. For example, if the population size is 200, the remaining 198 chromosomes proceed to the crossover and mutation operators.

b) Crossover Children

This operator is sometimes called the crossover fraction. In this research, the crossover value is 0.8, because if it were set to 1, the GA would produce no mutation children. Therefore, the number of crossover children is the number of remaining chromosomes multiplied by 0.8 (198 × 0.8 ≈ 158), as in the example above.

c) Mutation Operator

The number of mutation children is calculated as: mutation children = population size − elite children − crossover children (200 − 2 − 158 = 40), so that the three groups together fill the new population.
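The split into elite, crossover and mutation children can be checked with a few lines of Python. This is a sketch under the assumption that the three groups must add up to the population size and that the crossover count is rounded to the nearest integer; the function name `children_counts` is hypothetical.

```python
def children_counts(population_size, elite_count=2, crossover_fraction=0.8):
    """Split a new generation into elite, crossover and mutation children."""
    remaining = population_size - elite_count
    # The crossover fraction applies to the chromosomes left after elitism.
    crossover_children = round(crossover_fraction * remaining)
    # Whatever is left after elitism and crossover is produced by mutation.
    mutation_children = population_size - elite_count - crossover_children
    return elite_count, crossover_children, mutation_children


# Worked example from the text: a population of 200 with Elitecount = 2.
elite, crossover, mutation = children_counts(200)
```

Setting `crossover_fraction=1.0` in this sketch leaves no mutation children, which is the reason given above for choosing 0.8.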

d) GA Selection Mechanism

The most important part of the GA is the selection mechanism, because it selects the best individuals in the population. It also helps the GA to discard the bad individuals and keep the best ones. The GA toolbox offers several selection mechanisms, among them stochastic uniform (the default) and tournament (default size 4). In this research, tournament selection has been used, since it is fast, simple and more efficient, as stated in [203]–[205]. In addition, tournament selection ensures that the worst individual does not pass to the next generation [201], [206]–[211]. Tournament selection needs two functions in order to be applied in the GA: the first generates the individuals, and the second picks the best individual (the winner) out of them. The tournament size here is 2, which means that two chromosomes are drawn from the population after the elite children have been taken out. The process is repeated until the new population is full.
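A tournament of size 2 can be sketched in Python as follows. The sketch assumes that contenders are drawn without replacement and that a lower fitness value (classification error) wins; the function names and the toy population are hypothetical and do not reproduce the toolbox implementation.

```python
import random


def tournament_select(population, fitness, tournament_size=2, rng=random):
    """Draw tournament_size distinct contenders and return the fittest one."""
    contenders = rng.sample(range(len(population)), tournament_size)
    winner = min(contenders, key=lambda i: fitness[i])  # lower error wins
    return population[winner]


def select_parents(population, fitness, n_parents, tournament_size=2, seed=None):
    """Repeat the tournament until n_parents winners have been collected."""
    rng = random.Random(seed)
    return [
        tournament_select(population, fitness, tournament_size, rng)
        for _ in range(n_parents)
    ]


# Hypothetical population of four chromosomes with their KNN errors.
toy_population = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
toy_fitness = [0.4, 0.1, 0.3, 0.9]
parents = select_parents(toy_population, toy_fitness, n_parents=50, seed=1)
```

Because the two contenders are distinct and the fitness values differ, the chromosome with the highest error can never win a tournament, which matches the property described above.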

e) GA Termination

The GA stops when it reaches the optimal solution; the condition that triggers the stop is called the stopping criterion (condition). This research has two stopping conditions:

a) Maximum number of generations.
b) Limit on the generation stall.

The GA may terminate the whole process prematurely if the number of generations is not set properly. The number of generations has been set to 300, while the genomelength has been set to 100. In this case, [...] related to the genomelength, is equal to or less than 0.000001, the GA will be terminated. Consequently, the chromosomes become genetically homogeneous, and at the end the GA produces the best chromosome.
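The two stopping conditions can be sketched in Python as shown below. The sketch treats the limit of 300 as a cap on generations (as in table 6-2) and interprets the stall condition as "the best fitness has improved by no more than 0.000001 over a window of generations". That interpretation, the window length `stall_generations=50` and the function name `should_stop` are assumptions, since the corresponding sentence in the text is incomplete.

```python
def should_stop(best_fitness_history, max_generations=300,
                stall_generations=50, tolerance=1e-6):
    """Return True when either stopping condition is met.

    best_fitness_history holds the best (lowest) fitness of each generation
    so far, oldest first.
    """
    generation = len(best_fitness_history)
    if generation >= max_generations:
        return True  # condition a): generation limit reached
    if generation > stall_generations:
        # condition b): improvement over the stall window is negligible
        improvement = (best_fitness_history[-stall_generations - 1]
                       - best_fitness_history[-1])
        return improvement <= tolerance
    return False


# Hypothetical histories: one still improving, one that has stalled.
improving = [1.0 - 0.01 * g for g in range(60)]
stalled = [0.5] * 60
```

A GA loop would call `should_stop` once per generation, after appending the fitness of the current best chromosome to the history.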

The previous chapters have explained all the methods used in this research and how they are adapted for induction motor (IM) fault classification. The following chapter moves on to the IM test rig setup used to collect the required data for further processing.

CHAPTER 7

EXPERIMENTAL SETUP AND MEASUREMENTS

“This chapter describes all the equipment used in this research to carry out the experimental tests. The test rig and the data acquisition are also described, and the data collection procedure is explained. The chapter also presents the healthy and faulty signals of the induction motor that are used to detect IM faults by means of thermal images as well as current and vibration signals.”

7.1 Introduction

Previous chapters have explained the methods used in this research: the proposed classification algorithm Bee for Mining (B4M), feature extraction, feature selection, and the hybrid system of Genetic Algorithm based feature selection for Bee for Mining (GA-B4M). This chapter explains the equipment and experimental setup used to collect the data (thermal images, current and vibration signals) required for the classification process (motor protection).

A series of experiments has been conducted, and the required data have been collected to verify the proposed algorithms for induction motor fault classification. Tests have been carried out under different load conditions with different types of faults. A three-phase squirrel-cage induction motor has been used in this research. A “FLIR C2” thermal imaging camera has been used to capture the motor thermal images, the stator current has been measured with current transformers (one for each phase), and vibration levels have been measured with an “OFV 303” laser vibrometer. The experimental test rig used in this investigation is described in general terms in the following sections.
