
The different levels: international, national, regional

In document MONOGRAFÍAS DE EDUCACIÓN AMBIENTAL (pages 121-124)

Marina Di Masso


This section demonstrates the operation of the prediction framework when using the proposed PS representation. As indicated previously in Chapter 3, the RASP framework comprises three stages. The first stage is the preprocessing stage, where the grid representation and error (springback) calculations are performed for a given shape. The preprocessing stage results in Gin. Recall that each grid centre point in Gin is associated with an error (springback) value. Gin is then used as the input to the next stage, the surface representation stage, where any of the proposed 3D surface representation techniques can be applied (in this case, the PS representation is used). The third stage is classifier generation and testing. Unlike the previous representations considered, the PS representation operates using the k-NN supervised learning mechanism. More specifically, k-NN classification is used with k set to 1. The classification of a new curve, cnew, using k-NN classification is conducted according to a similarity measure. Different approaches have been proposed to achieve this. The most popular similarity measure is the Euclidean distance:

Given two sequences $A = [a_1, a_2, \ldots, a_n]$ and $B = [b_1, b_2, \ldots, b_n]$ of the same length, $|A| = |B| = n$, the similarity value $S(A, B)$ between $A$ and $B$ is defined using the standard Euclidean distance measure $D(A, B)$ as follows:

$$S(A, B) = D(A, B) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}$$
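The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `euclidean_distance` is hypothetical and does not come from the thesis, and the length check is an assumption that mirrors the requirement |A| = |B| = n.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return D(A, B) for two sequences of the same length n."""
    if len(a) != len(b):
        # The measure is only defined for |A| = |B|.
        raise ValueError("Euclidean distance needs sequences of equal length")
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```

The explicit length check makes visible why a measure of this kind cannot compare sequences of different lengths.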

However, with respect to the work described in this thesis, Euclidean distance was found to be insufficient for the proposed PS representation. Note also that, although not an issue in the context of the PS representation, distance-based similarity measures of this kind are not suited to sequences of different lengths [184, 224, 225]. Hence Dynamic Time Warping (DTW), introduced previously in Chapter 2, was adopted. The rest of this section is organised as follows. The calculation of the DTW measure is presented in Section 6.3.1, while Section 6.3.2 describes the operation of the k-NN classification.

6.3.1 Dynamic Time Warping Similarity Measurement

This section presents the Dynamic Time Warping (DTW) algorithm together with a detailed example. Recall that the background to DTW was described previously in Chapter 2. Algorithm 6.1 presents the DTW algorithm. It should be noted that Algorithm 6.1 makes use of the Sakoe-Chiba (S-C) band [181], a “windowing” mechanism also described earlier in Chapter 2. The S-C band was employed with respect to the work described in this thesis to: (i) eliminate the effects of the singularity problem, (ii) reduce the complexity to O(n×w) [108, 147, 184] and (iii) keep the optimal warping path near the diagonal. A window of w = 10% of the series length was used, as suggested in [156] and as shown in Algorithm 6.1 (line 8).
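To give a feel for point (ii), the sketch below computes the window width and counts how many matrix cells fall inside the band. The helper names `band_width` and `cells_in_band` are hypothetical, and a square matrix (two series of equal length n) is assumed. The ceiling is taken with integer arithmetic, because multiplying by the float 0.10 before rounding up can overshoot by one for some lengths.

```python
def band_width(n: int, percent: int = 10) -> int:
    """Ceiling of percent% of n, computed with integers only."""
    return (n * percent + 99) // 100


def cells_in_band(n: int, w: int) -> int:
    """Count the entries (i, j) of an n x n matrix with |i - j| <= w."""
    return sum(1 for i in range(n) for j in range(n) if abs(i - j) <= w)


n = 12
print(band_width(n))        # 2
print(cells_in_band(n, 3))  # 72 of the 144 cells when w = 3
```

The count grows roughly as n(2w + 1), which is where the O(n×w) bound comes from.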

The algorithm operates as follows. The input is a new curve to be labelled, cnew, and a set of labelled curves C = {c1, c2, ..., cn}. Suppose that cnew = [a0, a1, ..., ap] and cl = [b0, b1, ..., bq], where cl ∈ C. The first step is to generate a 2D matrix M of size A×B (line 7), where A = p + 1 and B = q + 1 are the lengths of cnew and cl. Each matrix entry M(i, j) then holds the “cost” between the corresponding points $a_i$ and $b_j$, where the cost is defined in terms of the Euclidean distance between the two points: $c(a_i, b_j) = D(a_i, b_j) = \sqrt{(a_i - b_j)^2} = |a_i - b_j|$.

Figure 6.2 illustrates the operation of DTW for a given cnew and cl, where a matrix M is generated to identify the DTW path and the DTW value. For simplicity, the points of the curves are all integer values such that cnew = [1, 2, 4, 5, 7, 6, 6, 5, 8, 3, 4, 7] and cl = [2, 2, 3, 6, 7, 7, 8, 5, 4, 3, 6, 5]. Referring to Algorithm 6.1 and the example in Figure 6.2, the operation of DTW can be described as follows.

1. The inputs are the new unlabelled curve cnew = [a0, a1, ..., ap] (|cnew| = p + 1 = A) and the set of curves C = {c1, c2, ..., cn}, where cl = [b0, b1, ..., bq] (|cl| = q + 1 = B).

Figure 6.2: An example of the operation of DTW using two equal sized curves c1 and c2. For illustrative purposes a window size of w = 3 was used (shown as the shaded area). The indices of the lower and upper boundaries are coloured in green. The optimal DTW path is indicated using dark shading with red text. The path commences at M(0,0) and ends at M(11,11). The DTW value is located in M(11,11) and is equal to 13 in this case.

3. The first element M(0,0) is calculated using the standard Euclidean distance (line 9 in the algorithm). With reference to the example in Figure 6.2:

M(0,0) = |1−2| = 1

4. The first w elements of the first row of M are calculated by adding the cost of the corresponding elements of cl and cnew recursively to the cost of the previous element, as shown in line 12. With reference to the example in Figure 6.2:

M(0,2) = |4−2| + M(0,1) = 2 + 1 = 3

Algorithm 6.1: Dynamic Time Warping (DTW)

Input: New unlabelled curve (cnew), a set of curves (C)

Output: The set of similarity values.

 1  C = {c1, c2, ..., cn};
 2  cnew ← [a0, a1, ..., ap];
 3  for all cl ∈ C do
 4      cl ← [b0, b1, ..., bq];
 5      A ← p + 1;
 6      B ← q + 1;
 7      M ← new[A×B];                            // Initialise M
 8      w ← ⌈0.10 × A⌉;
 9      M(0,0) ← |cnew(0) − cl(0)|;
10      i ← 1;
11      while i ≤ w do                           // Calculate the 1st row of M (M(i,0))
12          M(i,0) ← |cnew(i) − cl(0)| + M(i−1,0);
13          i++;
14      end
15      j ← 1;
16      while j ≤ w do                           // Calculate the 1st column of M (M(0,j))
17          M(0,j) ← |cnew(0) − cl(j)| + M(0,j−1);
18          j++;
19      end
20      row ← A − 1;                             // Last valid row index (indices start at 0)
21      column ← B − 1;                          // Last valid column index
22      for i ← 1 to row do                      // Calculate the rest of M
23          for j ← 1 to column do
24              if |i − j| ≤ w then
25                  M(i,j) ← |cnew(i) − cl(j)| + Min(M, i, j);   // Algorithm 6.2
26              end
27          end
28      end
29      cl.dtw ← M(A−1, B−1);                    // The dtw is the similarity value for cl
30  end
31  return C′;                                   // C associated with similarity values

5. The same calculations described in step 4 are performed for the first w elements of the first column of M, as shown in line 17. With reference to the example in Figure 6.2:

M(3,0) = |6−1| + M(2,0) = 5 + 4 = 9

6. The rest of the elements of M are calculated as follows. For M(i, j), the distance between the corresponding points along both curves cnew and cl is calculated, and the value is then added to the minimum cost of the adjacent cells M(i−1, j), M(i, j−1) and M(i−1, j−1). Algorithm 6.2 is used to identify the minimum cost within the S-C band (as shown in line 25 of Algorithm 6.1). The value of each element in M is used to calculate the minimal warping path. For example, with reference to Figure 6.2:

M(8,7) = |4−5| + Min(M, 8, 7)
= 1 + minimum value of {M(7,7), M(7,6), M(8,6)}
= 1 + M(7,7) = 1 + 6 = 7

Note that Algorithm 6.2 distinguishes the position of the elements, i.e. whether they are located on the upper or the lower boundary of the band, based on the i and j values (the indices are coloured in green in Figure 6.2). If the element is located on the lower boundary (i < j) (as shown in line 2 of Algorithm 6.2), then the minimum value is selected from the two elements M(i, j−1) and M(i−1, j−1), as M(i−1, j) is outside the S-C band. With reference to the example in Figure 6.2:

M(6,9) = |8−3| + Min(M, 6, 9)
= 5 + minimum value of {M(6,8), M(5,8)}
= 5 + M(6,8) = 5 + 7 = 12

However, if the element is located on the upper boundary (i > j) (line 8 of Algorithm 6.2), then the minimum value is selected from the two elements M(i−1, j) and M(i−1, j−1), as M(i, j−1) is outside the S-C band. For example:

M(7,4) = |5−7| + Min(M, 7, 4)
= 2 + minimum value of {M(6,3), M(6,4)}
= 2 + M(6,4) = 2 + 4 = 6

7. When the matrix M has finally been generated, the minimal DTW path and the associated similarity value can be identified. In our example, the dark shaded area with red text indicates the DTW path (the warping path), and the similarity value of 13 is located in M(11,11).
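The fill rules in the steps above can be collected into a single recurrence. The following is a summary written for this purpose, not a formula quoted from the thesis: c(i, j) denotes the local cost |·| between the two points paired at cell (i, j), and w is the band width.

```latex
\begin{aligned}
M(0,0) &= c(0,0), \\
M(i,0) &= c(i,0) + M(i-1,0), \qquad 1 \le i \le w, \\
M(0,j) &= c(0,j) + M(0,j-1), \qquad 1 \le j \le w, \\
M(i,j) &= c(i,j) +
\begin{cases}
\min\{M(i,j-1),\ M(i-1,j-1)\} & \text{if } j - i = w, \\
\min\{M(i-1,j),\ M(i-1,j-1)\} & \text{if } i - j = w, \\
\min\{M(i-1,j),\ M(i,j-1),\ M(i-1,j-1)\} & \text{if } |i - j| < w,
\end{cases}
\qquad i, j \ge 1 .
\end{aligned}
```

Cells with |i − j| > w are never filled, which is what keeps the warping path near the diagonal.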

Algorithm 6.2: Min Algorithm

Input: Matrix M, i, j

Output: min value

 1  w ← ⌈0.10 × M.length⌉;
 2  if (|i − j| = w) & (i < j) then              // Elements on the lower boundary
 3      if M(i, j−1) < M(i−1, j−1) then
 4          min ← M(i, j−1)
 5      else
 6          min ← M(i−1, j−1)
 7      end
 8  else if (|i − j| = w) & (i > j) then         // Elements on the upper boundary
 9      if M(i−1, j) < M(i−1, j−1) then
10          min ← M(i−1, j)
11      else
12          min ← M(i−1, j−1)
13      end
14  else                                         // Elements within boundaries
15      min ← M(i−1, j);
16      if min > M(i−1, j−1) then
17          min ← M(i−1, j−1);
18      end
19      if min > M(i, j−1) then
20          min ← M(i, j−1);
21      end
22  end
23  return min;
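Algorithms 6.1 and 6.2 can be combined into one short Python sketch. It is an illustration under stated assumptions rather than the thesis implementation: the name `dtw_matrix` is hypothetical, cells outside the band are held at infinity so that a plain three-way minimum reproduces the boundary cases of Algorithm 6.2, and the window w is passed in explicitly so that the w = 3 example of Figure 6.2 can be reproduced. The first index follows cl, as in the worked example.

```python
import math
from typing import List, Sequence

INF = math.inf


def dtw_matrix(rows: Sequence[float], cols: Sequence[float], w: int) -> List[List[float]]:
    """Fill the accumulated-cost matrix inside a Sakoe-Chiba band of width w."""
    n_rows, n_cols = len(rows), len(cols)
    M = [[INF] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Only columns with |i - j| <= w are visited.
        for j in range(max(0, i - w), min(n_cols - 1, i + w) + 1):
            cost = abs(rows[i] - cols[j])
            if i == 0 and j == 0:
                best = 0
            else:
                # Out-of-band or out-of-range neighbours count as infinity.
                best = min(
                    M[i - 1][j] if i > 0 else INF,
                    M[i][j - 1] if j > 0 else INF,
                    M[i - 1][j - 1] if i > 0 and j > 0 else INF,
                )
            M[i][j] = cost + best
    return M


c_new = [1, 2, 4, 5, 7, 6, 6, 5, 8, 3, 4, 7]
c_l = [2, 2, 3, 6, 7, 7, 8, 5, 4, 3, 6, 5]
M = dtw_matrix(c_l, c_new, w=3)
print(M[8][7], M[6][9], M[7][4], M[11][11])  # 7 12 6 13
```

The DTW value is the bottom-right entry, `M[-1][-1]`. For curves whose lengths differ by more than w that entry is never reached and stays at infinity, which does not arise for the equal-length curves considered here.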

6.3.2 k-NN Classification

This section presents the operation of the k-NN classification algorithm with k = 1, as adopted with respect to the PS representation technique described in this thesis. The combination of 1-NN with the DTW similarity measure has been shown to outperform other techniques with respect to time series classification [52, 217]. The curve cl with the minimum DTW value is the most similar curve to cnew. If there is a clear “winner”, the label from this winner is used to label cnew. However, if more than one cl has the same lowest DTW value, there are two alternatives for defining the label of cnew: (i) some kind of “voting scheme” may be used, where the majority label is considered to be the winning label, or (ii) the average value of the labels may be calculated and used as the label for cnew. The first is applicable to the discretised PS representation, while the second is applicable to the real PS representation.

Once the DTW values have been calculated, all cl ∈ C are sorted in ascending order according to their DTW values, as shown in Algorithm 6.3 (line 1). Then S is initialised as the set of similarity values (line 2 of Algorithm 6.3). The first element of S is assigned the lowest DTW value, as shown in line 3 of Algorithm 6.3. Thus, if S contains just one value, i.e. a single curve ci has the lowest DTW value, cnew is given the label of ci (line 10 of Algorithm 6.3). Otherwise, all repeated DTW values are placed in S (line 6 of Algorithm 6.3) until the first different DTW value arises. The label of cnew is then the average of the labels of all cl with the same DTW value located in S, as shown in line 16 of Algorithm 6.3.

Algorithm 6.3: k-NN

Input: new unlabelled cnew, set of curves C

Output: labelled cnew

 1  C″ ← C′ sorted according to DTW values in ascending order;
 2  Initialise S[ ] ← ∅;                         // Array of the similarity values
 3  S[0] ← C″[0].dtw;
 4  i ← 1;
 5  while i < |C″| and C″[i].dtw = S[0] do
 6      S[i] ← C″[i].dtw;
 7      i++;
 8  end
 9  if |S| == 1 then
10      cnew.label ← C″[0].label;
11  else
12      sum ← C″[0].label;
13      for i ← 1 to |S| − 1 do
14          sum ← sum + C″[i].label;
15      end
16      cnew.label ← sum/|S|;
17  end
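The labelling step can be sketched in Python as follows. This is a minimal illustration, not the thesis code: the name `label_from_nearest` and the `(dtw_value, label)` pair layout are assumptions, a single pass with `min` replaces the sort in line 1 (the tied set is the same), and the `discrete` flag selects between the voting scheme and the averaging described in the text. How a tied vote should be broken is not stated in the source; here the label seen first wins.

```python
from collections import Counter
from typing import List, Tuple, Union

Label = Union[int, float, str]


def label_from_nearest(scored: List[Tuple[float, Label]], discrete: bool) -> Label:
    """Pick a label for cnew from (dtw_value, label) pairs of the labelled curves."""
    if not scored:
        raise ValueError("need at least one labelled curve")
    best = min(d for d, _ in scored)
    # All curves sharing the lowest DTW value (the set S of Algorithm 6.3).
    tied = [label for d, label in scored if d == best]
    if len(tied) == 1:
        return tied[0]  # clear winner
    if discrete:
        # Voting scheme: majority label among the tied curves.
        return Counter(tied).most_common(1)[0][0]
    # Real-valued labels: average over the tied curves.
    return sum(tied) / len(tied)


print(label_from_nearest([(13, 0.4), (15, 0.9)], discrete=False))          # 0.4
print(label_from_nearest([(5, "a"), (5, "b"), (5, "b")], discrete=True))   # b
```

In a complete pipeline the DTW values in `scored` would come from comparing cnew against every labelled curve in C.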
