Suppose that we have two time series A = (a1, a2, . . . , an) and B = (b1, b2, . . . , bm). For A, let Head(A) = (a1, a2, . . . , an−1), and similarly for B.
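Definition 2. The LCSS between A and B is defined as follows:
\[
LCSS(A, B) =
\begin{cases}
0 & \text{if } A \text{ or } B \text{ is empty} \\
1 + LCSS(Head(A), Head(B)) & \text{if } a_n = b_m \\
\max\bigl(LCSS(Head(A), B),\; LCSS(A, Head(B))\bigr) & \text{otherwise}
\end{cases}
\]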
The above definition is recursive and, if evaluated naively, requires exponential time to compute. However, the LCSS can be computed in O(m · n) time using dynamic programming.
Dynamic Programming Solution [11,42]
The LCSS problem can be solved in quadratic time and space. The basic idea is that the sequence matching problem can be decomposed into smaller subproblems, whose optimal solutions can then be combined. We therefore solve a smaller instance of the problem (with fewer points) and then add new points to the sequences one at a time, updating the LCSS accordingly.
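The solution can now be found by solving the following recurrence using dynamic programming (Figure 4):
\[
LCSS[i, j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0 \\
1 + LCSS[i-1, j-1] & \text{if } a_i = b_j \\
\max\bigl(LCSS[i-1, j],\; LCSS[i, j-1]\bigr) & \text{otherwise}
\end{cases}
\]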
where LCSS[i, j] denotes the length of the longest common subsequence between the first i elements of sequence A and the first j elements of sequence B. Finally, LCSS[n, m] gives the length of the longest common subsequence between the two sequences A and B.
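The table-filling procedure can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation; the function name `lcss_length` is hypothetical, and the sequences are assumed to be plain Python lists whose elements are compared for exact equality.

```python
def lcss_length(a, b):
    """Length of the longest common subsequence of a and b (exact matching)."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCSS of the first i elements of a
    # and the first j elements of b; row 0 and column 0 stay 0.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                # Last elements match: extend the diagonal subproblem.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Otherwise drop the last element of one sequence.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]
```

The two nested loops visit each of the (n + 1)(m + 1) cells once, which is where the quadratic time and space bound comes from.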
The same dynamic programming technique can be employed in order to find the Warping Distance between two sequences.
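To illustrate this remark, the sketch below fills a table of the same shape for a warping distance. The definition of the Warping Distance is not given in this section, so the code assumes the common formulation with the absolute difference as local cost and no window constraint; `dtw_distance` is a hypothetical name.

```python
import math


def dtw_distance(a, b):
    """Warping distance between non-empty sequences a and b.

    Assumes the standard recurrence with |a_i - b_j| as the local cost.
    """
    n, m = len(a), len(b)
    # dist[i][j] is the cheapest warping of the first i elements of a
    # onto the first j elements of b.
    dist = [[math.inf] * (m + 1) for _ in range(n + 1)]
    dist[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Every element must be aligned to some element of the other
            # sequence, so the cost of this pair is always paid.
            dist[i][j] = cost + min(dist[i - 1][j],
                                    dist[i][j - 1],
                                    dist[i - 1][j - 1])
    return dist[n][m]
```

Unlike the LCSS recurrence, there is no branch that leaves an element unmatched, so a single extreme value contributes its full cost to the result.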
Fig. 4. Solving the LCSS problem using dynamic programming. The gray area indicates the elements that are examined if we confine our search window. The solution provided is still the same.
3.2. Extending the LCSS Model
Having seen that there exists an efficient way to compute the LCSS between two sequences, we extend this notion in order to define a new, more flexible similarity measure. The LCSS model matches exact values; in our model, however, we want to allow two values to match whenever they lie within a certain range of each other. Moreover, in certain applications the stretching in time allowed by the LCSS algorithm also needs to stay within a certain range.
We assume that the measurements of the time-series are at fixed and discrete time intervals. If this is not the case then we can use interpolation [23,34].
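Definition 3. Given an integer δ and a positive real number ε, we define LCSSδ,ε(A, B) as follows:
\[
LCSS_{\delta,\varepsilon}(A, B) =
\begin{cases}
0 & \text{if } A \text{ or } B \text{ is empty} \\
1 + LCSS_{\delta,\varepsilon}(Head(A), Head(B)) & \text{if } |a_n - b_m| < \varepsilon \text{ and } |n - m| \le \delta \\
\max\bigl(LCSS_{\delta,\varepsilon}(Head(A), B),\; LCSS_{\delta,\varepsilon}(A, Head(B))\bigr) & \text{otherwise}
\end{cases}
\]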
Fig. 5. The notion of the LCSS matching within a region of δ & ε for a sequence. The points of the two sequences within the gray region can be matched by the extended LCSS function.
The constant δ controls how far in time we can go in order to match a given point from one sequence to a point in another sequence. The constant ε is the matching threshold (see Figure 5).
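Definition 3 can be evaluated with the same dynamic programming table as the exact-match LCSS. The following is a minimal sketch, assuming one-dimensional numeric sequences stored as Python lists; the name `lcss_delta_eps` is hypothetical. For clarity it fills the whole table, although only cells with |i − j| ≤ δ can ever produce a match.

```python
def lcss_delta_eps(a, b, delta, eps):
    """LCSS of a and b where two points match if they differ by less
    than eps in value and by at most delta in position."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            close_in_value = abs(a[i - 1] - b[j - 1]) < eps
            close_in_time = abs(i - j) <= delta
            if close_in_value and close_in_time:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]
```

For example, with ε = 0.5 the sequences (0, 0, 0, 9) and (9, 1, 1, 1) share no match when δ = 0, but the two 9s, three positions apart, are matched once δ = 3.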
The first similarity function is based on the LCSS and the idea is to allow time stretching. Then, objects that are close in space at different time instants can be matched if the time instants are also close.
Definition 4. We define the similarity function S1 between two sequences A and B, given δ and ε, as follows:
\[
S1(\delta, \varepsilon, A, B) = \frac{LCSS_{\delta,\varepsilon}(A, B)}{\min(n, m)}
\]
Essentially, under this measure, each point that has a match within the region ε increases the LCSS by one.
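A small sketch of S1 follows, again assuming one-dimensional numeric lists. The helper `lcss_delta_eps` and the function name `s1` are hypothetical, and returning 0.0 for an empty sequence is an assumption made here to avoid dividing by zero; the definition itself does not cover that case.

```python
def lcss_delta_eps(a, b, delta, eps):
    """LCSS with value threshold eps and time window delta (Definition 3)."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) < eps and abs(i - j) <= delta:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def s1(delta, eps, a, b):
    """Similarity S1: LCSS normalized by the length of the shorter sequence."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0  # assumption: empty input is treated as no similarity
    return lcss_delta_eps(a, b, delta, eps) / shorter
```

Dividing by min(n, m) means that a short sequence fully contained in a longer one scores 1, since the LCSS can never exceed the length of the shorter sequence.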
We use function S1 to define another, more flexible, similarity measure.
First, we consider the set of translations. A translation simply causes a vertical shift, either up or down. Let F be the family of translations. Then a function fc belongs to F if fc(A) = (a1 + c, . . . , an + c). Next, we define a second notion of similarity based on this family of functions.
Definition 5. Given δ, ε and the family F of translations, we define the similarity function S2 between two sequences A and B as follows:
\[
S2(\delta, \varepsilon, A, B) = \max_{f_c \in F} S1(\delta, \varepsilon, A, f_c(B))
\]
Fig. 6. Translation of sequence A.
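Definition 5 maximizes over infinitely many translations, so any direct implementation has to restrict the search. The sketch below tries only the shifts c = a_i − b_j that align one point of A exactly with one point of B within the time window δ. This candidate set is an assumption chosen for illustration and is not guaranteed to reach the true maximum; the names `lcss_delta_eps`, `s1` and `s2_candidates` are hypothetical.

```python
def lcss_delta_eps(a, b, delta, eps):
    """LCSS with value threshold eps and time window delta (Definition 3)."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) < eps and abs(i - j) <= delta:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def s1(delta, eps, a, b):
    """Similarity S1; empty input is treated as similarity 0 (assumption)."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    return lcss_delta_eps(a, b, delta, eps) / shorter


def s2_candidates(delta, eps, a, b):
    """Lower bound on S2 obtained from a finite set of candidate shifts."""
    best = s1(delta, eps, a, b)  # c = 0, i.e. no translation
    # Shifts that move some b_j exactly onto some a_i within the window.
    candidates = {a[i] - b[j]
                  for i in range(len(a))
                  for j in range(len(b))
                  if abs(i - j) <= delta}
    for c in candidates:
        shifted = [x + c for x in b]
        best = max(best, s1(delta, eps, a, shifted))
    return best
```

Each candidate costs one full LCSS computation, so this brute-force search is considerably more expensive than a single evaluation of S1.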
So the similarity functions S1 and S2 range from 0 to 1. Therefore we can define the distance function between two sequences as follows:
Definition 6. Given δ, ε and two sequences A and B, we define the following distance functions:
D1(δ, ε, A, B) = 1 − S1(δ, ε, A, B) and
D2(δ, ε, A, B) = 1 − S2(δ, ε, A, B)
Note that D1 and D2 are symmetric: LCSSδ,ε(A, B) is equal to LCSSδ,ε(B, A), and the transformation that we use in D2 is translation, which preserves the symmetric property.
By allowing translations, we can detect similarities between movements that are parallel, but not identical. In addition, the LCSS model allows stretching and displacement in time, so we can detect similarities in movements that happen with different speeds, or at different times. In Figure 6 we show an example where a sequence A matches another sequence B after a translation is applied.
The similarity function S2 is a significant improvement over S1 because (i) we can now detect parallel movements, and (ii) normalization alone does not guarantee the best match between two time-series. Because of the significant amount of noise, the average value and/or the standard deviation used in the normalization process can be distorted, leading to improper translations.
3.3. Differences between DTW and LCSS
Time Warping and the LCSS share many similarities. Here, we argue that the LCSS is a better similarity function for correctly identifying noisy sequences, for the following reasons:
1. Taking into consideration that a large portion of the sequences may be just outliers, we need a similarity function that is robust under noisy conditions and does not match the incorrect parts. This property of the LCSS is depicted in Figure 7. Time Warping, by matching all elements, also tries to match the outliers, which is likely to distort the real distance between the examined sequences.
In Figure 8 we can see an example of a hierarchical clustering produced by the DTW and the LCSS distances between four time-series.
Fig. 7. Using the LCSS we only match the similar portions, avoiding the outliers.
Fig. 8. Hierarchical clustering of time series with significant amount of outliers. Left: The presence of many outliers in the beginning and the end of the sequences leads to incorrect clustering. DTW is not robust under noisy conditions. Right: The LCSS focusing on the common parts achieves the correct clustering.
Fig. 9. Left: Two sequences and their mean values. Right: After normalization. Obviously an even better matching can be found for the two sequences.
The sequences represent data collected through a video tracking process (see Section 6). The DTW fails to distinguish the two classes of words, due to the great amount of outliers, especially in the beginning and in the end of the sequences. Using the Euclidean distance we obtain even worse results.
Using the LCSS similarity measure we can obtain the most intuitive clustering as shown in the same figure. Even though the ending portion of the Boston 2 time-series differs significantly from the Boston 1 sequence, the LCSS correctly focuses on the start of the sequence, therefore producing the correct grouping of the four time-series.
2. Simply normalizing the time-series (by subtracting the average value) does not guarantee that we will achieve the best match (Figure 9). However, we show in the following section that we can try a set of translations which will provably give us the optimal matching (or close to optimal, within some user-defined error bound).
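The second point can be illustrated with a toy example. The data below are invented for this sketch (they are not taken from the experiments), and `lcss_delta_eps`, `s1` and `mean_normalize` are hypothetical helper names. Sequence B is a copy of A shifted up by 5, except for a single outlier at the end that drags its mean away from the bulk of the points.

```python
def lcss_delta_eps(a, b, delta, eps):
    """LCSS with value threshold eps and time window delta (Definition 3)."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) < eps and abs(i - j) <= delta:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def s1(delta, eps, a, b):
    """Similarity S1 for non-empty sequences."""
    return lcss_delta_eps(a, b, delta, eps) / min(len(a), len(b))


def mean_normalize(seq):
    """Subtract the average value from every element."""
    mean = sum(seq) / len(seq)
    return [x - mean for x in seq]


# Invented toy data: B is A shifted by +5, with one large outlier.
a = [0.0] * 8
b = [5.0] * 7 + [105.0]
delta, eps = 1, 0.5

# The outlier pulls the mean of B to 17.5, so mean subtraction
# moves the seven regular points far away from A.
after_normalization = s1(delta, eps, mean_normalize(a), mean_normalize(b))

# A translation by -5 aligns the seven regular points with A.
after_translation = s1(delta, eps, a, [x - 5.0 for x in b])
```

With these values, mean subtraction leaves no point of B within ε of A, giving a similarity of 0.0, whereas the translation by −5 matches seven of the eight points, giving 0.875.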
4. Efficient Algorithms to Compute the Similarity