Figure 11.5: Time Series Decomposition Example. (Panels: time series with amplitude axis and segments s1 to s11; stack snapshots; resulting trapezoids T1 to T10.)
the procedure compute_trapezoid(s2, s3). Then we cut the segment s2 at the amplitude value s3.xe = x3 and push the cut segment, denoted by s′2, back on the stack. We continue with the next segment s4, which is pushed on the stack. Next, we process segment s5 by taking s4 from the stack and computing the trapezoid T2, then taking s′2 from the stack in order to compute T3, and finally taking s1 from the stack, computing T4, cutting s1 w.r.t. x5 and pushing the cut segment s′1 back on the stack. The algorithm continues with processing segment s6 and so on.
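As a hedged illustration of the trapezoid computation used in this walkthrough, the following Python sketch derives one trapezoid from a rising and a falling segment. The field names mirror the pseudocode types; the linear-interpolation helper intersect and the example values are our assumptions and not part of the original description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TSSegment:
    ts: float  # start time
    xs: float  # start value
    te: float  # end time
    xe: float  # end value


@dataclass(frozen=True)
class Trapezoid:
    bottom_start: float
    bottom_end: float
    bottom: float
    top_start: float
    top_end: float
    top: float


def intersect(seg: TSSegment, tau: float) -> float:
    """Time at which the line through seg reaches amplitude tau.

    Assumption: segments are straight lines, so linear interpolation applies.
    """
    if seg.xe == seg.xs:
        raise ValueError("horizontal segments have no unique intersection")
    frac = (tau - seg.xs) / (seg.xe - seg.xs)
    return seg.ts + frac * (seg.te - seg.ts)


def compute_trapezoid(seg1: TSSegment, seg2: TSSegment) -> Trapezoid:
    """seg1 is a rising segment, seg2 a later falling segment."""
    tau_bottom = max(seg1.xs, seg2.xe)
    tau_top = min(seg1.xe, seg2.xs)
    return Trapezoid(
        intersect(seg1, tau_bottom), intersect(seg2, tau_bottom), tau_bottom,
        intersect(seg1, tau_top), intersect(seg2, tau_top), tau_top,
    )


# Hypothetical example values: a rise from 0 to 4 and a later fall from 3 to 1.
rising = TSSegment(ts=0.0, xs=0.0, te=2.0, xe=4.0)
falling = TSSegment(ts=3.0, xs=3.0, te=5.0, xe=1.0)
trapezoid = compute_trapezoid(rising, falling)
```

In this example the bottom interval of the trapezoid encloses its top interval, since the left side rises and the right side falls between the two amplitude bounds.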
11.4 Parameter Space Indexing
We apply the R∗-tree for the efficient management of the three-dimensional segments, representing the time series objects in the parameter space. As the R∗-tree index can only manage points and rectangles, we represent the three-dimensional segments by rectangles where the segments correspond to one of the diagonals of the rectangles.
For all trapezoids which result from the time series decomposition, the lower bound time interval contains the upper bound time interval. Furthermore, intervals which are contained in another interval are located in the
TYPE TSSegment = {start time ts, start value xs, end time te, end value xe};

decompose(time series TS = {(xi, ti) : i = 0..tmax}) {
  /* initialize start and end point of the time series */
  stack.push(TSSegment(t0, ⊥, t0, x0));   // left time series border on stack
  TS.append((tmax, ⊥));                    // append right time series border
  for i = 1..tmax do
    next_seg := TSSegment(ti−1, xi−1, ti, xi);
    if (xi−1 < xi) then        // segment with positive slope ⇒ open trapezoid
      stack.push(next_seg);
    else if (xi−1 > xi) then   // segment with negative slope ⇒ close trapezoids
      while (stack.top.xs ≥ next_seg.xe) do
        stack_seg := stack.pop();
        compute_trapezoid(stack_seg, next_seg);
      end while;
      stack_seg := stack.pop();
      compute_trapezoid(stack_seg, next_seg);
      stack_seg := cut_segment_at(stack_seg, next_seg.xe);
      stack.push(stack_seg);
    else /* nothing to do */;  // horizontal segment ⇒ can be ignored
    end if;
  end for;
}

TYPE Trapezoid = {bottom start (Time), bottom end (Time), bottom (float), top start (Time), top end (Time), top (float)};

compute_trapezoid(TSSegment seg1, TSSegment seg2) {
  float τ_bottom = max(seg1.xs, seg2.xe);
  float τ_top = min(seg1.xe, seg2.xs);
  Time t_bottom_s = intersect(seg1, τ_bottom);
  Time t_bottom_e = intersect(seg2, τ_bottom);
  Time t_top_s = intersect(seg1, τ_top);
  Time t_top_e = intersect(seg2, τ_top);
  output(Trapezoid(t_bottom_s, t_bottom_e, τ_bottom, t_top_s, t_top_e, τ_top));
}
lower-right area of this interval representation in the time interval plane. Consequently, the locations of the segments within the rectangles in the parameter space are fixed. Therefore, in the parameter space the bounds of the rectangle which represents a segment suffice to uniquely identify the covered segment. Let ((xl, yl, zl), (xu, yu, zu)) be the coordinates of a rectangle in the parameter space. Then the coordinates of the corresponding segment are ((xl, yu, zl), (xu, yl, zu)).
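The mapping between a segment and its bounding rectangle can be sketched in Python as follows. This is a minimal sketch under assumptions: the function names are hypothetical, and we assume the axis order (interval start, interval end, amplitude), which the text does not state explicitly; only the coordinate swap itself is taken from the description above.

```python
from typing import Tuple

Point3 = Tuple[float, float, float]


def segment_to_rectangle(p: Point3, q: Point3) -> Tuple[Point3, Point3]:
    """Axis-parallel bounding rectangle (lower corner, upper corner) of segment pq."""
    lower = tuple(min(a, b) for a, b in zip(p, q))
    upper = tuple(max(a, b) for a, b in zip(p, q))
    return lower, upper


def rectangle_to_segment(lower: Point3, upper: Point3) -> Tuple[Point3, Point3]:
    """Recover the covered segment from the rectangle bounds.

    Uses the fixed diagonal stated in the text:
    ((xl, yl, zl), (xu, yu, zu)) -> ((xl, yu, zl), (xu, yl, zu)).
    """
    (xl, yl, zl), (xu, yu, zu) = lower, upper
    return (xl, yu, zl), (xu, yl, zu)


# Hypothetical segment: with rising amplitude the interval start grows
# and the interval end shrinks.
segment = ((0.5, 5.0, 1.0), (1.5, 3.0, 3.0))
rectangle = segment_to_rectangle(*segment)
recovered = rectangle_to_segment(*rectangle)
```

Because the diagonal is always the same one, no additional flag has to be stored in the index to distinguish it from the other diagonals of the rectangle.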
Chapter 12
Threshold Based Query Processing
In this chapter, we present an efficient algorithm for our two threshold queries, the threshold-based ε-range query and the threshold-based k-nearest-neighbor query. Both consist of a query time series Q and a query threshold τ, as well as the query type specific parameters ε and k (cf. Definitions 10.4 and 10.5).
Given the query threshold τ, the first step of the query process is to extract the threshold-crossing time intervals S_{τ,Q} from the query time series Q. This can be done by a single scan through the query object Q. Next, we have to find those time series objects X from the database which fulfill the query predicate, i.e. the threshold distance d_TS(S_{τ,Q}, S_{τ,X}) ≤ ε in the case of the threshold-based ε-range query, or the objects which belong to the k closest objects to Q w.r.t. d_TS(S_{τ,Q}, S_{τ,X}) in the case of the threshold-based k-nearest-neighbor query.
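The single scan over the query object can be sketched as follows. This is a hedged Python sketch, not the original implementation: we assume that a threshold-crossing time interval is a maximal time interval during which the piecewise-linear time series lies strictly above τ, and the function name is hypothetical.

```python
from typing import List, Tuple


def threshold_crossing_intervals(
    ts: List[Tuple[float, float]], tau: float
) -> List[Tuple[float, float]]:
    """One pass over ts, a list of (time, value) pairs sorted by time.

    Assumption: values are linearly interpolated between samples.
    """
    intervals = []
    prev_t, prev_x = ts[0]
    # The series may already be above the threshold at its first sample.
    start = prev_t if prev_x > tau else None
    for t, x in ts[1:]:
        if prev_x <= tau < x:
            # Upward crossing: an interval opens at the interpolated time.
            start = prev_t + (tau - prev_x) / (x - prev_x) * (t - prev_t)
        elif x <= tau < prev_x:
            # Downward crossing: the open interval closes.
            end = prev_t + (tau - prev_x) / (x - prev_x) * (t - prev_t)
            intervals.append((start, end))
            start = None
        prev_t, prev_x = t, x
    if start is not None:
        # Series ends above the threshold: close at the last time stamp.
        intervals.append((start, prev_t))
    return intervals


# Hypothetical query time series and threshold.
query = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0), (3.0, 2.0), (4.0, 2.0)]
crossing = threshold_crossing_intervals(query, tau=1.0)
```

Each sample is visited exactly once, so the extraction cost is linear in the length of Q.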
A straightforward approach for the query process would be as follows: first, we access all parameter space segments of the database objects which intersect the time-interval plane at threshold τ by means of the R∗-tree index in order to retrieve the threshold-crossing time intervals of all database objects. Then, for each database object we compute the τ-similarity to the query object and evaluate the query predicate in order to build the result set. We only have to access the relevant parameter space segments instead of accessing the entire object. But we can process threshold queries in a more efficient way. In particular, for selective queries we do not need to access all parameter space segments of all time series objects covering the threshold amplitude τ. We can achieve a better query performance by using the R∗-tree index to prune the segments of those objects which cannot satisfy the query anymore as early as possible.
12.1 Preliminaries
In the following, we assume that each time series object X is represented by its threshold-crossing time intervals S_X = S_{τ,X} = x_1, ..., x_N, which correspond to a set of points lying on the time-interval plane P within the parameter space at query threshold τ. Hence, S_X denotes a set of two-dimensional points¹. Furthermore, let D denote the set of all time series objects and S denote the set of all time-interval points on P derived from all threshold-crossing time intervals S_{τ,X} of all objects X ∈ D.
For our proposal, we need two basic set operations on single time interval data (represented as points on the time-interval plane P), the ε-range set and the k-nearest-neighbor, which are defined as follows:
Definition 12.1 (ε-Range Set).
Let q ∈ P be a time interval, S = {x_i : i = 1..N} ⊆ P be a set of N time intervals and ε ∈ ℝ₀⁺ be the maximal similarity-distance parameter. Then the ε-range set of q is defined as follows:
R_{ε,S}(q) = {s ∈ S | d_int(s, q) ≤ ε}.
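Definition 12.1 translates directly into a filter over S. In the following Python sketch, time intervals are (lower, upper) points on P; the interval distance d_int is not defined in this section, so the Euclidean distance used here is only a stand-in assumption, and the function names are hypothetical.

```python
import math
from typing import Iterable, Set, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound) as a point on P


def d_int(s: Interval, q: Interval) -> float:
    """Assumed interval distance: Euclidean distance of the two points on P."""
    return math.hypot(s[0] - q[0], s[1] - q[1])


def eps_range_set(q: Interval, S: Iterable[Interval], eps: float) -> Set[Interval]:
    """All intervals s in S with d_int(s, q) <= eps."""
    return {s for s in S if d_int(s, q) <= eps}


# Hypothetical example data.
S = {(2.0, 5.0), (3.0, 5.0), (2.0, 9.0), (5.0, 9.0)}
result = eps_range_set((2.0, 5.0), S, eps=1.0)
```

A linear scan like this ignores the index; it is meant only to make the set semantics of the definition concrete.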
Definition 12.2 (k-Nearest-Neighbor).
Let q ∈ P be a time interval, S = {si : i = 1..N} ⊆ P be a set of N
¹ For the description of the threshold-crossing time intervals in the native space (set of