
Within the knowledge discovery process (Section 3.1), the feature-based group extraction approach is an essential component: it gives access to underlying effects, such as movement characteristics, that cannot be found manually, and it finds groups of possible interest within large-scale trajectory databases. Representing a trajectory by a feature (Section 3.3.3) reduces the dimensionality enormously, which facilitates the application of feature-based group extraction methods such as clustering and filtering (Figure 3.21). An exemplary application of a fuzzy c-means clustering method with four clusters is depicted in Figure 3.21A, whereas the filtering approach in Figure 3.21B is used to segregate two opposing sides within the benchmark database TDB S3. Furthermore, existing prior knowledge (Section 3.2) can be incorporated into these computational methods, for example by setting the number of clusters or by specifying relevant filter intervals, to efficiently extract groups of potential interest. Once the groups are extracted, interactive modifications can refine the results through slight adjustments of the allocated information (Section 4.2). The computational feature-based group extraction methods therefore make it easy to explore huge trajectory databases when prior knowledge exists, enabling the effective investigation of emerging effects (Section 3.2.2).
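The two extraction methods named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the framework's implementation: the function names, the synthetic end points, the farthest-point initialisation and the fuzzifier m = 2 are all hypothetical choices made for the example. Only the end-point feature, the interval filter and the use of fuzzy c-means with four clusters are taken from the text.

```python
import math

def end_point_xy(trajectory):
    """Feature: (x, y) of the last point of a trajectory [(x, y, z), ...]."""
    x, y, _z = trajectory[-1]
    return (x, y)

def filter_by_interval(features, axis, low, high):
    """Indices of all features whose component `axis` lies in [low, high]."""
    return [i for i, f in enumerate(features) if low <= f[axis] <= high]

def farthest_point_init(points, n_clusters):
    """Assumed deterministic start: first point, then the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < n_clusters:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50):
    """Standard fuzzy c-means updates; returns (centers, memberships)."""
    centers = farthest_point_init(points, n_clusters)
    dim = len(points[0])
    memberships = []
    for _ in range(n_iter):
        # Membership of point p in cluster k: 1 / sum_j (d_k / d_j)^(2 / (m - 1)).
        memberships = []
        for p in points:
            d = [max(math.dist(p, c), 1e-12) for c in centers]
            memberships.append([
                1.0 / sum((d[k] / d[j]) ** (2.0 / (m - 1.0)) for j in range(n_clusters))
                for k in range(n_clusters)
            ])
        # Each center becomes the mean of all points weighted by membership^m.
        centers = []
        for k in range(n_clusters):
            w = [u[k] ** m for u in memberships]
            centers.append(tuple(
                sum(wi * p[a] for wi, p in zip(w, points)) / sum(w) for a in range(dim)
            ))
    return centers, memberships

def hard_labels(memberships):
    """Assign every point to the cluster with the highest membership."""
    return [max(range(len(u)), key=u.__getitem__) for u in memberships]

# Synthetic example (hypothetical data): tracks from the origin towards four quadrants.
ends = [(500, 480), (520, 510), (-500, 490), (-510, 520),
        (-490, -500), (-520, -480), (500, -510), (480, -490)]
trajectories = [[(0.0, 0.0, 0.0), (ex / 2, ey / 2, 5.0), (float(ex), float(ey), 10.0)]
                for ex, ey in ends]
features = [end_point_xy(t) for t in trajectories]

# Prior knowledge enters as the number of clusters and as the filter interval.
_, memberships = fuzzy_c_means(features, n_clusters=4)
labels = hard_labels(memberships)
positive_x = filter_by_interval(features, axis=0, low=0.0, high=float("inf"))
```

In this sketch each trajectory is reduced from a full point sequence to two numbers before any grouping takes place, which is the dimensionality reduction the paragraph refers to.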

For the clustering approach in trajectory data, time series clustering using the time series representing the spatial location in X-, Y- and Z-direction is also possible. However, time series clustering is inappropriate for trajectories that are not error-free, which makes it inapplicable to trajectory problem classes (Section 2.1) comprising fragmented tracking data. Even if the trajectories span the whole time interval without fragmentation artifacts, as in the problem classes of lower complexity (Section 2.1), time series clustering is not suitable for all group extraction tasks. For example, Figure 3.22A illustrates two groups of trajectories from benchmark TDB S4 (Section 2.4.2): one group (green) that moves straight and another group (magenta) that displays a turn-around movement. The result of time series clustering

[Figure 3.21, panels A1, A2, B1, B2. Axis labels: End Point X, End Point Y, Track number.]

Figure 3.21: Feature-based group extraction methods. (A) Clustering the X- and Y-position of the end points into four clusters (A2), with the result applied to the 3D trajectory data (A1). (B) Filtering approach on the basis of the end point in X-direction, segregating two opposite sides (B1), with the result applied to the 3D trajectory data (B2). Here, benchmark TDB S2 is used for validation purposes.

using the X-position, Y-position and Z-position time series is depicted in Figure 3.22B, separating two symmetrical groups instead of the groups distinguishable by feature (Figure 3.22A). In conclusion, time series clustering of trajectory data is not possible in the presence of fragmentation and higher levels of complexity (Section 2.1). Furthermore, not all characteristics can be separated using time series clustering. In these cases, features extracted for each trajectory (Section 3.3.3) can be used to access the underlying characteristics even in the presence of fragmentation artifacts.

Figure 3.22: Time series clustering on trajectory data using benchmark database TDB S4. (A) Ground truth groups. Here, the green group is moving straight, whereas the magenta group performs a turn-around movement. (B) Result of time series clustering using the spatial coordinate time series in X-, Y- and Z-direction.
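The difference between the two representations can be made concrete with a small sketch. It assumes a hypothetical feature, a straightness index (net displacement divided by path length), which is not taken from Section 3.3.3 and serves only to illustrate the argument. The function names and the synthetic tracks are likewise assumptions. Flattening the coordinate series into one vector requires tracks of identical length and therefore fails for a fragment, whereas the per-trajectory feature is defined for tracks of any length.

```python
import math

def as_time_series_vector(trajectory, expected_length):
    """Flatten the X-, Y-, Z-series into one vector; only valid for complete tracks."""
    if len(trajectory) != expected_length:
        raise ValueError("fragmented track: time series length differs")
    return [c for point in trajectory for c in point]

def straightness(trajectory):
    """Hypothetical feature: net displacement / path length (1.0 = straight line)."""
    path = sum(math.dist(a, b) for a, b in zip(trajectory, trajectory[1:]))
    if path == 0.0:
        return 0.0
    return math.dist(trajectory[0], trajectory[-1]) / path

# Synthetic tracks (assumed data): one straight, one turning around, one fragment.
straight = [(float(t), 0.0, 0.0) for t in range(11)]
turn_around = ([(float(t), 0.0, 0.0) for t in range(6)]
               + [(float(5 - t), 0.0, 0.0) for t in range(1, 6)])
fragment = straight[:4]

for name, track in [("straight", straight), ("turn-around", turn_around), ("fragment", fragment)]:
    print(name, straightness(track))

try:
    as_time_series_vector(fragment, expected_length=11)
except ValueError as error:
    print("time series representation rejected:", error)
```

The fragment of the straight track receives the same feature value as the complete straight track, so both would fall into the same group even though their time series cannot be compared point by point.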

In feature-based group extraction approaches, the incorporation of prior knowledge as described in Section 3.2, as well as the type of analysis approach, is highly relevant for the resulting quality and the required time effort. There are one-step approaches (OS), which access the extracted information in a single analysis step applied globally to the whole data, and multi-step approaches (MS), which use a hierarchical analysis strategy that allows several methodological approaches to build on one another. The resulting time effort and quality are depicted in Figure 3.23. Here, four benchmark databases belonging to the highly complex problem classes introduced in Section 2.1, namely TDB S8, TDB S9, TDB S10 and TDB S11, are used for validation purposes. The complexity of the benchmark databases increases with the number of the benchmark, so that TDB S11 is the most complex and TDB S8 the least complex of the four. The quality is assessed using the accuracy, which describes the percentage of trajectories correctly assigned to the ground truth groups.

It can be observed that fully automatic one-step approaches require a negligible time effort compared to the manual incorporation of prior knowledge. However, the accuracy achieved by the automated OS approaches is far below that of the approaches incorporating prior knowledge (Figure 3.23). Furthermore, the higher the problem class, the lower the accuracy. Even for the highly complex benchmarks (Section 2.4.2) used for this validation, multi-step approaches incorporating prior knowledge lead to perfect group extractions with 100% accuracy, at the cost of an increased time effort (see Table C.9 for detailed numbers). Moreover, the total time effort with existing prior knowledge stays at an acceptable level, not exceeding 100 seconds. Even without prior knowledge, the time effort is at most five times higher, not exceeding 550 seconds. The analysis was performed by an experienced user of the framework, which suggests that an inexperienced user may need two to three times as long, that is, approximately 3 to 5 minutes in total with existing prior knowledge, which is still very fast. Without prior knowledge, the multi-step approaches also reach 100% accuracy, but with a much higher time effort (Figure 3.23). An increase in the complexity of the benchmark leads to a disproportionate rise in the required time effort. In conclusion, even highly complex trajectory datasets can be investigated and relevant groups can

[Figure 3.23 plot. Axes: Accuracy (0.1 to 1) and Time Effort [s] (0 to 600). Legend: OS_Autom, OS_Prior, MS_Prior and MS_No-Prior for each of TDB_S8, TDB_S9, TDB_S10 and TDB_S11.]

Figure 3.23: The impact of one-step and multi-step analysis approaches combined with the integration of prior knowledge. For the four benchmark databases TDB S8, TDB S9, TDB S10 and TDB S11, a fully automatic approach (circles), a one-step approach using prior knowledge (stars), a multi-step approach using prior knowledge (rectangles) and a multi-step approach without prior knowledge (triangles) are validated using the required time effort and the accuracy of the group assignments. The corresponding numbers for time effort and accuracy are listed in Table C.9.

be extracted with 100% accuracy. The time effort varies considerably depending on the extent of existing prior knowledge and the complexity of the underlying database (Figure 3.23).
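The accuracy used for this validation, the share of trajectories assigned to the correct ground truth group, can be computed as sketched below. The text does not state how extracted groups are matched to ground truth groups, so the sketch assumes a one-to-one matching that maximises the number of agreements. The function name and the example labels are hypothetical.

```python
from itertools import permutations

def group_accuracy(predicted, truth):
    """Fraction of trajectories whose extracted group maps to their ground truth group.

    Assumption: group labels are arbitrary, so the best one-to-one mapping from
    predicted labels to ground truth labels is searched exhaustively.
    """
    if len(predicted) != len(truth) or not truth:
        raise ValueError("need two non-empty label lists of equal length")
    pred_labels = list(dict.fromkeys(predicted))
    true_labels = list(dict.fromkeys(truth))
    # Pad with None so that surplus predicted groups map to no ground truth group.
    candidates = true_labels + [None] * max(0, len(pred_labels) - len(true_labels))
    best = 0
    for assignment in permutations(candidates, len(pred_labels)):
        mapping = dict(zip(pred_labels, assignment))
        correct = sum(1 for p, t in zip(predicted, truth) if mapping[p] == t)
        best = max(best, correct)
    return best / len(truth)

# Hypothetical labels for four trajectories in two ground truth groups.
truth = ["straight", "straight", "turn", "turn"]
print(group_accuracy([1, 1, 0, 0], truth))  # perfect extraction, labels merely renamed
print(group_accuracy([0, 0, 0, 1], truth))  # one trajectory in the wrong group
```

The exhaustive search is adequate for the handful of groups considered here. For many groups an assignment solver would be the more economical choice.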