SARDINIA. GEOLOGICAL SETTING AND STRATIGRAPHY
5.2 Post-unconformity succession
The main results include a procedure for compressing time series, indexing of compressed series by their prominent features, and retrieval of series
0 0 0
0 0 0
200 151 82
202
395 392
perfect ranking perfect ranking perfect ranking
fast ranking
perfect ranking perfect ranking perfect ranking
fast ranking (a) Search for twelve-leg patterns of air and sea temperatures.
(b) Search for nine-leg patterns of wind speeds.
(c) Search for electroencephalogram patterns with twenty legs.
0 200 0 89 0 35
Fig. 15. Retrieval of weather and electroencephalogram patterns. The horizontal axes show the similarity ranks assigned by the fast algorithm, and the vertical axes are the exhaustive-search ranks.
whose compressed representation is similar to the compressed pattern.
The experiments have shown the effectiveness of this technique for index-ing of stock prices, weather data, and electroencephalograms. We plan to apply it to other time-series domains and study the factors that affect its effectiveness.
We are working on an extended version of the compression procedure, which will assign different importance levels to the extrema of time series,
leg. The retrieval algorithm identifies these candidates and then compares them with the pattern. The number of candidates depends on the search parameters C and D.
Search Stock Air and sea Wind speeds EEG
parameters prices temperatures
C = D = 1.5 270 1,300 970 40
C = D = 2 440 2,590 1,680 70
C = D = 5 1,090 11,230 2,510 220
and allow construction of a hierarchical indexing structure [Gandhi (2003)].
We also aim to extend the developed technique for finding patterns that are stretched over time, and apply it to identifying periodic patterns, such as weather cycles.
Acknowledgements
We are grateful to Mark Last, Eamonn Keogh, Dmitry Goldgof, and Rafael Perez for their valuable comments and suggestions. We also thank Savvas Nikiforou for his comments and help with software installations.
References
1. Aggarwal, C.C. and Yu, Ph.S. (2000). The IGrid Index: Reversing the Dimen-sionality Curse for Similarity Indexing in High-Dimensional Space. Proceed-ings of the Sixth ACM International Conference on Knowledge Discovery and Data Mining, pp. 119–129.
2. Agrawal, R., Psaila, G., Wimmers, E.L., and Zait, M. (1995). Querying Shapes of Histories. Proceedings of the Twenty-First International Confer-ence on Very Large Data Bases, pp. 502–514.
3. Agrawal, R., Mehta, M., Shafer, J.C., Srikant, R., Arning, A., and Bollinger, T. (1996). The Quest Data Mining System. Proceedings of the Second ACM International Conference on Knowledge Discovery and Data Mining, pp.
244–249.
4. Andr´e-J¨onsson, H. and Badal, D.Z. (1997). Using Signature Files for Query-ing Time-Series Data. ProceedQuery-ings of the First European Symposium on Prin-ciples of Data Mining and Knowledge Discovery, pp. 211–220.
5. Bollobas, B., Das, G., Gunopulos, D., and Mannila, H. (1977). Time-Series Similarity Problems and Well-Separated Geometric Sets. Proceedings of the Thirteenth Annual Symposium on Computational Geometry, pp. 454–456.
6. Bozkaya, T. and ¨Ozsoyoglu, Z.M. (1999). Indexing Large Metric Spaces for Similarity Search Queries. ACM Transactions on Database Systems 24(3), 361–404.
7. Bozkaya, T., Yazdani, N., and ¨Ozsoyoglu, Z.M. (1997). Matching and Index-ing Sequences of Different Lengths. ProceedIndex-ings of the Sixth International Conference on Information and Knowledge Management, pp. 128–135.
8. Brockwell, P.J. and Davis, R.A. (1996). Introduction to Time Series and Forecasting. Springer-Verlag, Berlin, Germany.
9. Burrus, C.S., Gopinath, R.A., and Guo, H. (1997). Introduction to Wavelets and Wavelet Transforms: A Primer. Prentice Hall, Englewood Cliffs, NJ.
10. Caraca-Valente, J.P. and Lopez-Chavarrias, I. (2000). Discovering Similar Patterns in Time Series. Proceedings of the Sixth ACM International Con-ference on Knowledge Discovery and Data Mining, pp. 497–505.
11. Chan, K.-P. and Fu, A.W.-C. (1999). Efficient Time Series Matching by Wavelets. Proceedings of the Fifteenth International Conference on Data Engineering, pp. 126–133.
12. Chan, K.-P., Fu, A.W.-C., and Yu, C. (2003). Haar Wavelets for Efficient Similarity Search of Time-Series: With and Without Time Warping. IEEE Transactions on Knowledge and Data Engineering, Forthcoming.
13. Chi, E.H.-H., Barry, Ph., Shoop, E., Carlis, J.V., Retzel, E., and Riedl, J.
(1995). Visualization of Biological Sequence Similarity Search Results. Pro-ceedings of the IEEE Conference on Visualization, pp. 44–51.
14. Cortes, C., Fisher, K., Pregibon, D., and Rogers, A. (2000). Hancock: A Lan-guage for Extracting Signatures from Data Streams. Proceedings of the Sixth ACM International Conference on Knowledge Discovery and Data Mining, pp. 9–17.
15. Das, G., Gunopulos, D., and Mannila, H. (1997). Finding Similar Time Series.
Proceedings of the First European Conference on Principles of Data Mining and Knowledge Discovery, pp. 88–100.
16. Das, G., Lin, K-I., Mannila, H., Renganathan, G., and Smyth, P. (1998). Rule Discovery from Time Series. Proceedings of the Fourth ACM International Conference on Knowledge Discovery and Data Mining, pp. 16–22.
17. Deng, K. (1998). OMEGA: On-Line Memory-Based General Purpose Sys-tem Classifier. PhD thesis, Robotics Institute, Carnegie Mellon University.
Technical Report CMU-RI-TR-98-33.
18. Domingos, P. and Hulten, G. (2000). Mining High-Speed Data Streams. Pro-ceedings of the Sixth ACM International Conference on Knowledge Discovery and Data Mining, pp. 71–80.
19. Edelsbrunner, H. (1981). A Note on Dynamic Range Searching. Bulletin of the European Association for Theoretical Computer Science,15, 34–40.
20. Fountain, T., Dietterich, T.G., and Sudyka, B. (2000). Mining IC Test Data to Optimize VLSI Testing. Proceedings of the Sixth ACM International Conference on Knowledge Discovery and Data Mining, pp. 18–25.
21. Franses, P.H. (1998). Time Series Models for Business and Economic Fore-casting. Cambridge University Press, Cambridge, United Kingdom.
22. Galka, A. (2000). Topics in Nonlinear Time Series Analysis with Implications for EEG Analysis, World Scientific, Singapore.
23. Gandhi, H.S. (2003). Important Extrema of Time Series: Theory and Appli-cations. Master’s thesis, Computer Science and Engineering, University of South Florida. Forthcoming.
24. Gavrilov, M., Anguelov, D., Indyk, P., and Motwani, R. (2000). Mining the Stock Market: Which Measure is Best. Proceedings of the Sixth ACM Inter-national Conference on Knowledge Discovery and Data Mining, pp. 487–496.
25. Geva., A.B. (1999). Hierarchical-Fuzzy Clustering of Temporal Patterns and its Application for Time-Series Prediction. Pattern Recognition Letters, 20(14), 1519–1532.
26. Goldin, D.Q., and Kanellakis, P.C. (1995). On Similarity Queries for Time-Series Data: Constraint Specification and Implementation. Proceedings of the First International Conference on Principles and Practice of Constraint Pro-gramming, pp. 137–153.
27. Graps, A.L. (1995). An Introduction to Wavelets. IEEE Computational Sci-ence and Engineering,2(2), 50–61.
28. Grenander, U. (1996). Elements of Pattern Theory, Johns Hopkins University Press, Baltimore, MD.
29. Guralnik, V., and Srivastava, J. (1999). Event Detection from Time Series Data. Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 33–42.
30. Guralnik, V., Wijesekera, D., and Srivastava, J. (1998). Pattern-Directed Mining of Sequence Data. Proceedings of the Fourth ACM International Conference on Knowledge Discovery and Data Mining, pp. 51–57.
31. Gusfield, D. (1997). Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, Cambridge University Press, Cambridge, United Kingdom.
32. Han, J., Gong, W., and Yin, Y. (1998). Mining Segment-Wise Peri-odic Patterns in Time-Related Databases. Proceedings of the Fourth ACM International Conference on Knowledge Discovery and Data Mining, pp. 214–218.
33. Haslett, J., and Raftery, A.E. (1989). Space-Time Modeling with Long-Memory Dependence: Assessing Ireland’s Wind Power Resource. Applied Statistics,38, 1–50.
34. Huang, Y.-W. and Yu, P.S. (1999). Adaptive Query Processing for Time-Series Data. Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining, pp. 282–286.
35. Ikeda, K., Vaughn, B.V., and Quint, S.R. (1999). Wavelet Decomposition of Heart Period Data. Proceedings of the IEEE First Joint BMES/EMBS Conference, pp. 3–11.
36. Keogh, E.J. and Pazzani, M.J. (1998). An Enhanced Representation of Time Series which Allows Fast and Accurate Classification, Clustering and Rele-vance Feedback. Proceedings of the Fourth ACM International Conference on Knowledge Discovery and Data Mining, pp. 239–243.
37. Keogh, E.J., and Pazzani, M.J. (2000). Scaling up Dynamic Time Warping for Data Mining Applications. Proceedings of the Sixth ACM International Conference on Knowledge Discovery and Data Mining, pp. 285–289.
38. Keogh, E.J., Chu, S., Hart, D., and Pazzani, M.J. (2001). An Online Algo-rithm for Segmenting Time Series. Proceedings of the IEEE International Conference on Data Mining, pp. 289–296.
39. Keogh, E.J. (1997). Fast Similarity Search in the Presence of Lon-gitudinal Scaling in Time Series Databases. Proceedings of the Ninth IEEE International Conference on Tools with Artificial Intelligence, pp. 578–584.
40. Kim, S.-W., Park, S., and Chu, W.W. (2001). An Index-Based Approach for Similarity Search Supporting Time Warping in Large Sequence Databases.
Proceedings of the Seventeenth International Conference on Data Engineer-ing, pp. 607–614.
41. Lam, S.K. and Wong, M.H. (1998). A Fast Projection Algorithm for Sequence Data Searching. Data and Knowledge Engineering,28(3), 321–339.
42. Last, M., Klein, Y., and Kandel, A. (2001). Knowledge Discovery in Time Series Databases. IEEE Transactions on Systems, Man, and Cybernetics, Part B,31(1), 160–169.
43. Lee, S.-L., Chun, S.-J., Kim, D.-H., Lee, J.-H., and Chung, C.-W.
(2000). Similarity Search for Multidimensional Data Sequences. Pro-ceedings of the Sixteenth International Conference on Data Engineering, pp. 599–608.
44. Li, C.-S., Yu, P.S., and Castelli, V. (1998). MALM: A Framework for Mining Sequence Database at Multiple Abstraction Levels. Proceedings of the Sev-enth International Conference on Information and Knowledge Management, pp. 267–272.
45. Lin, L., and Risch, T. (1998). Querying Continuous Time Sequences. Pro-ceedings of the Twenty-Fourth International Conference on Very Large Data Bases, pp. 170–181.
46. Lu, H., Han, J., and Feng, L. (1998). Stock Movement Prediction and N-Dimensional Inter-Transaction Association Rules. Proceedings of the ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Dis-covery, pp. 12:1–7.
47. Maurer, K., and Dierks, T. (1991). Atlas of Brain Mapping: Topographic Mapping of EEG and Evoked Potentials, Springer-Verlag, Berlin, Germany.
48. Niedermeyer, E., and Lopes DaSilva, F. (1999). Electroencephalography: Basic Principles, Clinical Applications, and Related Fields. Lippincott, Williams and Wilkins, Baltimore, MD, fourth edition.
49. Park, S., Lee, D., and Chu, W.W. (1999). Fast Retrieval of Similar Subse-quences in Long Sequence Databases. Proceedings of the Third IEEE Knowl-edge and Data Engineering Exchange Workshop.
50. Park, S., Chu, W.W., Yoon, J., and Hsu, C. (2000). Efficient Searches for Sim-ilar Subsequences of Different Lengths in Sequence Databases. Proceedings of the Sixteenth International Conference on Data Engineering, pp. 23–32.
51. Park, S., Kim, S.-W., and Chu, W.W. (2001). Segment-Based Approach for Subsequence Searches in Sequence Databases. Proceedings of the Sixteenth ACM Symposium on Applied Computing, pp. 248–252.
52. Perng, C.-S., Wang, H., Zhang, S.R., and Scott Parker, D. (2000). Land-marks: A New Model for Similarity-Based Pattern Querying in Time Series Databases. Proceedings of the Sixteenth International Conference on Data Engineering, pp. 33–42.
53. Policker, S., and Geva, A.B. (2000). Non-Stationary Time Series Analysis by Temporal Clustering. IEEE Transactions on Systems, Man, and Cybernetics, Part B,30(2), 339–343.
54. Pratt, K.B. and Fink, E. (2002). Search for Patterns in Compressed Time Series. International Journal of Image and Graphics,2(1), 89–106.
55. Pratt, K.B. (2001). Locating patterns in discrete time series. Master’s thesis, Computer Science and Engineering, University of South Florida.
56. Qu, Y., Wang, C., and Wang, X.S. (1998). Supporting Fast Search in Time Series for Movement Patterns in Multiple Scales. Proceedings of the Seventh International Conference on Information and Knowledge Management, pp.
251–258.
57. Sahoo, P.K., Soltani, S., Wong, A.K.C., and Chen, Y.C. (1988). A Survey of Thresholding Techniques. Computer Vision, Graphics and Image Processing, 41, 233–260.
58. Samet, H. (1990). The Design and Analysis of Spatial Data Structures.
Addison-Wesley, Reading, MA.
59. Sheikholeslami, G., Chatterjee, S., and Zhang, A. (1998). WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases.
Proceedings of the Twenty-Fourth International Conference on Very Large Data Bases, pp. 428–439.
60. Singh, S., and McAtackney, P. (1998). Dynamic Time-Series Forecasting Using Local Approximation. Proceedings of the Tenth IEEE International Conference on Tools with Artificial Intelligence, pp. 392–399.
61. Stoffer, D.S. (1999). Detecting Common Signals in Multiple Time Series Using the Spectral Envelope. Journal of the American Statistical Associa-tion, 94, 1341–1356.
62. Yaffee, R.A. and McGee, M. (2000). Introduction to Time Series Analysis and Forecasting, Academic Press, San Diego, CA.
63. Yi, B.-K, Sidiropoulos, N.D., Johnson, T., Jagadish, H.V., Faloutsos, C., and Biliris, A. (2000). Online Data Mining for Co-Evolving Time Series.
Proceedings of the Sixteenth International Conference on Data Engineering, pp. 13–22.
CHAPTER 4