

Latent Semantic Space for Web Clustering

I-Jen Chiang1,3, Tsau Young (‘T. Y.’) Lin2, Hsiang-Chun Tsai3, Jau-Min Wong3, and Xiaohua Hu4

1 Graduate Institute of Medical Informatics, Taipei Medical University, 205, Wu-Hsien Street, Taipei, Taiwan, ROC

[email protected]

2 Department of Computer Science, San Jose State University, One Washington Square, San Jose, CA, USA

[email protected]

3 Graduate Institute of Biomedical Engineering, National Taiwan University, No. 1, Sec. 1, Jen-Ai Road, Taipei, Taiwan, ROC

4 College of Information Science and Technology, Drexel University, Philadelphia, PA 19104, USA

[email protected]

Summary. Organizing a huge number of Web pages into topics according to their relevance is an efficient and effective approach to information retrieval. A Latent Semantic Space (LSS), which naturally takes the form of a geometric structure in Combinatorial Topology, has been proposed for unstructured document clustering. Given a set of Web pages, the set of associations among frequently co-occurring terms in them naturally forms a CONCEPT, which is represented as a set of connected components of the simplicial complexes. Based on these concepts, Web pages can be clustered into meaningful categories.

1 Introduction

To handle documents adequately, a methodology to represent or reveal their latent semantics is needed. To date, no universally accepted, effective methodology has been found. In a previous paper [15], we pictured the latent semantics geometrically and called the result the Latent Semantic Space (LSS) of the given set of documents. We take the key terms as vertices and visualize the term-associations (frequently co-occurring terms) as a simplicial complex in LSS. Our thesis has been that a maximal connected component represents a CONCEPT in the LSS of a collection of documents. In [15], however, we did not explore the full thesis: we considered only the PRIMITIVE CONCEPTs of the highest dimension. Technically, we considered only the maximal connected components of the skeleton of the highest layer. In this paper, we explore the full notion of PRIMITIVE CONCEPTs, and the results are very encouraging. These results can be obtained directly from search engines. All the returned results are automatically clustered into different topics. The authoritative web pages in each topic are ranked by how closely they belong to the topic. The experimental results indicate that we have an effective way to organize the large number of pages returned by a web query.

I-J. Chiang et al.: Latent Semantic Space for Web Clustering, Studies in Computational Intelligence (SCI) 118, 61–77 (2008)

The Internet is an information ocean. How to marshal the large number of returned web pages, paragraphs, or sentences is the key issue. Roughly speaking, we decompose (triangulate, partition, granulate) the LSS of documents (e.g., returned web pages or sentences) into a simplicial complex in combinatorial topology [23], which can be viewed as a special form of hypergraph. However, we should note that the notion of simplicial complexes actually predates that of hypergraphs by about half a century, even though the latter notion is more familiar to modern computer scientists.
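To make the relation between the two notions concrete, the following minimal sketch uses the standard definition of an abstract simplicial complex: a hypergraph whose hyperedges are closed under taking nonempty subsets. The function name `is_simplicial_complex` and the set-based representation are illustrative assumptions, not notation from this paper, whose own definitions appear in Sect. 3.

```python
from itertools import combinations


def is_simplicial_complex(hyperedges):
    """Check the closure property of an abstract simplicial complex.

    Assumption: each hyperedge is given as an iterable of hashable
    vertices. The hypergraph qualifies only if every nonempty proper
    subset (face) of every hyperedge is itself a hyperedge.
    """
    edges = {frozenset(edge) for edge in hyperedges}
    for edge in edges:
        # Enumerate faces of every size from 1 up to len(edge) - 1.
        for size in range(1, len(edge)):
            for face in combinations(edge, size):
                if frozenset(face) not in edges:
                    return False
    return True
```

Under this definition, the hypergraph with hyperedges {a}, {b}, {a, b} is a simplicial complex, whereas a single hyperedge {a, b, c} on its own is not, because its edges and vertices are missing.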

Let us recall some examples to illustrate the main intuition. The association that consists of “wall” and “street” denotes some financial notions that have meaning beyond the two nodes, “wall” and “street”. This is similar to the notion of an open segment (v0, v1) that represents a one-dimensional geometric object, a 1-simplex, which carries information beyond the two end points. In general, an r-association represents some semantics generated by a set of r keywords, and it may carry more meaning than the individual keywords or even have nothing to do with them. A mathematical structure that reflects such phenomena is the notion of simplicial complex in combinatorial topology; see Sect. 3.
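The idea of r-associations as simplices can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in this paper: documents are tokenized on whitespace, "frequent" means occurring together in at least `min_support` documents, and the names `term_associations`, `min_support`, and `max_size` are hypothetical. A set of r co-occurring terms is recorded as an (r - 1)-simplex, and a candidate is counted only if all of its faces are already frequent.

```python
from itertools import combinations


def term_associations(docs, min_support, max_size=3):
    """Return frequent co-occurring term sets with their document counts.

    Assumptions (not from the paper): whitespace tokenization, support
    measured as the number of documents containing all terms, and a cap
    of max_size terms per association.
    """
    doc_terms = [set(doc.lower().split()) for doc in docs]
    vocabulary = sorted(set().union(*doc_terms))
    simplices = {}
    for r in range(1, max_size + 1):
        for combo in combinations(vocabulary, r):
            # Pruning: an r-association can be frequent only if every
            # (r - 1)-term subset of it is already frequent.
            if r > 1 and any(
                frozenset(face) not in simplices
                for face in combinations(combo, r - 1)
            ):
                continue
            support = sum(1 for terms in doc_terms if terms.issuperset(combo))
            if support >= min_support:
                simplices[frozenset(combo)] = support
    return simplices
```

For the three toy documents "wall street bank", "wall street stock", and "street map" with `min_support=2`, the sketch keeps the vertices "wall" and "street" and the 1-simplex {"wall", "street"}, which mirrors the example in the text. Because of the pruning step, the returned family is closed under taking faces.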

The thesis of this paper is that the simplicial complex of term-associations reflects the structure of the concepts in the LSS of the documents. Based on such conceptual structure, the documents (returned pages, paragraphs, or sentences) can be effectively clustered.
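A rough sketch of this thesis, under assumptions of our own choosing: concepts are taken to be the connected components of the complex (two terms are connected when some simplex contains both), and each document is assigned to the concept with which it shares the most terms. The assignment rule and the names `connected_concepts` and `assign_documents` are hypothetical and are not claimed to be the clustering procedure of this paper.

```python
def connected_concepts(simplices):
    """Group terms into connected components using union-find."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for simplex in simplices:
        terms = sorted(simplex)
        if not terms:
            continue
        root = find(terms[0])
        for other in terms[1:]:
            parent[find(other)] = root  # merge the two components

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    # Sort components by their sorted term lists for a stable order.
    return sorted((frozenset(g) for g in groups.values()), key=sorted)


def assign_documents(docs, concepts):
    """Label each document with the index of its best-matching concept.

    Assumption: the best match is the concept sharing the most terms;
    a document sharing no term with any concept is labeled None.
    """
    labels = []
    for doc in docs:
        terms = set(doc.lower().split())
        overlaps = [len(terms & concept) for concept in concepts]
        best = max(range(len(concepts)), key=overlaps.__getitem__, default=None)
        labels.append(best if best is not None and overlaps[best] > 0 else None)
    return labels
```

Given the simplices {"wall", "street"}, {"stock", "street"}, and {"music", "jazz"}, the first two merge into one concept through the shared vertex "street", while the third forms a concept of its own.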
