
The output may differ considerably for different approaches.

• Degrees of equivalency. Apart from the information that localization algorithms exploit and how they manipulate different tools and resources, another important class of dimensions concerns the form of the result these systems produce. The kind of equivalence between the ontological terms and their translations might be of importance. For example, ISO 5964 [ISO, 1985] defines a classification scheme for different types of equivalence between terms: exact equivalence, partial equivalence, single-to-multiple equivalence, inexact equivalence and non-equivalence. The simplest cases are one-to-one translations. However, in the real world, one will often encounter n-to-m translations instead.

• Confidence. Another significant distinction in the output results concerns the confidence measures of the translations. Only recently have researchers started to investigate confidence measures for machine translation [Ueffing et al., 2003, Gandrabur and Foster, 2003, Blatz et al., 2004, Quirk, 2004]. Possible applications of the confidence measures include: i) post-editing, where words with low confidence could be marked as potential errors, ii) improving translation prediction accuracy, iii) combining output from different machine translation systems: hypotheses with low confidence can be discarded before selecting one of the system translations [Akiba et al., 2004], or the word confidence scores can be used for generating new hypotheses from the output of different systems [Jayaraman and Lavie, 2005], or the confidence value can be employed for re-ranking [Blatz et al., 2004]. We consider the confidence measure an essential factor for any ontology localization system.

• Provenance. The knowledge of provenance and its effects on the localization activity is another important factor to consider. Although there are no conclusive studies on whether provenance information about translation suggestions that combine different techniques has an impact on quality and speed of revision [Teixeira, 2011], we believe that this information should be taken into account when analyzing and comparing the results of different ontology localization systems. Different dimensions could be proposed taking into account the levels of provenance information that a system could provide to stakeholders, for example, the resources or algorithms used.
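The three output dimensions above can be pictured as fields of a single translation record. The following is only a minimal sketch under stated assumptions: the names `Equivalence`, `Translation`, `flag_for_post_editing` and `best_per_label` are hypothetical, the confidence is assumed to be a score in [0, 1], and the thresholds are arbitrary. It illustrates two of the uses of confidence mentioned above: marking low-confidence results for post-editing, and discarding low-confidence hypotheses before selecting one translation per label.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Equivalence(Enum):
    """Equivalence types named in the text for ISO 5964."""
    EXACT = "exact"
    PARTIAL = "partial"
    SINGLE_TO_MULTIPLE = "single-to-multiple"
    INEXACT = "inexact"
    NON_EQUIVALENCE = "non-equivalence"


@dataclass
class Translation:
    """One translation hypothesis for an ontology label (hypothetical record)."""
    source_label: str
    target_label: str
    equivalence: Equivalence
    confidence: float  # assumption: a score in [0, 1]
    provenance: List[str] = field(default_factory=list)  # resources or algorithms used


def flag_for_post_editing(translations: List[Translation],
                          threshold: float = 0.5) -> List[Translation]:
    """Return the hypotheses whose confidence falls below the threshold."""
    return [t for t in translations if t.confidence < threshold]


def best_per_label(translations: List[Translation],
                   min_confidence: float = 0.0) -> Dict[str, Translation]:
    """Drop hypotheses below min_confidence, then keep the most confident
    remaining hypothesis for each source label."""
    best: Dict[str, Translation] = {}
    for t in translations:
        if t.confidence < min_confidence:
            continue
        current = best.get(t.source_label)
        if current is None or t.confidence > current.confidence:
            best[t.source_label] = t
    return best
```

Keeping provenance as a plain list of resource or algorithm names is the simplest choice; a system that exposes finer levels of provenance would need a richer structure.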

5.9.4 Use Case

Ontology localization can contribute to the solution of different applications. A typical example is multilingual ontology matching (MOM). MOM refers to the process of establishing relationships among ontological resources from two or more independent ontologies where each ontology is labeled in a different natural language [Fu et al., 2009b, Trojahn et al., 2008]. MOM requires the support of ontology localization because it is achieved in two steps: first, the labels of a source ontology are localized into the target natural language; then, monolingual ontology matching techniques are applied to the translated source ontology and the target ontology to establish matching relationships.
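The two steps of MOM can be sketched as follows. This is an illustration only, not the method of the cited works: the function names are hypothetical, the translation step is a toy dictionary lookup standing in for a real localization system, and the monolingual matcher is reduced to case-insensitive string equality.

```python
def localize_labels(source_labels, translate):
    """Step 1: map each source label to its translation in the target language."""
    return {label: translate(label) for label in source_labels}


def match_monolingual(translated, target_labels):
    """Step 2: pair source and target labels whose normalized forms coincide.

    Normalization here is only strip + lowercase; a real matcher would use
    richer monolingual techniques.
    """
    normalized_targets = {t.strip().lower(): t for t in target_labels}
    matches = []
    for source_label, translation in translated.items():
        key = translation.strip().lower()
        if key in normalized_targets:
            matches.append((source_label, normalized_targets[key]))
    return matches


# Hypothetical toy dictionary (Spanish to English) used in place of a translator.
toy_dictionary = {"río": "river", "montaña": "mountain", "lago": "lake"}

translated = localize_labels(
    ["río", "montaña", "lago"],
    lambda label: toy_dictionary.get(label, label),
)
matches = match_monolingual(translated, ["River", "Mountain", "Sea"])
print(matches)  # [('río', 'River'), ('montaña', 'Mountain')]
```

The label "lago" stays unmatched because the target ontology in this toy example has no "lake" concept, which shows that the quality of the matching depends directly on the labels produced by the localization step.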

We believe that even though use cases might not be directly reflected in the input, process, or output, they definitely influence the complete setting of the localization process. Therefore, the use case must be considered a factor for distinction.

5.10 Summary of the Chapter

Ontology localization has different facets; one of these facets is translation. To automate the translation task, a variety of techniques can be used. The classifications discussed in this chapter provide a common conceptual basis to analyze the advantages and shortcomings of each technique with regard to the localization activity.

We have provided such classifications based on the way of modeling the context used for the translation on one side and the kind of technology used to localize an ontology into different natural languages on the other. Once the different translation techniques had been identified, we presented the strategic issues involved in creating localization solutions. In particular, this involves the composition of basic translation techniques and the combination of their results.

We have closed this chapter by describing some high-level factors that can be used to classify the approaches used to localize an ontology into different natural languages.

Life-Cycle Model and Architecture

In this chapter we discuss two important issues related to the ontology localization activity: life-cycle and system architecture. As we discussed in the introductory chapter, a typical localization project involves several tasks that extend far beyond the translation process itself. This is why the first goal of this chapter is to describe the life-cycle model by representing the major components of this activity and their interrelationships in a graphical framework that can be easily understood and communicated. As the second goal of this chapter, we outline our approach to the definition of a system architecture that supports the ontology localization activity. The proposed model comprises the system components, the externally visible properties of those components, and the relationships (e.g., the behavior) between them, and it provides a base from which localization systems can be developed.

First, we give an intuitive view of the whole localization activity, including the translation phase, which was extensively described in the previous chapters. Later in this chapter, we introduce some basic requirements for an ontology localization system. Then, we propose a system architecture based on the ontology localization life-cycle model, considering also the system requirements identified from different works in related areas. After defining the architecture, we present the main modules needed to support such an ontology localization approach in distributed and collaborative environments. Finally, we give general comments and technical details about the LabelTranslator system, our approach to performing automated localization in distributed and collaborative environments.
