

The MAGE-OM model provides an instrument for creating structured documents of MIAME-compliant information. This is already a vast improvement for data mining, but mining could be simplified further if standardization were also applied to natural-language terms. The model does not prescribe terms for experimental annotations. Free-form annotations are problematic: it is hard to search a free-text database in which the same process or material may be described with different terms. It is also hard for the experimenter to infer the meaning of different terms and to know how they should be used within an experimental annotation. Therefore, a way of defining terms for experimental annotations is required.

One way of reducing free-text descriptions is to represent a hierarchical ordering of terms as a hierarchy of classes directly in the data model. This requires a class for each possible value (e.g. an organism part or an instrument). The approach is inflexible because it introduces a large number of classes that have no attributes or methods to differentiate them, only their names. It also leads to an inconsistent data representation as naming conventions change. For example, a taxonomy of organisms could be added to the data model, resulting in a class for every organism. If a new organism is discovered, the hierarchy has to be changed, which affects the data representation. Re-arranging an embedded taxonomy or removing branches would make formerly valid documents invalid.

The problem of embedded term definitions is often addressed by so-called controlled vocabularies. A controlled vocabulary is a set of terms that can be used in a specific context and that is defined separately from the data model. A simple controlled vocabulary makes no assumption about a structure of terms or about dependence relations between terms.
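The separation between vocabulary and data model can be illustrated with a minimal sketch. Everything named here (the `Annotation` class, the contexts and the terms) is hypothetical and chosen only for illustration; it is not part of MAGE-OM. The point is that the vocabulary is plain data, so adding a term does not change any class.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary: context name -> allowed terms.
# It is plain data, kept apart from the data model defined below.
CONTROLLED_VOCABULARY = {
    "organism_part": {"leaf", "root", "stem"},
    "instrument": {"scanner", "hybridization station"},
}


@dataclass
class Annotation:
    """Data-model class: it stores a term but defines no class per term."""
    context: str
    term: str


def is_valid(annotation, vocabulary=CONTROLLED_VOCABULARY):
    """Return True if the term is allowed in the annotation's context."""
    return annotation.term in vocabulary.get(annotation.context, set())


def add_term(context, term, vocabulary=CONTROLLED_VOCABULARY):
    """Extend the vocabulary; the Annotation class stays unchanged."""
    vocabulary.setdefault(context, set()).add(term)
```

With this layout, a newly introduced term only requires a call to `add_term`, and previously stored annotations remain valid.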

An ontology can be used as a special case of a controlled vocabulary that adds relations between terms. The term ontology stems from philosophy, where ontology is the branch of metaphysics concerned with the existence of things. Ontologies are an attempt to categorize existing things in a way that represents knowledge about them. Within computer science, the term ontology has been adopted for an application used in knowledge-based systems.

Gruber (1993) defines an ontology as an explicit specification of a conceptualization. A conceptualization is defined as “the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them” (Genesereth and Nilsson, 1987). For such systems only those concepts exist that can be represented. The ontology thus defines the terms with which software (in this case called ‘agents’) can communicate about a given domain of interest without necessarily sharing the same knowledge base. The ontology then should contain

• the names of entities in the domain

• human readable textual descriptions

• and formal constraints for the use and interpretation of terms

Although this definition has been made with artificial intelligence systems in mind, it also seems suitable for the special case of providing a controlled vocabulary for the annotation of experiments.
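The three ingredients listed above (names, textual descriptions and formal constraints) can be sketched as a small data structure. This is only an illustration under stated assumptions: the class `OntologyTerm`, the term name `hypothetical_design_a` and the shape of the `usage` dictionary are invented here and do not come from Gruber's definition or from any existing ontology.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class OntologyTerm:
    name: str                      # name of the entity in the domain
    description: str               # human-readable textual description
    # Formal constraints: predicates over a proposed usage of the term.
    constraints: List[Callable[[Dict[str, str]], bool]] = field(default_factory=list)

    def permits(self, usage: Dict[str, str]) -> bool:
        """Return True only if every constraint accepts the proposed usage."""
        return all(check(usage) for check in self.constraints)


# Hypothetical term that may only be used to annotate an experiment design.
EXAMPLE_TERM = OntologyTerm(
    name="hypothetical_design_a",
    description="An invented experiment design used only for this example.",
    constraints=[lambda usage: usage.get("annotates") == "ExperimentDesign"],
)
```

Software assisting an annotator could call `permits` before accepting a term, which is one way the formal constraints help to prevent incorrect usage.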

Each application is likely to operate on its own distinct domain, while a portion of shared information needs to be exchanged with other applications. The only difference between Gruber’s definition and experimental annotations is the level of optimism about the automatic generation of data and queries between applications. Here, the main use case of an ontology is formal annotation and interpretation by a person who chooses terms from the ontology. Database queries for experiments can also be based on ontology terms. Still, formal constraints are important because they enable software to assist the user in the correct usage of the ontology. A prominent example is the Gene Ontology, which provides a hierarchy of classes to annotate the function of genes (Ashburner et al., 2000).

To annotate microarray experiments, a customized ontology is needed. The creation of such an ontology was undertaken within the MGED by the ontology working group. In 2002 a preliminary ontology was published (Stoeckert et al., 2002). This ontology contains a hierarchical structure of classes and terms for all attributes in the MAGE-OM where an ontology entry can be referenced. The MGED ontology is a reduced ontology in that it has a hierarchical structure of classes and individuals allowing only inheritance relations.
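An ontology restricted to inheritance relations can be represented by a single parent link per entry, and a database query based on an ontology term can then include everything below that term. The following sketch assumes this representation; apart from `ExperimentDesignType`, all class, term and experiment names are hypothetical and are not taken from the MGED ontology.

```python
# Inheritance-only hierarchy: entry -> parent (None marks a root class).
# Only ExperimentDesignType is a name from the text; the rest is invented.
PARENT = {
    "ExperimentDesignType": None,
    "HypotheticalDesignGroup": "ExperimentDesignType",
    "hypothetical_design_a": "HypotheticalDesignGroup",
    "hypothetical_design_b": "ExperimentDesignType",
}


def is_a(term, ancestor, parent=PARENT):
    """Follow parent links upwards; True if `ancestor` is reached."""
    while term is not None:
        if term == ancestor:
            return True
        term = parent.get(term)
    return False


def find_experiments(annotations, query_term, parent=PARENT):
    """Return the experiments annotated with `query_term` or anything below it."""
    return sorted(
        experiment
        for experiment, term in annotations.items()
        if is_a(term, query_term, parent)
    )


# Hypothetical annotation store: experiment id -> ontology term.
ANNOTATIONS = {
    "exp1": "hypothetical_design_a",
    "exp2": "hypothetical_design_b",
    "exp3": "unrelated_term",
}
```

Querying for `ExperimentDesignType` selects `exp1` and `exp2`, while querying for `HypotheticalDesignGroup` selects only `exp1`; a free-text search could not make this distinction.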

The MGED ontology has also been enriched by formal constraints on the proper use of ontology terms. These constraints are also realized implicitly by MAGE-OM: there is a one-to-one relation between the names of associations to OntologyEntry classes and the names of the classes within the ontology. Therefore, MAGE-OM provides the constraints, or formal syntax, for the use of the MGED ontology terms.
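The implicit constraint can be sketched as a lookup: the name of the association selects the ontology class from which a term may be chosen. The sketch below assumes, for simplicity, that association name and ontology class name are identical strings; the dictionary contents are hypothetical apart from the name `ExperimentDesignType`, and the function is not part of any MAGE software.

```python
# Hypothetical ontology content: class name -> terms defined in that class.
ONTOLOGY_CLASSES = {
    "ExperimentDesignType": {"hypothetical_design_a", "hypothetical_design_b"},
    "HypotheticalInstrumentType": {"hypothetical_scanner"},
}


def term_allowed(association_name, term, ontology=ONTOLOGY_CLASSES):
    """Check a term against the ontology class named like the association.

    Assumption: the one-to-one relation is modelled as name equality.
    """
    if association_name not in ontology:
        raise KeyError(f"no ontology class named {association_name!r}")
    return term in ontology[association_name]
```

A term that is valid for one association is rejected for another, which is the sense in which the data model supplies the syntax for using the ontology.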

The MGED ontology was originally implemented using the XML application DARPA Agent Markup Language (DAML+OIL)¹⁰, which can be used for the specification of ontologies. The resulting files can be easily parsed by computer programs. The MGED ontology is now publicly available in several other XML formats, including the Web Ontology Language (OWL)¹¹, which defines an open standard for the representation and the algorithmic processing of ontological data provided by the W3C. There is also a web-browsable version of the MGED ontology (see Figure 4.2).
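How such XML files can be parsed is shown in the following minimal sketch, which reads subclass relations from an OWL document in RDF/XML syntax using Python's standard library. The embedded fragment is invented for this example and is not an excerpt from the MGED ontology; in particular, the class `HypotheticalDesignType` and the use of `rdf:ID` attributes are assumptions. A real ontology file may use other constructs that this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs of OWL, RDF and RDF Schema.
NS = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

# Invented fragment; only the name ExperimentDesignType appears in the text.
OWL_FRAGMENT = """<rdf:RDF
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <owl:Class rdf:ID="ExperimentDesignType"/>
  <owl:Class rdf:ID="HypotheticalDesignType">
    <rdfs:subClassOf rdf:resource="#ExperimentDesignType"/>
  </owl:Class>
</rdf:RDF>"""


def read_subclass_relations(owl_text):
    """Map each top-level owl:Class to its named parent class (or None)."""
    rdf = "{%s}" % NS["rdf"]
    root = ET.fromstring(owl_text)
    parents = {}
    for cls in root.findall("owl:Class", NS):
        name = cls.get(rdf + "ID")
        if name is None:
            continue  # classes identified in another way are skipped here
        parent = cls.find("rdfs:subClassOf", NS)
        if parent is None:
            parents[name] = None
        else:
            parents[name] = parent.get(rdf + "resource", "").lstrip("#") or None
    return parents
```

The resulting dictionary has the same child-to-parent shape as a hand-written hierarchy, so the class tree can be rebuilt from the file without manual transcription.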

10 http://www.daml.org/2001/03/reference.html

11

4.1. Standardization and Specification 59

Figure 4.2: Screenshot of the MGED ontology web pages. On the left, the page shows a list of all available classes in the ontology. The main frame displays detailed information about a class, in this case ExperimentDesignType, which is referenced in the corresponding MAGE-OM entry to annotate an Experiment. The class hierarchy is depicted as a tree list. The subclasses of ExperimentDesignType that contain further subclasses are shown at the bottom under the caption 'Usage'.