Focusing on the ontologies that perform reuse of any of the considered types (ImportRatio, ReferenceRatio and ReferenceByImportRatio), we can observe that the trend is to adopt a single type of reuse per ontology; that is, most of the reuse is either based on owl:imports statements or on referencing element URIs, and it is rare to find ontologies combining both types of reuse at the same level. One of the ontologies that is completely based on elements referenced from other ontologies is “gnm”. The aim of the “gnm” ontology is to establish mappings between the DBpedia and Geonames ontologies; therefore, all the classes and properties that appear are defined in the DBpedia and Geonames ontologies instead of in the “gnm” namespace. However, for those cases with a reuse ratio higher than 60%, the tendency is to achieve this level by importing ontologies. This is not surprising, since the owl:imports mechanism includes the whole imported ontology and is transitive.
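The distinction between the two reuse types can be sketched in a few lines of code. This is an illustrative approximation only, not the metric definitions used in the study itself: given an ontology's triples as (subject, predicate, object) URI strings, it separates reuse via owl:imports from reuse via direct references to external URIs. The exact ratio formulas and the example namespace are assumptions.

```python
OWL_IMPORTS = "http://www.w3.org/2002/07/owl#imports"

def reuse_ratios(triples, local_ns):
    """Return (import_ratio, reference_ratio) for one ontology (illustrative)."""
    # Ontologies pulled in wholesale via owl:imports:
    imports = {o for s, p, o in triples if p == OWL_IMPORTS}
    # All term URIs occurring outside owl:imports statements:
    terms = {u for s, p, o in triples if p != OWL_IMPORTS for u in (s, p, o)}
    # Terms referenced from namespaces other than the ontology's own:
    external = {u for u in terms if not u.startswith(local_ns)}
    total = len(terms)
    if total == 0:
        return 0.0, 0.0
    return len(imports) / total, len(external) / total

# Hypothetical mini-ontology in the http://ex.org/ namespace:
triples = [
    ("http://ex.org/onto", OWL_IMPORTS, "http://other.org/onto"),
    ("http://ex.org/A",
     "http://www.w3.org/2000/01/rdf-schema#subClassOf",
     "http://dbpedia.org/ontology/Place"),
]
print(reuse_ratios(triples, "http://ex.org/"))
```

In a real measurement one would parse the ontology with an RDF library and normalize namespaces more carefully; the sketch only conveys the shape of the two counts.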
The Financial Industry Business Ontology (FIBO) is a collaborative industry initiative to describe financial data standards using semantic technology. FIBO has been authored by the Enterprise Data Management (EDM) Council under the technical governance of the Object Management Group (OMG). FIBO has two distinct aspects: a business ontology and a presentation for business readability. FIBO is released in discrete ontologies by subject area: (i) Business Entities; (ii) Security, Loans, Derivatives; and (iii) Corporate Actions and Transactions. At the time of this writing, only the first specification, for Business Entities, has been made public. The specification identifies a taxonomy of basic entities: Human Being, Legal Person, Organization and Legal Entity. This taxonomy is extended with other derived entities, such as Minor, Natural Person, Artificial Person (Company Limited by Guarantee, Legally Incorporated Partnership, Foundation or Incorporated Company), Formal Organization (Trust, Partnership or Incorporated Company) and Informal Organization. In addition, the ontology models concepts such as control and ownership.
The approach to knowledge discovery proposed in this paper will be experimentally tested through confrontation with distributed, heterogeneous data sources connecting the domains of tourism and economy. In particular, we will apply our knowledge discovery cycle to real-world datasets to produce knowledge related to tourism in the Canary Islands, and to the way it is influenced by the economic state of European countries. The corresponding datasets will originate both from the institute of statistics of the Canary Islands (ISTAC) 1 and from open datasets available on the Web regarding macro-economic indicators (e.g. the ‘world bank’ datasets 2). Indeed, immense bases of data about global, national and regional economic conditions exist that have the potential to provide insights about the economic dependencies and future economic potential of particular regions of the world. However, these data remain largely underexploited, because the large number of interconnections between these data and data from other domains, the heterogeneity of the applicable models and patterns, and the general ambiguity associated with these areas make it difficult to bootstrap a knowledge discovery process.
During the coverage experiments we analysed two kinds of data: (a) datasets created by other researchers and annotated with opinion mining data; (b) services available online that use opinion mining for various goals. The final list consisted of 5 research datasets and 8 services; for each we analysed the data that is exposed and provided Marl mappings. Next, we calculated the coverage as the number of properties that could be described with Marl over the total number of data properties used in a dataset. In the first experiment we considered all the dataset fields, and the average coverage we obtained was 64%. However, it has to be noted that the individual characteristics of the data sources varied a lot. According to the ontology design goals presented by Noy et al., one of the characteristics of good design is not to cover the very individual elements of datasets. Therefore, after removing the dataset fields that did not repeat at least once, we ran the experiment again and obtained an average coverage of 76%. The results of the experiments are summarized in Table 2.
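The coverage computation described above amounts to a simple ratio. A minimal sketch follows; the field names and the set of Marl-mappable properties are invented for illustration and do not come from the evaluated datasets:

```python
def coverage(dataset_fields, marl_mappable):
    """Fraction of a dataset's property fields expressible with Marl."""
    mapped = [f for f in dataset_fields if f in marl_mappable]
    return len(mapped) / len(dataset_fields)

fields = ["text", "polarity", "score", "author_karma"]  # hypothetical dataset
mappable = {"text", "polarity", "score"}                # hypothetical Marl mappings
print(f"coverage: {coverage(fields, mappable):.0%}")    # → coverage: 75%
```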
The current excitement about the vision of a Semantic Web forms an additional stimulant for the interest in ontologies. In this vision, ontologies have a role in defining and relating concepts that are used to describe data on the web. The use of ontologies on the web emphasizes particular research topics [Gru02, Hef00, Kle01, Kle02b, Mae04, Sun02]. Reuse of ontologies will be a central aspect because of the interlinked nature of the web. In the idea of a Semantic Web, relatively small ontologies link to many other ontologies to import vocabularies and domain theories. The distributed and dynamic character of the web also emphasizes an issue that has not yet been extensively studied: the evolution and versioning of ontologies [Kle02b, Mae04, Noy03].
The current technical state does not solve the problem in applications where complex ontologies should be created and managed collaboratively in highly dynamic, multilingual and constantly evolving environments. There are several tools, such as Protégé 2 for editing ontologies, R2O for making mappings between ontologies and databases, RDF-Gravity for visualisation 3, the ontology alignment API and Server 4, etc. Although many of them solve problems such as ontology learning, ontology upgrade and ontology alignment, these tools are stand-alone and make the process of managing ontological information very complex, basing the interoperability between them on export and import processes that sometimes degrade the information. With respect to methodologies, Methontology and On-To-Knowledge do not define a workflow for editing ontologies that takes into account the roles involved in the ontology development. Also, these methodologies are defined for building ontologies from scratch, without taking into account the reuse of existing ones. None of the aforementioned approaches considers the collaborative and distributed construction of ontologies when developers are geographically distributed and use different languages. In fact, the first method that included a proposal for collaborative construction was Co4, and the first tool was Tadzebao and WebOnto.
over the world to make their data publicly available. We will have to assume that when using meaningful local names, these agents will use their own languages. For certain domains, this is also natural, since conceptualizations differ from culture to culture, and this is reflected in the language used to describe them; for some thoughts about the reuse of available conceptualizations in different cultural and language settings we refer the interested reader to (Cimiano et al., 2009). Therefore, leaving technical problems aside, we believe that the use of opaque local names avoids any bias towards English (or any other language) and is a better option for ontologies that might support natural language descriptions in several languages. Indeed, this solution has been adopted by multilingual semantic resources such as EuroWordNet (Vossen, 1998) and the Agrovoc Thesaurus of the FAO (Liang et al., 2008), and was also the solution provided in the transformation of the FRBR models and the ISBD standard into RDF, as reported in section 3.2.2. In this case, labels are to be used to document the ontology in natural languages, as also recently suggested by Tim Berners-Lee 10.
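The pattern of an opaque local name documented by language-tagged labels can be sketched as follows. This is a hedged illustration in plain Python rather than RDF; the URI, the labels, and the fallback policy are all invented, and in practice the labels would be rdfs:label (or skos:prefLabel) triples:

```python
# One opaque, language-neutral local name, documented by per-language labels
# (mimicking rdfs:label with language tags; all values are hypothetical).
labels = {
    "http://example.org/resource/Q0042": {
        "en": "apple",
        "es": "manzana",
        "de": "Apfel",
    },
}

def label_for(uri, lang, fallback="en"):
    """Look up the label in the requested language, falling back if absent."""
    entry = labels.get(uri, {})
    return entry.get(lang, entry.get(fallback))

print(label_for("http://example.org/resource/Q0042", "es"))  # → manzana
```

The point is that the identifier itself carries no linguistic bias; all human-readable content lives in the labels, one per language.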
The previous section dealt with semantic and syntactic interoperability between applications and services as key aspects to achieve better data exchange and integration capabilities, as well as reasoning. In addition to these two conditions, in order to solve the problem of vast data resources being isolated from one another and developed independently in their data silos, which hinders their discovery and reuse, the notion of linking resources comes into play. The Linked Data (LD) paradigm emerges as a series of best practices and principles for exposing, sharing, and connecting data on the Web (Bizer et al., 2009), independently of the domain. These principles state that unique resource identifiers (URIs) (Berners-Lee et al., 2004) should be used to name things in a way that allows people to look them up, find useful information represented with standard formalisms, and discover more things that are linked to those resources. LD is transforming the Web into a vast cloud of interlinked data (commonly referred to as the “Web of Data”), in which resources are linked across datasets and sites, and where facts and related knowledge are available for consumption by advanced, knowledge-based software agents as well as by humans through suitable interfaces.
We also note that links are being built between library resources and resources originating in other organizations or domains. For example, VIAF aggregates authority records from various library agencies, identifies the primary entities involved and, where possible, links them to DBpedia, a Linked Data extraction of Wikipedia. The semantic alignment for Jane Austen in VIAF, Wikipedia, and DBpedia, for example, illustrates one of the expected benefits of Linked Data, which is that data can be easily networked irrespective of its origins. In this way the library domain can benefit from re-using data from other fields, while library data can contribute to initiatives that did not originate in the library community. The creation of alignments will benefit from the availability of better tools for linking. Much effort has been put into computer science research areas such as Ontology Matching. This leads to implementations based, for example, on string matching and statistical techniques. These efforts have tended to focus on metadata element sets and typically are not ready to be applied more generally to the (often huge) datasets and value vocabularies of the library domain. Recent generic tools for linking data include the Silk Link Discovery Framework, Google Refine, and the Google Refine Reconciliation Service API. Nonetheless, the community still needs to gain experience in their use, to share the results of this experience, and possibly to build tools better suited to library Linked Data.
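A toy illustration of string-matching-based linking, in the spirit of the Ontology Matching techniques mentioned above, follows. The name forms, candidate labels, and similarity threshold are all invented; production tools such as Silk combine many richer similarity measures:

```python
from difflib import SequenceMatcher

def best_match(name, candidates, threshold=0.6):
    """Return the candidate most similar to `name`, or None below threshold."""
    scored = ((SequenceMatcher(None, name.lower(), c.lower()).ratio(), c)
              for c in candidates)
    score, match = max(scored)
    return match if score >= threshold else None

# Hypothetical authority heading vs. hypothetical labels from another dataset:
authority = "Austen, Jane"
candidate_labels = ["Jane Austen", "Jane Auster", "Austen, Jane (1775-1817)"]
print(best_match(authority, candidate_labels))
```

Note how naive character-level similarity already struggles with inverted name order ("Austen, Jane" vs. "Jane Austen"), which is one reason library-scale linking needs more specialized tooling.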
To address these issues, the OntoLex community group was launched in 2012 to develop a new model based on lemon (Lexicon Model for Ontologies) to represent lexical resources relative to ontologies. This model has been developed around the principles of being open, RDF-native, minimalist and linguistically sound. Similarly, the NIF (NLP Interchange Format) ontology was developed as a model for representing stand-off annotation of corpora using RDF. These models address some forms of linguistic LD; however, they do not cover all possible forms of linguistic data, and as such there is still a need to develop further models for multimodal resources, typological databases and many other kinds of linguistic data.
In this paper we present a revisited classification of term variation in the light of the Linked Data initiative. Linked Data refers to a set of best practices for publishing and connecting structured data on the Web with the idea of transforming it into a global graph. One of the crucial steps of this initiative is the linking step, in which datasets in one or more languages need to be linked or connected with one another. We claim that the linking process would be facilitated if datasets were enriched with lexical and terminological information. With that final aim, we propose a classification of lexical, terminological and semantic variants that will become part of a model of linguistic descriptions currently being proposed within the framework of the W3C Ontology-Lexica Community Group to enrich ontologies and Linked Data vocabularies. Examples of modeling solutions for the different types of variants are also provided.
Taking into consideration the metadata definition cited by Comber, Fisher, Harvey, Gahegan, and Wadsworth (2006), “information that helps the users assess the usefulness of a dataset relative to their problem,” this research proposes an integrated way to obtain that kind of information. Once open data portals have implemented the feedback resources suggested in the previous element, users will provide a particular level of usefulness of the available dataset. Determining the number or kind of uses for a specific dataset is quite difficult, or almost impossible. Therefore, combining the possible uses from a data producer’s perspective with the use cases or comments provided by other users about their experience using the data, whether positive or negative, could be the way to understand whether the dataset is suitable for other particular problems. Related to linked data, there was another resource suggested by data users: the inclusion of semantic web capabilities in open data portals. The feedback provided by other users can guide newcomers in understanding how the data can fit their problem. That is also the reason why Figure 2 includes an arrow between the community of reuse and the user-focused metadata elements; semantic integration will help obtain the information required by the user-focused metadata, and users will be encouraged to participate by writing comments and rating the released datasets.
Data formats that may be readily parsed by computer programs without access to proprietary libraries. For example, CSV, TSV and RDF formats are machine-readable, but PDF and Microsoft Excel files are not. Creating and publishing data following Linked Data principles helps search engines and humans to find, access and re-use data. Once information is found, computer programs can re-use the data without the need for custom scripts to manipulate the content.
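The point above in code: consuming a published CSV file requires only a standard library, with no proprietary software involved. The file contents here are invented for illustration:

```python
import csv
import io

# A hypothetical published dataset, as it might be downloaded from a portal:
published = "country,indicator,value\nES,gdp_growth,2.4\nDE,gdp_growth,1.1\n"

# Parsing needs nothing beyond Python's standard csv module:
rows = list(csv.DictReader(io.StringIO(published)))
print(rows[0]["country"], rows[0]["value"])  # → ES 2.4
```

The equivalent content locked inside a PDF or a binary spreadsheet would require format-specific (and possibly proprietary) tooling to extract.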
which publishers are told which labels to use in each case. However, the process rarely follows this order. As vocabularies gain popularity, their adoption increases and multilingual needs appear in order to support interoperability. In fact, widespread adoption comes first, and then one realizes the benefits of the multilingual aspect. For these reasons, models such as lemon make it possible to maintain the model or vocabulary “as it is” and enrich it with multilingual information at any stage of the process. In the specific case of the DCAT vocabulary, and taking into account its general adoption, the next step would involve an analysis of the catalogs and portals that implement it, to identify the labels used by the various publishers in different languages. All those labels, or preferably the ones that better express the meaning of the vocabulary terms, should be captured in the linguistic model and recognized as preferred labels in each language. The benefit of this approach is that the model would take advantage of labels (variants or translations) that are popular and accepted by publishers, and would not “impose” the use of labels that may end up not being meaningful for users. The model would also “leave the door open” for new linguistic needs without interfering with the original vocabulary. Moreover, we believe that it should be built following a conciliatory approach in which different options are welcomed and integrated, and in which different communities can participate by proposing terms and translations in their own languages, thus building it in a cooperative way. All in all, the enrichment of the vocabulary with multilingual linguistic information would contribute to wider adoption and increased understanding and interoperability.
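The analysis step suggested above can be sketched as follows: harvest the labels that publishers actually use for a vocabulary term in each language, and take the most frequent one as the candidate preferred label. The harvested labels here are invented, and frequency is only one possible selection criterion:

```python
from collections import Counter

# Hypothetical labels observed for one DCAT term across catalogs, by language:
observed = {
    "en": ["Dataset", "Dataset", "Data set"],
    "es": ["Conjunto de datos", "Conjunto de datos", "Dataset"],
}

# Pick the most frequently used label per language as the preferred label:
preferred = {lang: Counter(labels).most_common(1)[0][0]
             for lang, labels in observed.items()}
print(preferred)  # → {'en': 'Dataset', 'es': 'Conjunto de datos'}
```

This reflects the conciliatory approach described in the text: the preferred labels emerge from publishers' actual usage rather than being imposed in advance.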
although it is updated every few minutes. On the other hand, some of the metadata may not be accurate in cases where the values of the metadata change more frequently than the regular update interval. In fact, some of the dynamic metadata of BDII, such as the freeCPU number, runningJobs or networking bandwidth, is usually incorrect, as it is never updated on time. 2) Information Source Selector (ISS): The Information Source Selector (ISS) is used to find the most suitable information source from the set of available sources, which are described as instances of the Information Source Ontology. Information sources can be any system (database, file, service, etc.) that contains relevant information. In Grid systems there are many redundant and geographically distributed information sources available. For example, over 20 regional BDII servers can be used to fetch information about the EGEE Computing Elements.
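A minimal selector in the spirit of the ISS described above might rank redundant sources and pick one. The source records and the scoring criteria (freshness first, then latency) are invented assumptions; the actual ISS reasons over Information Source Ontology instances:

```python
# Hypothetical descriptions of redundant BDII-like information sources:
sources = [
    {"name": "bdii-a", "seconds_since_update": 300, "latency_ms": 40},
    {"name": "bdii-b", "seconds_since_update": 60,  "latency_ms": 90},
    {"name": "bdii-c", "seconds_since_update": 60,  "latency_ms": 30},
]

# Prefer the most recently updated source; break ties on lower latency.
best = min(sources, key=lambda s: (s["seconds_since_update"], s["latency_ms"]))
print(best["name"])  # → bdii-c
```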
Traditionally, the process that cartography producers go through to create cartography databases is as follows: first, they identify real-world features and give them names; then, they categorize the features and create models, i.e., schemata; finally, they introduce these feature types and their related instances in a database using its underlying syntax. Furthermore, as a consequence of the existence of multiple geospatial producers, it is quite common to find several databases describing, at least partly, the same geographical space. Usually data are collected for specific purposes, and are very different from one source to another.
For the evaluation of the lemon source editor, we performed an evaluation focused on the usability and coverage of the model and the tool. On the one hand, our objective was to find out to what extent the lemon model is capable of representing the lexical data contained in a resource such as Wiktionary, and whether the model matches the requirements that users have for their applications. On the other hand, our purpose was to learn users’ opinions on how usable the system is, whether the resulting lexicon is as intended, whether they easily understand the lexical information captured in the model, whether they find it easy to edit, and whether the collaborative functionality helps them create lexica in an intuitive manner. For these purposes, we conducted an initial set of evaluations with five Masters students: three studying Computer Science, one studying Linguistics and one studying Cognitive Science. They were given a short explanation of the system and allowed to work with it for about an hour. They were given the task of representing a single entry from Wiktionary for a common term (hence an entry with much information) within lemon source. Afterwards, they were asked to answer a questionnaire with ten questions as follows (partially abridged):
HUPSON: Human Physiology Simulation Ontology; ICD9CM: International Classification of Diseases, version 9 - Clinical Modification; ICPC: International Classification of Primary Care; LOINC: Logical Observation Identifier Names and Codes; MEDDRA: Medical Dictionary for Regulatory Activities; MEDLINEPLUS: MedlinePlus Health Topics; MESH: Medical Subject Headings; MP: Mammalian Phenotype Ontology; NCIT: National Cancer Institute Thesaurus; NDDF: National Drug Data File; NDFRT: National Drug File - Reference Terminology; OMIM: Online Mendelian Inheritance in Man; PDQ: Physician Data Query; RCD: Read Codes, Clinical Terms version 3; RXNORM: RxNORM; SNOMEDCT: Systematized Nomenclature of Medicine - Clinical Terms; SWEET: Semantic Web for Earth and Environment Technology Ontology; SYMP: Symptom Ontology; VANDF: Veterans Health Administration National Drug File; VSO: Vital Sign Ontology
Analysis of the landscape structure and evolution: We mapped the territory at the scale of 1:25 000 in the years 2004, 1984, and 1960. The mapping work required the use of high-resolution remote sensor images, aerial photographs, and SPOT satellite images. The base map at the 1:25 000 scale was from the Instituto Geográfico Agustín Codazzi (IGAC). This mapping was complemented by information from NIMA (National Imagery and Mapping Agency) in 15 m resolution raster format (1997). The interpretation and digitalization of the base and thematic (cover and live fences) mapping was undertaken using geographic information system (GIS) tools and programs (ArcView, ERDAS IMAGINE, and ArcGIS). The process followed guidelines laid down in the development of the map of Colombian Andes ecosystems (Rodríguez et al. 2004). The information was completed by fieldwork with geo-referencing by means of the global positioning system (GPS).