Ontology Design Patterns (ODPs) are a middle-out way of developing ontologies. They can be viewed as an extremely lightweight version of design principles like those found in foundational ontologies, but with less ‘clutter’. That is, they can be cleverly modularised foundational ontology fragments that serve as design snippets for good modelling practices. They can also be viewed as a way of bottom-up pattern finding, where the pattern found is then reused across the ontology and offered to others as a ‘best practices’ design solution for some modelling aspect. ODPs were first proposed some time ago [BS05, Gan05] and have gained traction in research in recent years, with various ideas and proposals. There is, therefore, no single, neat core to extract from this work and describe at present. A clear, informal overview is given in [GP09], but terms, descriptions, and categorisations are being reworked [FGGP13], and the sub-field is being better characterised with respect to the issues in using ODPs and possible research directions [BHJ+15].
Let us first introduce some definitions for a pattern for a specific ontology and their uses and then proceed to types of patterns. The definitions are geared to the OWL language, but one can substitute that for another language of choice.
Definition 7.3 (Language of pattern instantiation [FK17]). OWL Ontology O with language specification adhering to the W3C standard [MPSP09], which has classes C ∈ VC, object properties OP ∈ VOP, data properties D ∈ VD, data types DT ∈ VDT of the permitted XML schema types, axiom components (‘language features’) X ∈ VX, and such that Ax ∈ VAx are the axioms.
The ‘axiom components’ include features such as, among others, subsumption, transitivity, existential quantification, and cardinality, which can be used according to the syntax of the language. A pattern itself is a meta-level specification, in a similar fashion as stereotyping in UML. Just in case a pattern also includes ‘reserved’ entities from, say, a foundational ontology, they get their own entry in the vocabulary to clearly distinguish them.
Definition 7.4 (Language for patterns: Vocabulary V [FK17]). The meta-level (second order) elements (or stereotypes) for patterns are:
• class C ∈ VC as C in the pattern;
• object property OP ∈ VOP as R in the pattern;
• data property D ∈ VD as D in the pattern;
• data type DT ∈ VDT as DT in the pattern;
• reserved set of entities from a foundational ontology, as F in the pattern;
where added subscripts i with 1 ≤ i ≤ n may be different elements. Two elements in the vocabulary are called homogeneous iff they belong to the same type, i.e., they are both classes, or both object properties, and so on. Elements can be used in axioms Ax ∈ VAx that consist of axiom components x ∈ VX in the pattern such that the type of axioms are those supported in the ontology language in which the instance of the pattern is represented.
With these ingredients in place, one can then define an ontology pattern P as follows.
Definition 7.5 (Ontology Pattern P [FK17]). An ontology pattern P consists of more than one element from vocabulary V which relate through at least one axiom component from VX. Its specification contains the:
• pattern name;
• pattern elements from V;
• pattern axiom component(s) from VX;
• pattern’s full formalisation.
For instance, the basic all-some pattern that we have seen as ‘macro’ in Section 7.2 has as specification ([FK17]):
• pattern name: basic all-some
• pattern elements: C1, C2, R
• pattern axiom component(s): ⊑, ∃
• pattern’s full formalisation: C1 ⊑ ∃R.C2
An instantiation of the basic all-some pattern in an ontology, say, the AWO, may be, e.g., Giraffe ⊑ ∃drinks.Water.
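To make the pattern specification and its instantiation concrete, the following is a minimal Python sketch. It is not taken from [FK17]: the class name OntologyPattern, the kind labels, the helper homogeneous, and the use of string templates for the formalisation are all assumptions made for illustration only. Axioms are handled as plain strings, not as OWL objects.

```python
from dataclasses import dataclass
from string import Template

# Assumed labels for the element types of vocabulary V (Definition 7.4).
KINDS = {"class", "object_property", "data_property", "data_type", "foundational"}


@dataclass(frozen=True)
class OntologyPattern:
    """Illustrative container for the four parts of a pattern specification."""
    name: str
    elements: dict        # placeholder name -> kind from KINDS
    components: tuple     # axiom components from V_X, written as symbols
    formalisation: str    # template with $placeholders for the elements

    def __post_init__(self):
        # Definition 7.5: more than one element, at least one axiom component.
        if len(self.elements) < 2 or len(self.components) < 1:
            raise ValueError("need more than one element and at least one component")
        unknown = set(self.elements.values()) - KINDS
        if unknown:
            raise ValueError(f"unknown element kinds: {unknown}")

    def instantiate(self, binding):
        """Replace every placeholder by an ontology entity name."""
        missing = set(self.elements) - set(binding)
        if missing:
            raise KeyError(f"no binding for: {sorted(missing)}")
        return Template(self.formalisation).substitute(binding)


def homogeneous(pattern, a, b):
    """Two elements are homogeneous iff they are of the same type."""
    return pattern.elements[a] == pattern.elements[b]


BASIC_ALL_SOME = OntologyPattern(
    name="basic all-some",
    elements={"C1": "class", "C2": "class", "R": "object_property"},
    components=("⊑", "∃"),
    formalisation="$C1 ⊑ ∃$R.$C2",
)

axiom = BASIC_ALL_SOME.instantiate({"C1": "Giraffe", "R": "drinks", "C2": "Water"})
print(axiom)
```

The same structure could be used in the other direction, to check whether an axiom found in an ontology matches a pattern's formalisation, but that would require parsing the axioms and is left out of this sketch.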
As can be seen from the definition, they are referred to agnostically as patterns: it may be a pattern that is realised in the ontology and that some algorithm has to search for (as was the scope in [FK17]), as well as one that is defined separately and applied during the design phase. This is also in line with some newly proposed terminology [FGGP13]. Ontology patterns tend to be more elaborate than the basic all-some pattern. For instance, one could specify a pattern for how to represent attributions with DOLCE’s Quality rather than an OWL data property, as was discussed in Section 6.1.1, or how to systematically approximate representing an n-ary relation as n binaries in OWL. Furthermore, there are broader options for ontology patterns. A selection of them with a few examples is as follows.
• Architecture pattern. This specifies how the ontology is organised. For instance, one could choose to have a modular architecture in the sense of sub-domains. An example of a fairly elaborate architecture is illustrated in Figure 7.4 for BioTop [BSSH08].
• Logical pattern. This deals with the absence of some features of a representation language and how to work with that. The issue with n-aries in OWL is such an example.
• Content pattern. This pattern assists with representing similar knowledge in the same way for that particular ontology. Recalling the rules-as-you-go from thesauri bottom-up development, they can be specified as content patterns. A larger example is shown in Figure 7.5.
• ‘Housekeeping’ patterns, including so-called lexico-syntactic patterns. They refer to ensuring clean and consistent representations in the ontology. For instance, to write names in CamelCase or with dashes, and using IDs with labels throughout versus naming the classes throughout the ontology.
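As an illustration of the logical pattern for n-aries listed above, the sketch below shows one common way to approximate an n-ary relation: introduce a class that stands in for the relation and link it to each participant with its own binary object property. This is an assumption about how such a pattern might be written down, not the pattern prescribed by the text; the function name reify_nary and the example names (Feeding, hasFeeder, and so on) are hypothetical, and axioms are produced as plain strings.

```python
def reify_nary(relation_class, roles):
    """Return axioms (as strings) that approximate an n-ary relation.

    relation_class: name of the class that stands in for the n-ary relation.
    roles: mapping from binary object property name to filler class name,
           one entry per participant in the n-ary relation.
    """
    if len(roles) < 3:
        raise ValueError("an n-ary relation needs at least three participants")
    # One all-some axiom per participant: n binaries for one n-ary.
    return [f"{relation_class} ⊑ ∃{prop}.{filler}" for prop, filler in roles.items()]


# Hypothetical ternary relation: who feeds which animal with what.
feeding_axioms = reify_nary(
    "Feeding",
    {"hasFeeder": "Zookeeper", "hasFedAnimal": "Animal", "hasFood": "Food"},
)
for axiom in feeding_axioms:
    print(axiom)
```

Writing the approximation as a function with fixed naming and axiom shape is what makes it a pattern: anyone applying it to another n-ary obtains axioms of the same form. Further constraints, such as cardinalities on the binary properties, are deliberately omitted here.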
There are also practical engineering tasks in the process of using ODPs, such as a workflow for using ODPs and the usual requirements of documentation and metadata; recent first proposals include [FBR+16, KHH16].
Figure 7.4: BioTop’s Architecture, which links to both DOLCE and BFO-RO and small ‘bridge’ ontologies to link the modules. (Source: http://www.imbi.uni-freiburg.de/ontology/biotop).
7.7 Exercises
Review question 7.1. Why can one not simply convert each database table into an OWL class and assume the bottom-up process is completed?
Review question 7.2. Name two modelling considerations going from conceptual data model to ontology.
Review question 7.3. Name the type of relations in a thesaurus.
Review question 7.4. What are some of the issues one has to deal with when developing an ontology bottom-up using a thesaurus?
Review question 7.5. What are the two ways one can use NLP for ontology development?
Review question 7.6. Machine learning was said to use inductive methods. Recall what that means and how it differs from deductive methods.
Review question 7.7. The least common subsumer and most specific concept are non-standard reasoning services that help with ontology development. Describe in your own words what they do.
Exercise 7.1. Examine Figure 7.6 and answer the following questions.
a. Represent the depicted knowledge in an OWL ontology. *
b. Can you represent all knowledge? If not: what not? *
c. Are there any problems with the original conceptual data model? If so, which one(s)? *
[Figure 7.5 diagram not reproduced. Recoverable labels: Input, Catalyst, Material Object, Output, Material transformation, Neighbourhood, time:Interval; associations ‘has input’ and ‘has output’; multiplicities 1..* and *.]
Figure 7.5: Example of a content OP represented informally on the left in UML class diagram style notation and formally on the right. There is a further extension to this ODP described in [VKC+16] as well as several instantiations.
Figure 7.6: A small conceptual model in ICom (from its website http://www.inf.unibz.it/∼franconi/∼icom; see [FFT12] for further details about the tool); blob: mandatory, open arrow: functional; square with star: disjoint complete, square with cross: disjoint, closed arrow (grey triangle): subsumption.
Exercise 7.2. Figure 7.7 shows a very simple conceptual data model in roughly UML class diagram notation: a partition [read: disjoint, complete] of employees between clerks and managers, plus two more subclasses of employee, namely rich employee and poor employee, that are disjoint from the clerk and the manager classes, respectively (box with cross). All the subclasses have the salary attribute restricted to a string of length 8, except for the clerk entity that has the salary attribute restricted to be a string of length 5. Another conceptual data model, in ORM2 notation (which is a so-called attribute-free language), is depicted in Figure 7.8, which is roughly similar.
a. When you reason over the conceptual data model in Figure 7.7, you will find it has an inconsistent class and one new subsumption relation. Which class is inconsistent and what subsumes what (that is not already explicitly declared)? Try to find out manually, and check your answer by representing the diagram in an OWL ontology and running the reasoner. *
b. Consider the issue of how to deal with attributes and add the information that clerks work for at most 3 projects and managers manage at least one project. *
Figure 7.7: A small conceptual model in ICom (Source: http://www.inf.unibz.it/∼franconi/∼icom).
Figure 7.8: A small conceptual model in ORM2, similar to that in Figure 7.7.
Exercise 7.3. Consider the small section of the Educational Resources Information Center thesaurus, below.
a. In which W3C-standardised (Semantic Web) language would you represent it, and why? *
b. Are all BT/NT assertions subsumption relations? *
c. There is an online tool that provides a semi-automatic approach to developing a domain ontology in OWL starting from SKOS. Find it. Why is it semi-automatic and can that be made fully automatic (and if so, how)?
Popular Culture
  BT Culture
  NT n/a
  RT Globalization
  RT Literature
  RT Mass Media
  RT Media Literacy
  RT Films
  UF Mass Culture (2004)
Mass Media
  BT n/a
  NT Films
  NT News Media
  NT Radio
  RT Advertising
  RT Propaganda
  RT Publications
  UF Multichannel Programing (1966 1980) (2004)
Propaganda
  BT Communication (Thought Transfer)
  BT Information Dissemination
  NT n/a
  RT Advertising
  RT Deception
  RT Mass Media
  UF n/a
Exercise 7.4. In what way(s) may data mining be useful in bottom-up ontology development? Your answer should include something about the following three aspects:
a. populating the TBox (learning classes and hierarchies, relationships, constraints),
b. populating the ABox (assertions about instances), and
c. possible substitutes or additions to the standard automated reasoning service (consistency checking, instance classification, etc.).
Exercise 7.5. Define a pattern for how to represent attributions with DOLCE’s Quality rather than an OWL data property.
Exercise 7.6. OWL permits only binary object properties, though n-aries can be approximated. Describe how they can be approximated, and what your OP would look like such that, when given to a fellow student, s/he can repeat the modelling of that n-ary exactly the way you did it and add other n-aries in the same way. *
Exercise 7.7. Inspect the Novel Abilities and Disabilities OntoLogy for ENhancing Accessibility: adolena; Figure 7.9 provides a basic informal overview. Can (any of) this be engineered into an ODP? If so, which type(s), how, and what information is needed to document an OP? *
Exercise 7.8. Figure 7.5 shows a content OP. How would you evaluate whether this is a good ODP? In doing so, describe your reasoning why it is, or is not, a good ODP. *
[Figure 7.9 diagram not reproduced. Recoverable labels: Function, Ability, Physical Ability, Disability, Device, Assistive Device, Replacement Device, Body Part, ServiceProvider; relations hasFunction, assistsWith / isAssistedBy, isAffectedBy / affects, ameliorates, providedBy / provides, requiresAbility, replaces.]
Figure 7.9: Informal view of the adolena ontology.
Exercise 7.9. Discuss the feasibility of the following combinations of requirements for an ontology-driven information system (and make an informed guess about the unknowns):
a. Purpose: science; Language: OWL 2 DL, or an extension thereof; Reuse: foundational; Bottom-up: from textbook models; Reasoning services: standard and non-standard.
b. Purpose: querying data through an ontology; Language: some OWL 2; Reuse: reference; Bottom-up: physical database schemas and tagging; Reasoning services: ontological and querying.
c. Purpose: ontology-driven NLP; Language: OWL 2 EL; Reuse: unknown; Bottom-up: a thesaurus and tagging experiments; Reasoning services: mainly just querying.
You may wish to consult [Kee10a] for a table about dependencies, or argue upfront first.
Exercise 7.10. You are an ontology consultant and have to advise the clients on ontology development for the following scenario. What would your advice be, assuming there are sufficient resources to realize it? Consider topics such as language, reasoning services, bottom-up, top-down, methods/methodologies. *
A pharmaceutical company is in the process of developing a drug to treat blood infections. There are about 100 candidate chemicals in stock, categorised according to the BigPharmaChemicalsThesaurus, and they need to find out whether any of them meets their specification of the ‘ideal’ drug, codename DruTopiate, that has the required features to treat that disease (they already know that DruTopiate must have as part a benzene ring, must be water-soluble, smaller than 1µm, etc). Instead of finding out by trial and error and testing all 100 chemicals in the lab in costly experiments, they want to filter out candidate chemicals by automatic classification according to those DruTopiate features, and then experiment only with the few that match the desired properties. This in silico (on-the-computer) biomedical research is intended as a pilot study, and it is hoped that the successes obtained in related works, such as that of the protein phosphatases and ideal rubber molecules, can be achieved also in this case.
7.8 Literature and reference material
A small selection of sample articles is the following, noting that there are, at the time of writing, no ‘common reference papers’ on the topic:
1. L. Lubyte, S. Tessaris. Automatic Extraction of Ontologies Wrapping Relational Data Sources. In Proc. of the 20th International Conference on Database and Expert Systems Applications (DEXA 2009).
2. Witte, R., Kappler, T., and Baker, C.J.O. Ontology design for biomedical text mining. In: Semantic Web: revolutionizing knowledge discovery in the life sciences, Baker, C.J.O., Cheung, H. (eds), Springer: New York, 2007, pp 281-313.
3. Dagobert Soergel, Boris Lauser, Anita Liang, Frehiwot Fisseha, Johannes Keizer and Stephen Katz. Reengineering thesauri for new applications: the AGROVOC example. Journal of Digital Information 4(4) (2004).
4. SKOS Core (http://www.w3.org/2004/02/skos/core), SKOS Core guide (http://www.w3.org/TR/swbp-skos-core-guide), and the SKOS Core Vocabulary Specification (http://www.w3.org/TR/swbp-skos-core-spec).
Advanced topics in ontology engineering
Introduction
There are a myriad of advanced topics in ontology engineering, most of which require an understanding of both the logic foundations and of the modelling and engineering, although each subtopic may put more emphasis on one aspect than another. A textbook at an introductory level cannot possibly cover all the specialised subtopics. Those included in this Block III aim to give an impression of the many possible directions with very distinct flavours and interests. Other topics could have been chosen as well, and it was not easy to make a selection. For instance, machine learning is currently popular, and it is being used in ontology engineering, yet has not been included. Likewise, ontology mapping and alignment have a set of theories, methods, and tools drawing from various disciplines and topics that is of interest (graph matching, similarity measures, language technologies). Perhaps readers are interested in learning more about the various applications of ontologies in ontology-driven information systems, so as to be motivated further by demonstrations of some more concrete benefits of ontologies in IT and computing. I do have reasons for including the ones that have been included, though.
Ontology-Based Data Access could be seen as an application scenario of ontologies, yet it is also intricately linked with ontology engineering due to the representation limitations to achieve scalability, the sort of automated reasoning one does with it, handling the ABox, and querying ontologies, which can be done but hasn’t been mentioned at all so far. That is, it adds new theory, methods, and tools into an ontology engineer’s ‘knapsack’. It principally provides an answer to the question:
• How can I have a very large ABox in my knowledge base and still have good performance?
The second topic (in Chapter 9) is of an entirely different nature compared to OBDA and brings to the fore two ontology development issues that have so far been ignored as well:
• What to do if one wants, say, the AWO not in English, or to have it in several languages, such as naming the class Isilwane or Dier rather than Animal, and to manage it all with several natural languages?
• How can one interact with domain experts in natural language, so that they can provide knowledge and verify that what has been represented in the ontology is what they want to have in there, without them having to learn logic?
That is, there is an interaction between ontologies and natural language, which oftentimes cannot be ignored.
A different topic is the tension between the expressivity of the logic and what one would like to, or needs to, represent. Indeed, we have come across the DOL framework (Section 4.3.2), but that neither covers all possibilities (yet), nor makes it immediately clear how to represent advanced features. For instance, what if the knowledge is not ‘crisp’, i.e. either true or false, but may be true to a degree? Or if one has used machine learning and induced some x, then that will be probabilistically true and it may be nicer to represent that uncertainty aspect in the ontology as well. Also, one of the BFO 2.x versions squeezed notions of time in the labels of the object properties (recall Section 6.1.2), but OWL is not temporal, so, logically, those labels have no effect whatsoever. Language extensions to some fragment of OWL and to DLs have been proposed, as there are requests for features to represent such knowledge and reason over it. The main question this strand of research tries to answer is:
• In what way(s) can ontology languages deal with, or be extended with, language features, such as time and vagueness, so that one also can use those extensions in automated reasoning (cf. workarounds with labels)?
That is, this topic has a tight interaction between modelling something even more precisely (obtaining a better quality ontology) and not just the availability of language features and tinkering with workarounds, but actually getting them.
The final advanced topic looks at scaling up the TBox layer. So far, we have dealt with (very) small ontologies only, but the real ones used in information systems are typically much larger than that: they run into the thousands if not hundreds of thousands of classes, which have many more axioms declared in the ontology. This brings to the fore questions regarding how best to work with large subject domains and large ontologies. Modularisation is a tried and tested approach, and the main questions that the chapter’s contents contribute to answering are:
• What is the landscape of ontology modules, with their aims, types, and char-