3. Empirical analysis of the texts
3.8. Conclusions of the quantitative-qualitative analysis
RDF models make use of classes and properties whose meaning is specified in shared vocabularies. Fixing the semantics of RDF terms is important to support consistent usage across datasets. An equally important aspect is the support for automated reasoning. By means of RDF Schema (RDFS), and of other languages built on top of it, computer programs are capable of producing inferred knowledge from the triples asserted in the RDF datasets they consume.
The basic inferences can be derived from RDFS entailments, which can be expressed as Horn clauses:
subClassOf(x,z) ← subClassOf(x,y) ∧ subClassOf(y,z) .
subPropertyOf(a,c) ← subPropertyOf(a,b) ∧ subPropertyOf(b,c) .
type(z,y) ← domain(x,y) ∧ triple(z,x,q) .
type(q,y) ← range(x,y) ∧ triple(z,x,q) .
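As a minimal sketch, the four clauses can be applied by naive forward chaining until no new triple is produced. The code below is an illustration and not the implementation of any particular reasoner: triples are plain Python tuples, the resource ex:alice is hypothetical, and a fifth rule (type inheritance along rdfs:subClassOf, rule rdfs9 of the RDFS specification) is added as an assumption so that class membership propagates up the hierarchy.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def rdfs_closure(triples):
    """Return the asserted triples plus everything the rules below derive."""
    closure = set(triples)
    while True:
        new = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                # subClassOf(x,z) <- subClassOf(x,y) AND subClassOf(y,z)
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))
                # subPropertyOf(a,c) <- subPropertyOf(a,b) AND subPropertyOf(b,c)
                if p == SUBPROP and p2 == SUBPROP and o == s2:
                    new.add((s, SUBPROP, o2))
                # type(z,y) <- domain(x,y) AND triple(z,x,q)
                if p == DOMAIN and p2 == s:
                    new.add((s2, RDF_TYPE, o))
                # type(q,y) <- range(x,y) AND triple(z,x,q)
                if p == RANGE and p2 == s:
                    new.add((o2, RDF_TYPE, o))
                # Assumption, not among the four clauses above (rdfs9):
                # type(z,y) <- type(z,x) AND subClassOf(x,y)
                if p == RDF_TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, RDF_TYPE, o2))
        if new <= closure:  # fixpoint reached: nothing new was derived
            return closure
        closure |= new


# Hypothetical data using two FOAF schema statements.
asserted = {
    ("foaf:Person", SUBCLASS, "foaf:Agent"),
    ("foaf:familyName", DOMAIN, "foaf:Person"),
    ("ex:alice", "foaf:familyName", "Smith"),
}
inferred = rdfs_closure(asserted) - asserted
```

On this input the domain rule types ex:alice as a foaf:Person in the first pass, and the added type-inheritance rule types it as a foaf:Agent in the second. The quadratic inner loop keeps the sketch short; a real reasoner would index triples by predicate.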
This reasoning can be applied to any RDF dataset that includes a schema specification in RDFS. For example, an RDFS reasoner will be able to derive that any foaf:Person is also a foaf:Agent and that, if a resource has a foaf:familyName, then it must be a foaf:Person. In particular, it is worth noting that rdfs:subClassOf and rdfs:subPropertyOf are both transitive properties; together with the propagation of rdf:type along rdfs:subClassOf, this means that an RDFS reasoner will materialise the complete set of rdf:type statements up to the top class. We will make extensive use of RDFS entailment in our approach.
However, RDFS entailment is only one possible method to derive inferences from data. The Semantic Web community has developed a set of languages to enhance RDF with inferred knowledge. For example, the W3C developed the Web Ontology Language (OWL), the Semantic Web Rule Language (SWRL), the Rule Interchange Format (RIF) and the recent Shapes Constraint Language (SHACL). Here we focus on two technologies, namely the Web Ontology Language (OWL) and the SPARQL Inferencing Notation (SPIN)⁹. The first is a W3C standard grounded in the tradition of Description Logics, with well-studied computational properties and a wide range of features for developing full-fledged ontologies for the Semantic Web. The second is a technology and a syntax to express and execute rules using the SPARQL language syntax, initially developed by TopQuadrant¹⁰.
2.4. SEMANTIC WEB TECHNOLOGIES 47
The Web Ontology Language (OWL)
The motivation behind the development of OWL stems from the requirement of providing schema definitions with greater expressivity than is possible within RDFS. RDFS is indeed limited to subsumption relations for class and property hierarchies and to the definition of the domain and range of properties. The foundation of the Web Ontology Language (OWL) can be found in description logics, and therefore in first-order logic (FOL). Indeed, OWL constructs are mostly based on quantifiers, which make it possible to state, for example, that "any person has a name":
foaf:Person a owl:Class ;
    rdfs:subClassOf [
        rdf:type owl:Restriction ;
        owl:onProperty foaf:name ;
        owl:someValuesFrom rdfs:Literal
    ] .
In what follows we introduce the basic elements of the language, taking the OWL 2 specification as reference. We limit the description to the features that are used in the next chapters; the reader is referred to [W3C OWL Working Group (2012)] for further details.
OWL is a formalism to develop ontologies and, in itself, it is specified as a language independent from RDF. However, OWL has an RDF semantics that can be considered an extension of RDFS, as it includes RDFS entailments. For simplicity, in what follows we describe OWL language constructs in a functional-style syntax, which is less verbose than its RDF/Turtle counterpart.
OWL ontologies include four types of statements: (a) the ontology declaration, stating that the document is an ontology and providing an identifier for it; (b) import statements, which include the content of another document as an integral part of the current one; (c) annotations, which are non-logical statements used to document the various elements of the ontology; and (d) axioms, which constitute the logical part of the ontology and are the ones we look at in some detail in what follows.
Class axioms. Class axioms include class equivalence and disjointness, as well as constraints on the properties that entities of a given class can have. In OWL, entities are called individuals. The three sets of classes, properties, and individuals are mutually disjoint. Classes can be named (C) or anonymous, also called class expressions (CE).
– SubClassOf(CE1, CE2) the subsumption relation (inherited from RDFS in the OWL/RDF semantics).
– EquivalentClasses(CE1...CEn) the classes are equivalent, meaning any member of each one of them is also a member of the others.
– DisjointClasses(CE1...CEn) no individual can belong to more than one of the classes.
– DisjointUnion(C, CE1...CEn) the class C is the disjoint union of a number of other classes.
Class expressions (CE) need to be part of axioms, but can be described separately. In the examples below, OPE denotes an object property expression and DPE a data property expression:
– ObjectSomeValuesFrom(OPE, CE) the class of individuals having as value of the property OPE at least one individual belonging to the class defined by CE.
– ObjectAllValuesFrom(OPE, CE) the class of individuals having as value of the property OPE only individuals belonging to the class defined by CE.
– ObjectMinCardinality(n, OPE) the class of individuals having at least n individuals as target of the property OPE.
– DataMaxCardinality(n, DPE) the class of individuals having at most n values for the data property DPE.
– DataExactCardinality(n, DPE) the class of individuals having exactly n values for the data property DPE.
Class expressions are a useful concept, as they allow complex constraints to be declared by means of anonymous classes. Here is an example in RDF/Turtle syntax:
foaf:Person a owl:Class ;
    rdfs:subClassOf [
        rdf:type owl:Restriction ;
        owl:onProperty foaf:name ;
        owl:someValuesFrom rdfs:Literal
    ] .
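To make the reading of class expressions more concrete, the sketch below represents three of them as small Python data structures and checks whether an individual satisfies them with respect to the asserted triples only. This is a deliberate simplification and an assumption of ours: the check is closed-world and treats distinct names as distinct individuals, whereas OWL reasoning is open-world and makes no unique-name assumption. All names in the ex: namespace are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Named:
    name: str  # a named class C


@dataclass(frozen=True)
class SomeValuesFrom:
    prop: str
    filler: object  # a class expression CE


@dataclass(frozen=True)
class AllValuesFrom:
    prop: str
    filler: object  # a class expression CE


@dataclass(frozen=True)
class MinCardinality:
    n: int
    prop: str


def values(graph, individual, prop):
    """Asserted values of `prop` for `individual`."""
    return {o for (s, p, o) in graph if s == individual and p == prop}


def is_member(graph, individual, ce):
    """Closed-world membership test against asserted triples only."""
    if isinstance(ce, Named):
        return (individual, "rdf:type", ce.name) in graph
    if isinstance(ce, SomeValuesFrom):
        return any(is_member(graph, v, ce.filler)
                   for v in values(graph, individual, ce.prop))
    if isinstance(ce, AllValuesFrom):
        return all(is_member(graph, v, ce.filler)
                   for v in values(graph, individual, ce.prop))
    if isinstance(ce, MinCardinality):
        return len(values(graph, individual, ce.prop)) >= ce.n
    raise TypeError(f"unsupported class expression: {ce!r}")


graph = {
    ("ex:alice", "ex:hasPet", "ex:rex"),
    ("ex:alice", "ex:hasPet", "ex:tom"),
    ("ex:rex", "rdf:type", "ex:Dog"),
    ("ex:tom", "rdf:type", "ex:Cat"),
}
dog = Named("ex:Dog")
has_some_dog = is_member(graph, "ex:alice", SomeValuesFrom("ex:hasPet", dog))
has_only_dogs = is_member(graph, "ex:alice", AllValuesFrom("ex:hasPet", dog))
has_two_pets = is_member(graph, "ex:alice", MinCardinality(2, "ex:hasPet"))
```

Note that an individual with no asserted value for the property trivially satisfies the universal restriction in this sketch, which mirrors the vacuous truth of ObjectAllValuesFrom but would not be concluded by an open-world reasoner without further axioms.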
Property axioms. In RDFS, properties can only be defined in terms of domain and range. In OWL, the main distinction is the one between object properties and data properties. OWL object properties (owl:ObjectProperty) have resources as their range, while datatype properties (owl:DatatypeProperty) have literals. (A third property type is owl:AnnotationProperty, but annotation properties are not meant to produce inferences.) In practice, declaring a property an owl:ObjectProperty is equivalent to declaring its rdfs:range to be owl:Thing (the class of all individuals). OWL includes a wide range of property features, for example the capability of referring to the inverse of an object property:
– ObjectInverseOf(OP)
Object properties can be of several types. In the following list, an object property expression OPE is either a named object property or an anonymous property defined as the inverse of a named one:
– SubObjectPropertyOf(OPE1, OPE2) the subsumption relation (inherited from RDFS in the OWL/RDF semantics).
– EquivalentObjectProperties(OPE1...OPEn) the properties are equivalent.
– DisjointObjectProperties(OPE1...OPEn) no two of these relations can hold between the same subject and object (for example, parentOf, husband and childOf are three disjoint relations).
– InverseObjectProperties(OPE1, OPE2) each of the two properties is the inverse of the other.
– FunctionalObjectProperty(OPE) the object property is functional, meaning that for a given subject only one object is possible. The effect is that two resources that are both the object of a functional property on the same subject will be interpreted as referring to the same entity. hasBiologicalMother is functional, as one cannot have two different biological mothers (if one has two, they must be the same person).
– InverseFunctionalObjectProperty(OPE) for a given object only one subject is possible. This is the case of the property biologicalMotherOf.
– ReflexiveObjectProperty(OPE) a reflexive object property implies that any individual is connected by OPE to itself. The relation knows is an example of the sort.
– IrreflexiveObjectProperty(OPE) on the contrary, an irreflexive object property does not allow individuals to be connected to themselves by OPE. The relation parentOf is irreflexive.
– SymmetricObjectProperty(OPE) a symmetric relation implies that if one individual is connected by OPE to another, then the second is also connected to the first by the same OPE. Being friends is such a relation.
– AsymmetricObjectProperty(OPE) asymmetry implies that if one individual is connected by OPE to another, then the second cannot be connected to the first by the same OPE. Again, parentOf and childOf are both asymmetric.
– TransitiveObjectProperty(OPE) transitivity implies that if individual x is connected to y by OPE, and y is connected to z by OPE, then x is connected to z by OPE.
– SameIndividual(a1...an) the individuals all denote the same entity; this is the common owl:sameAs property used in Linked Data.
– DifferentIndividuals(a1...an) can be used to declare that the individuals in a set are all different. As a result, deriving SameIndividual(a1, a2) would raise an inconsistency.
OWL 2 includes more than what has been introduced so far, for example punning and property chains. However, we refer to the W3C specifications for a complete overview of the OWL language [W3C OWL Working Group (2012)].
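The effect of some of the property axioms listed above can be illustrated with a small forward-chaining sketch. It covers only InverseObjectProperties, SymmetricObjectProperty, TransitiveObjectProperty and FunctionalObjectProperty, plus a check against DifferentIndividuals; it is an assumption-laden illustration rather than an OWL reasoner, and the function names as well as the individuals in the ex: namespace are hypothetical.

```python
SAME_AS = "owl:sameAs"


def apply_property_axioms(triples, inverse_of=(), symmetric=(),
                          transitive=(), functional=()):
    """Saturate `triples` under the declared property characteristics."""
    inverse = {}
    for p, q in inverse_of:  # InverseObjectProperties works in both directions
        inverse[p] = q
        inverse[q] = p
    closure = set(triples)
    while True:
        new = set()
        for (s, p, o) in closure:
            if p in inverse:
                new.add((o, inverse[p], s))
            if p in symmetric:
                new.add((o, p, s))
            for (s2, p2, o2) in closure:
                if p != p2:
                    continue
                if p in transitive and o == s2:
                    new.add((s, p, o2))
                # Two objects of a functional property on the same subject
                # are interpreted as the same individual.
                if p in functional and s == s2 and o != o2:
                    new.add((o, SAME_AS, o2))
        if new <= closure:
            return closure
        closure |= new


def is_inconsistent(closure, different):
    """True if two individuals declared different were derived to be the same."""
    return any((a, SAME_AS, b) in closure
               for a in different for b in different if a != b)


asserted = {
    ("ex:ann", "ex:parentOf", "ex:bob"),
    ("ex:ann", "ex:ancestorOf", "ex:bob"),
    ("ex:bob", "ex:ancestorOf", "ex:carl"),
    ("ex:bob", "ex:friendOf", "ex:dan"),
    ("ex:tom", "ex:hasBiologicalMother", "ex:mary"),
    ("ex:tom", "ex:hasBiologicalMother", "ex:maria"),
}
closure = apply_property_axioms(
    asserted,
    inverse_of=[("ex:parentOf", "ex:childOf")],
    symmetric={"ex:friendOf"},
    transitive={"ex:ancestorOf"},
    functional={"ex:hasBiologicalMother"},
)
clash = is_inconsistent(closure, {"ex:mary", "ex:maria"})
```

Here the functional property yields an owl:sameAs statement between ex:mary and ex:maria, so additionally declaring the two to be different individuals makes the data inconsistent, which is the situation described for DifferentIndividuals.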
A valuable property of the OWL language is that its design takes into account both its computational properties and the need for rich expressivity. To this end, the specification delivers a set of OWL profiles with different computational properties. OWL 2 profiles can be considered sub-languages (syntactic subsets) that offer significant advantages in particular application scenarios.
– OWL 2 EL only supports a subset of the possible class restrictions (excluding, for example, cardinality restrictions) and excludes a wide range of property types (including functional and inverse functional properties). It is recommended for large knowledge bases and guarantees that instance checking, class subsumption and ontology consistency can be decided in polynomial time.
– OWL 2 QL guarantees sound and complete query answering in LOGSPACE with respect to the size of the data (assertions). This profile includes most of the features of description logic languages. It restricts the language not only in terms of the constructs that can be used (for example, it excludes owl:sameAs as well as any cardinality restriction) but also in terms of the places where some constructs can appear.
– OWL 2 RL supports all axioms of OWL 2 with the exception of the disjoint union of classes and reflexivity of object properties.
SPARQL Inferencing Notation (SPIN)
SPARQL can be used to write rules such as the RDFS entailments described above by means of CONSTRUCT queries. However, a rule engine would need to execute the query over the data iteratively until no new triples can be derived. The SPARQL Inferencing Notation provides sophisticated methods to define constraints and rules using the SPARQL syntax and to attach them to specific elements of the dataset.
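The need for iteration can be seen on a small example. The sketch below shows the transitivity rule for rdfs:subClassOf as a CONSTRUCT query (kept as a string for reference only) and emulates a single execution of it with a pure-Python function, which is our stand-in for a SPARQL engine and an assumption of this example. A driver then repeats the execution until a pass derives nothing new. The class names in the ex: namespace are hypothetical, and this is not how SPIN itself is implemented.

```python
SUB = "rdfs:subClassOf"

# The rule as a CONSTRUCT query, shown for reference and not executed here.
SUBCLASS_RULE = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
CONSTRUCT { ?x rdfs:subClassOf ?z }
WHERE     { ?x rdfs:subClassOf ?y . ?y rdfs:subClassOf ?z }
"""


def construct_once(triples):
    """Stand-in for one execution of SUBCLASS_RULE over `triples`."""
    return {(x, SUB, z)
            for (x, p1, y) in triples if p1 == SUB
            for (y2, p2, z) in triples if p2 == SUB and y2 == y}


def run_to_fixpoint(triples, rule):
    """Apply `rule` repeatedly until a pass adds no new triple."""
    closure = set(triples)
    passes = 0
    while True:
        passes += 1
        derived = rule(closure) - closure
        if not derived:
            return closure, passes
        closure |= derived


# A chain of five classes: A < B < C < D < E.
chain = {
    ("ex:A", SUB, "ex:B"),
    ("ex:B", SUB, "ex:C"),
    ("ex:C", SUB, "ex:D"),
    ("ex:D", SUB, "ex:E"),
}
single_pass = construct_once(chain)
closure, passes = run_to_fixpoint(chain, construct_once)
```

A single execution only connects classes that are two steps apart, so the statement relating ex:A to ex:E is missing from `single_pass` and appears only once the rule is re-applied to its own output.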
In a nutshell, SPIN:
2.5. DATA CATALOGUING ON THE WEB 51