
Digital signatures provide a convenient yet powerful way to verify the integrity of a message (see Section 2.2.8.1). Standards such as XML Digital Signature describe how to sign XML-based documents, as well as arbitrary binary objects, in an efficient manner. The Semantic Web, while based on WWW standards, provides a fundamentally different semantic model (Description Logic) and syntactic model (RDF graphs), which makes digital signatures more of a challenge. Tim Berners-Lee has argued for some time that digital signatures form part of the solution to trust on the Semantic Web.⁴⁵

We can identify three challenges that need to be overcome before digital signatures on the Semantic Web become reality: RDF canonicalisation, semantic interoperability, and signature serialisation.

3.4.1.1 Canonicalisation Issues

Unlike XML, RDF does not have a canonical form. Canonical XML is characterised by the XML Information Set [Cowan and Tobin (2004)], which attempts to guarantee that logically identical XML documents produce identical serialised representations. While Gutmann (2004) argues that Canonical XML is fundamentally broken, XML Digital Signature has been used successfully as the basis for various security specifications, including WS-Security and SAML [Cantor et al. (2005)].

Before digital signatures in RDF can be realised, it is vital that some form of canonicalisation (C14N) is achieved. Cloran and Irwin (2005) argue that canonical RDF can be broken down into two categories: canonicalisation of the RDF model and canonicalisation of a serialised RDF model. We will see later (Section 3.4.1.3) why using a canonical serialisation of the RDF model is not an ideal approach. To our knowledge, at least two algorithms exist for creating canonical RDF models [Carroll (2003); Sayers and Karp (2003)], with only Carroll’s algorithm having an implementation in the public domain.

Blank Nodes. Blank nodes, as defined by the RDF Recommendation [Klyne and Carroll (2004)], are used to label resources not described by a URI. Figure 3.3 shows a fully labelled RDF graph that contains no blank nodes. If we wanted to digitally sign this graph, we would canonicalise it according to Carroll’s algorithm, which would trivially reorder all triples preceded by the graph name.

<urn:uuid:CA2CAF30-21A8-11DB-8270-9859210973A2> {
  <https://localhost:8443/JSPWiki/Wiki.jsp?page=org.embl.ebi.escience.scuflui.workbench.Workbench>
    a dp:Wikipage ;
    dp:content "description content" ;
    dp:firstVersion <https://localhost:8443/webdav/taverna/taverna/org/embl/ebi/escience/scuflui/workbench/Workbench/1/1/Workbench.java> ;
    dcterms:created "Tue Aug 01 22:57:55 BST 2006"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
}

Figure 3.3: A Fully Labelled RDF Graph
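For a graph with no blank nodes, such as the one in Figure 3.3, the reordering step can be sketched in a few lines of Python. This is a minimal illustration rather than Carroll’s algorithm itself: the function name `canonical_digest`, the `example.org` URIs, and the choice of SHA-256 are our own assumptions, and the triples are hypothetical stand-ins written in N-Triples style.

```python
import hashlib

def canonical_digest(graph_name, triples):
    """Hash a blank-node-free graph independently of triple order.

    Each triple is a (subject, predicate, object) tuple of strings already
    written in N-Triples style. The lines are sorted and preceded by the
    graph name before hashing.
    """
    lines = sorted(" ".join(t) + " ." for t in triples)
    payload = "\n".join([graph_name] + lines).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

# Hypothetical data: example.org URIs stand in for the real resources.
G = "<urn:uuid:CA2CAF30-21A8-11DB-8270-9859210973A2>"
PAGE = "<http://example.org/wiki/Workbench>"
triples = [
    (PAGE, "<http://example.org/dp#content>", '"description content"'),
    (PAGE, "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>",
     "<http://example.org/dp#Wikipage>"),
]

# The digest does not depend on the order in which triples are supplied.
same = canonical_digest(G, triples) == canonical_digest(G, list(reversed(triples)))
print(same)
```

Because every node carries a URI or literal value, sorting alone yields a stable byte sequence to sign; the difficulty discussed next arises only once blank nodes appear.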

Figure 3.4 illustrates a more complex example in which not all nodes in the graph are labelled; the blank nodes are written in square brackets. This example happens to represent an RDF collection. The difficulty arises when both the subject and the object of a triple are blank nodes: if several such triples exist, they can become indistinguishable from one another and must therefore be altered if the graph is to be suitably reordered for signing.
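The following sketch shows why sorting is no longer sufficient once blank nodes are involved. The two labellings below describe the same hypothetical chain of list cells, yet their sorted serialisations differ because blank node labels are arbitrary. The helper names `sorted_lines` and `rename` are ours, introduced only for illustration.

```python
REST = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#rest>"

def sorted_lines(triples):
    """Naive canonicalisation: sort the N-Triples-style lines."""
    return sorted(" ".join(t) + " ." for t in triples)

def rename(triples, mapping):
    """Apply a blank node renaming to every term of every triple."""
    return [tuple(mapping.get(term, term) for term in t) for t in triples]

# Two labellings of the same hypothetical three-cell chain.
labelling_1 = [("_:a", REST, "_:b"), ("_:b", REST, "_:c")]
labelling_2 = [("_:c", REST, "_:a"), ("_:a", REST, "_:b")]

# A renaming of blank nodes maps one labelling onto the other,
# so both describe the same RDF graph.
mapping = {"_:a": "_:c", "_:b": "_:a", "_:c": "_:b"}
same_graph = set(rename(labelling_1, mapping)) == set(labelling_2)

# Sorting alone nevertheless produces different serialisations.
same_bytes = sorted_lines(labelling_1) == sorted_lines(labelling_2)
print(same_graph, same_bytes)
```

A signature computed over either sorted serialisation would therefore fail to verify against the other, even though the two graphs are equivalent.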

Carroll’s solution to this problem is to modify the graph with meaningless changes. It defines a special property, c14n:true, which is always true, so triples with this predicate can be added to and removed from the graph without changing its meaning according to RDF Semantics [Hayes (2004)]. As a consequence, the RDF graph that is digitally signed differs from the original. While Figure 3.4 can be reliably canonicalised (see Appendix C.1.2), the more blank nodes a graph contains, the more likely Carroll’s algorithm is to fail.

If we want to find an extreme example where Carroll’s algorithm really does fail, we should consider a complex graph such as the Petersen graph [Holton and Sheehan (1993)]. Figure 3.5 shows one graphical representation of the Petersen graph with its ten nodes and fifteen edges (an example TriG serialisation can be found in Appendix C.1.2).

Since the Petersen graph, like an RDF graph with only blank nodes, can have many different representations based on its labelling, it can be extremely difficult to determine if two graphs are identical (isomorphic).
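To give an intuition for why such a graph is hard to relabel, the sketch below applies simple colour refinement to the Petersen graph. Colour refinement is a generic heuristic chosen here purely for illustration and is not Carroll’s algorithm; the function names are our own. It repeatedly distinguishes nodes by the colours of their neighbours. Since every node of the Petersen graph has exactly three neighbours, no node is ever distinguished from any other.

```python
def petersen_edges():
    """Return the fifteen edges of the Petersen graph on nodes 0..9."""
    outer = [(i, (i + 1) % 5) for i in range(5)]          # outer pentagon
    spokes = [(i, i + 5) for i in range(5)]               # pentagon to star
    inner = [(5 + i, 5 + (i + 2) % 5) for i in range(5)]  # inner pentagram
    return outer + spokes + inner

def refine_colours(n, edges):
    """Iteratively recolour nodes by (own colour, sorted neighbour colours)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    colour = {v: 0 for v in range(n)}
    for _ in range(n):
        signature = {
            v: (colour[v], tuple(sorted(colour[u] for u in adj[v])))
            for v in adj
        }
        palette = {sig: i for i, sig in enumerate(sorted(set(signature.values())))}
        refined = {v: palette[signature[v]] for v in adj}
        if refined == colour:
            break
        colour = refined
    return colour

petersen_classes = len(set(refine_colours(10, petersen_edges()).values()))
path_classes = len(set(refine_colours(3, [(0, 1), (1, 2)]).values()))
print(petersen_classes, path_classes)
```

On a three-node path the end nodes and the middle node receive different colours, whereas all ten Petersen nodes remain in a single class, so local structure alone gives no basis for ordering them.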

<urn:uuid:E192F360-226F-11DB-94B3-E05EDA46CF20> {
  wn20schema:NounWordSense rdfs:domain [
    a owl:Class ;
    owl:unionOf [
      rdf:first wn20schema:AdjectiveWordSense ;
      rdf:rest [
        rdf:first wn20schema:VerbWordSense ;
        rdf:rest () ;
      ] ;
    ]
  ] ;
}

Figure 3.4: Partially Labelled RDF Graph

McKay (1981) describes a more robust algorithm that solves the graph isomorphism problem [Köbler et al. (1993)] and can therefore cope with blank nodes and reliably relabel the Petersen graph (Figure 3.5).⁴⁶ An implementation of McKay’s algorithm is available in the nauty distribution.⁴⁷ While this algorithm satisfactorily creates a canonical representation of an arbitrary graph, unlike the algorithms of Carroll and of Sayers and Karp it has non-polynomial complexity [Miyazaki (1997)], which makes it less favourable in a Semantic Web environment.
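The cost of a fully general canonical labelling can be illustrated with a deliberately naive brute-force search. The sketch below is not McKay’s algorithm, and nauty is considerably more sophisticated; it simply tries every relabelling of an undirected graph and keeps the smallest sorted edge list. The function name `canonical_form` and the example graphs are our own assumptions.

```python
from itertools import permutations

def canonical_form(n, edges):
    """Return the smallest sorted edge list over all n! node relabellings."""
    best = None
    for perm in permutations(range(n)):
        relabelled = tuple(sorted(
            tuple(sorted((perm[u], perm[v]))) for u, v in edges
        ))
        if best is None or relabelled < best:
            best = relabelled
    return best

# Two drawings of the same five-node cycle with different node labels.
pentagon = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
pentagram = [(0, 2), (2, 4), (4, 1), (1, 3), (3, 0)]

# A different graph with the same number of edges: a triangle with a tail.
triangle_tail = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]

isomorphic = canonical_form(5, pentagon) == canonical_form(5, pentagram)
different = canonical_form(5, pentagon) != canonical_form(5, triangle_tail)
print(isomorphic, different)
```

The search is correct for any labelling but examines n! permutations, which is 120 for five nodes and already more than 3.6 million for the ten nodes of the Petersen graph.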

Figure 3.5: A Petersen Graph

⁴⁶ See Appendix C.1.2 for further details.

3.4.1.2 Semantic Issues

While DBin’s RDF digital signature solution provides a starting point for future implementations, its reliance on RDF Reification as the signature attachment mechanism is problematic. One major problem is that a digital signature attached as a reified statement applies only to that statement, not to the graph itself. As we noted in Section 3.3.2.2, there are also semantic problems.

Because reified triples are not part of the knowledge-base, they are not part of the underlying logic. If we consider an OWL DL knowledge-base with a number of reified digital signatures, basic DL subsumption is not possible over any reified statement. It is also true that any custom GCI axioms or Horn-clause rules would not be able to operate over reified statements. Even Semantic Web toolkits such as Jena 2⁴⁸ have to provide a specialised API to access reifications.
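The point can be made concrete with a small sketch that models a graph as a plain Python set of triples. The `ex:` names, the `ex:signature` property, and the `reify` helper are hypothetical and serve only to illustrate the structure of a reified statement; this is not DBin’s data model.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def reify(statement_id, s, p, o):
    """Return the four triples that describe (s, p, o) as an rdf:Statement."""
    return {
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", s),
        (statement_id, RDF + "predicate", p),
        (statement_id, RDF + "object", o),
    }

# Hypothetical statement and signature value.
graph = reify("_:stmt", "ex:alice", "ex:knows", "ex:bob")
graph.add(("_:stmt", "ex:signature", '"hypothetical-signature-value"'))

# The described triple itself is not asserted in the graph.
asserted = ("ex:alice", "ex:knows", "ex:bob") in graph
print(len(graph), asserted)
```

The graph holds five triples about the statement node, but none of them is the original triple, so anything that operates on asserted triples has nothing to match against.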

3.4.1.3 Serialisation Issues

Dumbill⁴⁹ and Cloran and Irwin (2005) both suggest that canonical serialised RDF can be used as the basis for RDF digital signatures. Dumbill’s FOAF signatures use PGP, while Cloran and Irwin (2005) take a more interoperable approach with XML Digital Signature.

Signing serialised RDF has the obvious benefit that it avoids the various canonical RDF issues mentioned earlier. RDF documents and their detached signatures (PGP and XML Digital Signature) can be stored on a personal website and verified at a later date. On the other hand, storing an RDF document and its signature in a triple store would yield a different serialisation at verification time, and would thus invalidate the signature.
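The fragility of signing a particular serialisation can be sketched as follows. An HMAC stands in for a real PGP or XML Digital Signature purely to keep the example self-contained; the key, the `example.org` triples, and the simulated triple store round trip (which merely reverses the line order) are all assumptions made for illustration.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical secret; a real signature would use a key pair

def sign(document: bytes) -> str:
    """Produce a detached signature over the exact bytes of the document."""
    return hmac.new(KEY, document, hashlib.sha256).hexdigest()

def verify(document: bytes, signature: str) -> bool:
    """Check a detached signature against the exact bytes of the document."""
    return hmac.compare_digest(sign(document), signature)

original = (
    b'<http://example.org/a> <http://example.org/p> "1" .\n'
    b'<http://example.org/a> <http://example.org/q> "2" .\n'
)
signature = sign(original)

# Simulated round trip through a triple store: the same triples come back,
# but in a different order.
reserialised = b"".join(reversed(original.splitlines(keepends=True)))

ok_original = verify(original, signature)
ok_reserialised = verify(reserialised, signature)
print(ok_original, ok_reserialised)
```

The stored file verifies, while the re-serialised document does not, even though both contain exactly the same two triples.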