availability and Website annotation. Ontology library systems such as Protégé [101] or SHOE [102] offer a limited selection of ontologies for download. The ontologies that are available are generally purpose-built, so there is often a reusability-usability trade-off problem as described by Klinker et al. (1991). The idea of a single consistent ontology for every domain sounds like an ideal solution, but such a wide-ranging, all-encompassing approach clearly won’t scale and can’t be enforced. Ontologies usually need to be developed and tailored for individual systems. Development of the AcontoWeb accommodation ontology showed that this can be a relatively complex and time-consuming process. Numerous axioms had to be specified in the accommodation ontology to facilitate the types of inference that were required. The time and cost of ontology development and the need for continuing maintenance can therefore be viewed as likely impediments to wide-scale adoption of the Semantic Web.
[101] http://protege.stanford.edu/plugins/owl/owl-library/index.html
[102] http://www.cs.umd.edu/projects/plus/SHOE/onts/
On the positive side, in certain commercial applications the potential profit and productivity gain from using well-structured, coordinated vocabulary specifications will outweigh the sunk costs of developing an ontology and the marginal costs of maintenance (Shadbolt, Hall & Berners-Lee 2006, p. 99). Shadbolt et al. (2006) assume that ontology building costs are spread across user communities, that the number of ontology engineers required increases as the log of the user community’s size, and that the amount of building time increases as the square of the number of engineers. Under these assumptions the effort involved per user in building ontologies for large communities gets very small very quickly, and in many areas the costs will be easy to recoup. These are reasonable assumptions for a basic model.
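As a rough illustration of this cost model, the per-user effort can be sketched as below. Only the growth rates come from the model described above; the natural logarithm, the proportionality constants of 1 and the function names are assumptions made here for illustration.

```python
import math


def engineers_required(community_size: int) -> float:
    """Engineers grow as the log of the community size (constant of 1 assumed)."""
    return math.log(community_size)


def building_time(community_size: int) -> float:
    """Building time grows as the square of the number of engineers."""
    return engineers_required(community_size) ** 2


def effort_per_user(community_size: int) -> float:
    """Total building time spread evenly across the user community."""
    return building_time(community_size) / community_size


if __name__ == "__main__":
    # Print the per-user effort for communities of increasing size.
    for n in (100, 10_000, 1_000_000):
        print(f"{n:>9} users: {effort_per_user(n):.6f} units of effort per user")
```

Because the numerator grows only as the square of a logarithm while the denominator grows linearly, the per-user figure falls steadily once the community is larger than a handful of users.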
Data annotation also remains problematic from a practical perspective. As yet there are few means to routinely and effortlessly generate Semantic Web annotations. The RDF and OWL formats are designed for machines, so Web authors can no longer embed information in plain English. The information needs to be formatted as RDF triples, which are separate from any natural language representations. These formats have seen extremely low adoption rates, so there is a real need for representations that are easier to translate to and from natural language. The AcontoWeb annotation tool shows that this is quite achievable for an individual domain. AcontoWeb accepts user input from accommodation providers and translates it into RDF instance data consistent with the accommodation ontology. The RDF markup is then embedded into readily extractable comment tags in an HTML file. AcontoWeb demonstrated that this approach works well in a managed portal environment with well-defined functionality and limited Web access. It can be argued, though (e.g. Hepp 2006), that embedding RDF markup within HTML code violates the "one fact in one place" paradigm, which has contributed so much to data consistency since Codd (1970) introduced it. The duplication can lead to inaccurate data if an annotator fails to update the markup when the human-readable content changes.
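The general idea of translating form input into triples and placing them in an extractable HTML comment can be sketched as follows. This is not the AcontoWeb implementation: the namespace, the property names, the RDF-BEGIN/RDF-END markers and the N-Triples-style serialisation are all assumptions chosen to keep the example self-contained.

```python
import re

# Hypothetical namespace; not the actual AcontoWeb accommodation ontology.
ACC = "http://example.org/accommodation#"


def to_triples(resort_id: str, fields: dict) -> list:
    """Turn form fields into N-Triples-style lines with plain literal objects."""
    subject = f"<{ACC}{resort_id}>"
    triples = []
    for name, value in fields.items():
        # Escape backslashes, quotes and newlines so each triple stays on one line.
        literal = (
            str(value)
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        triples.append(f'{subject} <{ACC}{name}> "{literal}" .')
    return triples


def embed(html_body: str, triples: list) -> str:
    """Place the triples in a marked HTML comment ahead of the readable content."""
    block = "\n".join(triples)
    if "--" in block:
        raise ValueError("'--' is not allowed inside an HTML comment")
    return f"<!-- RDF-BEGIN\n{block}\nRDF-END -->\n{html_body}"


def extract(page: str) -> list:
    """Recover the embedded triples from a page produced by embed()."""
    match = re.search(r"<!-- RDF-BEGIN\n(.*?)\nRDF-END -->", page, re.DOTALL)
    if match is None:
        return []
    return [line for line in match.group(1).split("\n") if line]


if __name__ == "__main__":
    # Invented sample data for a hypothetical provider.
    fields = {"name": "Sea Breeze Hostel", "roomCount": 12}
    page = embed("<p>Sea Breeze Hostel has 12 rooms.</p>", to_triples("resort1", fields))
    print(page)
```

In the sample page the room count appears twice, once in the comment and once in the visible paragraph, which is exactly the duplication that the "one fact in one place" objection is concerned with.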
More flexible approaches to content creation are required if wide-scale adoption of the Semantic Web is to occur. Human Language Technology (HLT) [103] and Latent Semantic Indexing (LSI) [104] are promising alternatives. These techniques can place data into a semantic structure using an algorithmic approach. Hepp (2006) states that this raises the obvious question of whether physical annotation of data needs to occur at all if techniques such as HLT or LSI can be applied at query run time. The annotation of dynamic content also remains a problem, since most annotators work for static pages only. A possible solution is to leave RDF metadata in databases and generate dynamic Web pages from it. This is how query results are displayed in AcontoWeb: the results page is dynamically generated from instance data about accommodation resorts stored in a backend database.
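A minimal sketch of generating a results page from stored instance data is given below. It assumes a single three-column triple table in an in-memory SQLite database; the table layout, the predicate names and the sample rows are invented for illustration and do not describe the actual AcontoWeb backend.

```python
import sqlite3
from html import escape


def build_demo_store() -> sqlite3.Connection:
    """Create an in-memory triple table holding invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triple (subject TEXT, predicate TEXT, object TEXT)")
    conn.executemany(
        "INSERT INTO triple VALUES (?, ?, ?)",
        [
            ("resort1", "name", "Sea Breeze Hostel"),
            ("resort1", "locatedIn", "Byron Bay"),
            ("resort2", "name", "Hilltop Lodge"),
            ("resort2", "locatedIn", "Katoomba"),
        ],
    )
    return conn


def render_results(conn: sqlite3.Connection, location: str) -> str:
    """Build an HTML results page for all resorts stored at the given location."""
    rows = conn.execute(
        "SELECT n.object FROM triple AS l "
        "JOIN triple AS n ON n.subject = l.subject "
        "WHERE l.predicate = 'locatedIn' AND l.object = ? "
        "AND n.predicate = 'name' ORDER BY n.object",
        (location,),
    ).fetchall()
    # Escape stored values so they cannot break the generated markup.
    items = "".join(f"<li>{escape(name)}</li>" for (name,) in rows)
    return (
        f"<html><body><h1>Accommodation in {escape(location)}</h1>"
        f"<ul>{items}</ul></body></html>"
    )


if __name__ == "__main__":
    print(render_results(build_demo_store(), "Byron Bay"))
```

Because the page is built from the stored data at request time, there is no separate annotated copy to keep in step with the visible content.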
6.2.2 Level of Ontology and Annotation Richness that can be Obtained
Knowledge representation is a technique with mathematical roots in the work of Codd (op. cit.), in which the aim is to translate information that humans represent in natural language into sets of tables that use well-defined schemas to specify what can be entered in rows and columns (McCool 2005, p. 86). The technique led to the relational database revolution of the 1980s and also forms the basis of OWL ontologies. The problem with these forms of knowledge representation is that they create a fundamental barrier in terms of richness of representation, as well as creation and maintenance, compared to the written language that people use and HTML incorporates. In the OWL DL AcontoWeb accommodation ontology, cardinality constraints could not be included in the class restrictions for destination classifications without the ontology changing into OWL Full. It was not possible, for example, using the OWL DL language, to say that a backpacker location has a minimum of 3 pubs. The class restriction relating to pubs could only express the fact that a backpacker location has at least some pubs.
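The difference between the two kinds of restriction can be illustrated with a small check over explicit instance data. This is only a closed-world counting sketch, not how an OWL reasoner evaluates restrictions, and the property name hasVenue, the class names and the sample individuals are hypothetical.

```python
def count_values(individual: dict, prop: str, cls: str) -> int:
    """Count the values of a property that are typed with the given class."""
    return sum(1 for value in individual.get(prop, []) if cls in value["types"])


def satisfies_some(individual: dict, prop: str, cls: str) -> bool:
    """'At least some' restriction: one or more values of the class exist."""
    return count_values(individual, prop, cls) >= 1


def satisfies_min(individual: dict, prop: str, cls: str, minimum: int) -> bool:
    """Minimum-cardinality restriction: at least `minimum` values of the class."""
    return count_values(individual, prop, cls) >= minimum


# Invented sample location with two pubs and one cafe.
location = {
    "hasVenue": [
        {"id": "pub1", "types": {"Pub"}},
        {"id": "pub2", "types": {"Pub"}},
        {"id": "cafe1", "types": {"Cafe"}},
    ]
}

if __name__ == "__main__":
    print(satisfies_some(location, "hasVenue", "Pub"))
    print(satisfies_min(location, "hasVenue", "Pub", 3))
```

A location with two pubs passes the "at least some pubs" test but fails the "minimum of 3 pubs" test, so an ontology limited to the first form would classify it as a backpacker location even though the intended definition would not.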
OWL Full is more expressive than OWL DL but still suffers from an inability to represent exceptions to rules and the contexts in which they are valid. Depending on the level of expressiveness required, there can be a need for languages more powerful than RDF and OWL. SWRL [105] is one such language that builds on OWL. The more expressive markup languages like SWRL allow developers to write application-specific declarative knowledge, and can improve the ontology and annotation richness of information on the Semantic Web.
[103] http://www.mitre.org/work/ird_human_language.html
[104] http://www.cs.utk.edu/~lsi/