
In short, the primary goal of the implementation and the optimisation attempts described in this thesis can be considered achieved. However, developing the implementation and running the experiments was very time-consuming, and there was insufficient time to implement every potential idea. Many useful features have yet to be implemented, such as:

• Extending the implemented ∆-query language to make it more user-friendly by allowing quantification over multiple variables, and similarly for the collection, separation and recursion constructs.

• Improving the library function, in particular to allow multiple or user-defined libraries.

• Extending the implemented ∆-query language with path expressions, which are typically included in other approaches to semi-structured databases and are very useful in practice. In principle, path expressions could be implemented by rewriting them into ∆-queries according to the definitions in [61], but a direct implementation should be more efficient.

• Extending the implemented ∆-query language with update queries.

• A more user-friendly interface for inputting queries and WDB, as well as for outputting query results, in particular graphical visualisation of WDB and query results (developing a special WDB browser, as well as an editor for WDB files).

Additionally, suitable techniques should be developed for creating WDB, taking into account its hyperset-theoretic character:

• Using WDB schemas in the context of the hyperset approach to impose restrictions on the structure of WDB, just as in the relational approach but not necessarily so rigidly. In fact, enforcing structure makes queries easier to write and can also serve to eliminate unintended redundancies in set equations which could otherwise arise from poor WDB design.

Furthermore, although some suggestions towards efficiency were made here, much work remains towards developing a practically efficient implementation:

• Adapting known optimisation techniques and developing new ones, such as indexing, hashing and other data structures that support efficient searching as described in [73], for the case of semi-structured data. Redundancies in set equations arising during computation should be regularly eliminated, thus allowing queries to be written without explicitly using the canonisation query. In this case equality between sets trivially becomes the identity relation rather than the bisimulation relation. Also, identical query calls should be executed only once.

• Dealing with redundancies in various circumstances by developing suitable techniques and methodology, e.g. for redundancies (bisimilarities) arising from local updates in a WDB file (answering questions such as: are redundancies arising in such a local way easy to eliminate? under which conditions? etc.), or from mirroring WDB sites, etc.

• Further improving the bisimulation engine, transforming it from an imitational version into a more realistic one (a Web service) assuming several levels (granularity) of locality (WDB files, WDB sites, the whole WDB), and extending the range of experiments with this engine.

• Adopting known techniques [24, 25] and developing new ones for optimising bisimulation which, for example, may take advantage of WDB schemas (see above).
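To make the remark about eliminating redundancies concrete, the following is a minimal Python sketch, not the bisimulation engine of this thesis. It assumes that a WDB is given as a dictionary mapping each set name to a list of (label, set name) pairs, one set equation per name, and that every referenced name has its own equation. The function names bisimulation_classes and canonise are hypothetical. Bisimilar set names are found by naive partition refinement and then merged, after which equality between the remaining names is plain identity.

```python
def bisimulation_classes(equations):
    """Map each set name to a class id; equal ids mean bisimilar sets.

    Naive partition refinement: start with one class containing every
    name and split classes until the partition no longer changes.
    """
    block = {name: 0 for name in equations}
    while True:
        ids = {}
        new_block = {}
        for name, members in equations.items():
            # Signature: the labelled classes of the members of this set.
            signature = frozenset((label, block[child]) for label, child in members)
            # Including the old class guarantees that classes only split.
            key = (block[name], signature)
            new_block[name] = ids.setdefault(key, len(ids))
        if len(ids) == len(set(block.values())):
            return new_block  # no class was split: the partition is stable
        block = new_block


def canonise(equations):
    """Return (canonical equations, map from each name to its representative)."""
    block = bisimulation_classes(equations)
    chosen = {}
    for name in sorted(equations):
        chosen.setdefault(block[name], name)  # first name in sorted order
    representative = {name: chosen[block[name]] for name in equations}
    canonical = {
        name: sorted({(label, representative[child]) for label, child in members})
        for name, members in equations.items()
        if representative[name] == name
    }
    return canonical, representative


# Hypothetical example: b and c are both empty, d and e are both cyclic.
wdb = {
    "a": [("x", "b"), ("x", "c")],
    "b": [],
    "c": [],
    "d": [("y", "d")],
    "e": [("y", "e")],
}
canonical, representative = canonise(wdb)
```

In this example the names c and e are merged into b and d respectively, so a keeps a single member. Known bisimulation algorithms are considerably more efficient than this quadratic-style loop; the sketch only illustrates what removing redundant set equations means.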

There is great scope for further theoretical and practical work. In summary, this could mean developing a full-fledged WDB management system as well as WDB design techniques and other methodologies based on the hyperset approach. Of course, the hyperset approach itself could be developed further, e.g. it can be extended to include standard datatypes such as integers, reals and strings as atomic data or label values, with arithmetical and other operations over them (completely lacking in the current version of ∆). Also, multi-hypersets [44], records, lists, etc. could be allowed. Another version of the ∆ language capturing LogSpace [40, 42] (currently for well-founded sets only) could either be implemented in its present form or first be theoretically extended to the case of hypersets. In any case, working on the theoretical level in various directions while simultaneously developing more practically oriented implementations, as in this thesis, seems a fruitful style of research.

Appendix

A.1 Implemented BNF grammar of ∆-query language

The grammar of the implemented ∆-language is presented in the Extended Backus-Naur Form (EBNF) metasyntax notation, which allows, for example, the repetition of syntactic categories to be defined using * or + (unlike regular BNF, which does not have these features).

For example, the EBNF production rule for <declarations> in Section A.1 defines an infinite number of possible forks, with any number of leaves labelled by <declaration>, each separated by the terminal leaf labelled by ",".

The EBNF notation (used here to express the ∆-language grammar) defines production rules as sequences of terminals (symbols) or non-terminals,

"xxx" - Terminal

<yyy> - Non-terminal

where production rules are constructed (from those terminals or non-terminals) according to the following rules,

Parentheses, () - Grouping

Vertical bar, | - Alternation

Square brackets, [] - Optional

Kleene star, * - Repeat 0 or more times

Kleene plus, + - Repeat 1 or more times
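As an illustration of how the Kleene star maps onto a loop in a hand-written parser, the following is a minimal Python sketch of a recursive-descent parser for the <variables> and <variable> rules listed under Declarations below. It is not the parser of the implementation described in this thesis. The function names are hypothetical, and since <identifier> is not defined in this grammar, the sketch assumes that identifiers consist of letters, digits and underscores.

```python
import re

# Assumption: identifiers are letters, digits and underscores
# (<identifier> itself is not defined in the grammar).
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|,)")


def tokenize(text):
    """Split the input into identifier/keyword tokens and commas."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_variable(tokens, pos):
    """<variable> ::= ( "set" <set variable> | "label" <label variable> )"""
    if (
        pos + 1 >= len(tokens)
        or tokens[pos] not in ("set", "label")
        or tokens[pos + 1] == ","
    ):
        raise SyntaxError(f"expected a variable at token {pos}")
    return (tokens[pos], tokens[pos + 1]), pos + 2


def parse_variables(tokens, pos=0):
    """<variables> ::= <variable> ( "," <variable> )*"""
    first, pos = parse_variable(tokens, pos)
    result = [first]
    # The Kleene star becomes a loop that runs zero or more times.
    while pos < len(tokens) and tokens[pos] == ",":
        item, pos = parse_variable(tokens, pos + 1)
        result.append(item)
    return result, pos


tokens = tokenize("set x, label l, set y")
variables, end = parse_variables(tokens)
```

Here variables holds one (kind, name) pair per declared variable and end is the position just after the last consumed token. A trailing comma raises SyntaxError, because each "," must be followed by another <variable>.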

Top level commands

<top level command> ::=

( "library" <library command> | <query> | "exit" ) ";"

<query> ::=

"boolean query" <delta-formula> | "set query" <delta-term>

Library commands

<library command> ::= "add" <declarations> | "list" [ "verbose" ]
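The top level and library rules can be read as a simple dispatch on the leading keywords of a command. The following Python sketch illustrates this under stated assumptions; it is not the command loop of the implementation. The function name classify_command is hypothetical, the command is split on whitespace purely for illustration, and the remainder of the command (the declarations, ∆-term or ∆-formula) is returned unparsed.

```python
def classify_command(text):
    """Classify one top level command by its leading keywords.

    Returns (kind, rest), where rest is the list of remaining
    whitespace-separated words, left unparsed.
    """
    command = text.strip()
    if not command.endswith(";"):
        raise SyntaxError('a top level command must end with ";"')
    words = command[:-1].split()
    if words == ["exit"]:
        return ("exit", [])
    if words[:2] == ["boolean", "query"] and len(words) > 2:
        return ("boolean query", words[2:])
    if words[:2] == ["set", "query"] and len(words) > 2:
        return ("set query", words[2:])
    if words[:2] == ["library", "add"] and len(words) > 2:
        return ("library add", words[2:])
    if words in (["library", "list"], ["library", "list", "verbose"]):
        return (" ".join(words), [])
    raise SyntaxError(f"not a top level command: {text!r}")


kind, rest = classify_command("library list verbose;")
```

A full implementation would hand rest to the parsers for <declarations>, <delta-term> or <delta-formula> rather than stopping at the keyword level.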

Declarations

<declarations> ::= <declaration> ( "," <declaration> )*

<declaration> ::=

<set constant declaration> | <label constant declaration> | <set query declaration> | <boolean query declaration>

<set constant declaration> ::=

"set constant" <set constant> ("be"|"=") <delta-term>

<label constant declaration> ::=

"label constant" <label constant> ("be"|"=") <label value>

<set query declaration> ::=

"set query" <set query name> "(" <variables> ")" ("be"|"=") <delta-term>

<boolean query declaration> ::=

"boolean query" <boolean query name> "(" <variables> ")" ("be"|"=") <delta-formula>

<variables> ::= <variable> ( "," <variable> )*

<variable> ::= ( "set" <set variable> | "label" <label variable> )

<parameters> ::= <parameter> ( "," <parameter> )*

<parameter> ::= ( <delta-term> | <label> )

<boolean query name> ::= <identifier>

<set query name> ::= <identifier>

∆-terms

<delta-term> ::= <set variable> | <set constant> | <set name> | <atomic value> | <enumerate> | <union> | "(" <multiple union> ")" | <collect> | <separate> | <transitive closure> | <recursion> | <decoration> | <if-else term> | <set query call> |

<delta-term with declarations>

<set name> ::= <URI> "#" <simple set name>

<atomic value> ::= """ <identifier> """

<enumerate> ::= "{" <labelled terms> "}"

<union> ::= ( "U" | "union" ) <delta-term>

<multiple union> ::=

<delta-term> ( ( "U" | "union" ) <delta-term> )*

<collect> ::=

"collect" "{" <labelled term> ( "where" | "|" ) <variable pair> ("in"|"<-") <delta-term> [ "and" <delta-formula> ] "}"

<separate> ::=

"separate" "{" <variable pair> ("in"|"<-") <delta-term> ( "where" | "|" ) <delta-formula> "}"

<transitive closure> ::=

( "tc" | "TC" | "transitiveclosure" ) <delta-term>

<recursion> ::=

"recursion" <set variable> "{" <variable pair> ("in"|"<-") <delta-term> ( "where" | "|" ) <delta-formula> "}"

<decoration> ::= "decorate" "(" <delta-term> "," <delta-term> ")"

<if-else term> ::= "if" <delta-formula> "then" <delta-term> "else" <delta-term> "fi"

<set query call> ::= "call" <set query name> "(" <parameters> ")"

<delta-term with declarations> ::=

"let" <declarations> "in" <delta-term> "endlet"

<URI> ::= ( <web prefix> | <local prefix> ) <file path>

<web prefix> ::= "http://" <host> "/" [ "~" <identifier> "/" ]

<local prefix> ::= "file://" ( (A-Z) | (a-z) ) ":/"

<host> ::= <identifier> [ "." <host> ]

<file path> ::= <identifier> ( "/" <file path> | <extension> )

<extension> ::= ".xml"

<simple set name> ::= <identifier>
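The rules for <set name>, <URI>, <host> and <file path> describe a regular language, so they can be checked with a single regular expression. The following Python sketch shows one possible transcription; it is not taken from the implementation. It assumes that identifiers consist of letters, digits and underscores (the grammar does not define <identifier>) and that the optional user directory in <web prefix> is introduced by a tilde. The names SET_NAME and split_set_name are hypothetical, as are the example addresses.

```python
import re

# Assumption: <identifier> is letters, digits and underscores.
IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
# <host> ::= <identifier> [ "." <host> ]
HOST = rf"{IDENT}(?:\.{IDENT})*"
# <web prefix> ::= "http://" <host> "/" [ "~" <identifier> "/" ]
WEB_PREFIX = rf"http://{HOST}/(?:~{IDENT}/)?"
# <local prefix> ::= "file://" ( (A-Z) | (a-z) ) ":/"
LOCAL_PREFIX = r"file://[A-Za-z]:/"
# <file path> ::= <identifier> ( "/" <file path> | ".xml" )
FILE_PATH = rf"{IDENT}(?:/{IDENT})*\.xml"
# <set name> ::= <URI> "#" <simple set name>
SET_NAME = re.compile(
    rf"(?P<uri>(?:{WEB_PREFIX}|{LOCAL_PREFIX}){FILE_PATH})#(?P<name>{IDENT})"
)


def split_set_name(text):
    """Return (URI, simple set name), or None if text is not a <set name>."""
    match = SET_NAME.fullmatch(text)
    if match is None:
        return None
    return match.group("uri"), match.group("name")


web = split_set_name("http://www.example.org/~user/data/db.xml#root")
local = split_set_name("file://C:/wdb/bib.xml#BibDB")
bad = split_set_name("http://www.example.org/db.txt#root")
```

The first two calls return the URI and the simple set name as a pair, while the third returns None because the file path does not end in ".xml".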

∆-formulas

<delta-formula> ::= <atomic formula> | "(" <conjunction> ")" | "(" <disjunction> ")" | "(" <quasi-implication> ")" | <quantified formula> | <negated formula> | <if-else formula> |

<delta-formula with declarations>

<atomic formula> ::=

<equality> | <label relationship> | <membership> | <boolean query call> | "true" | "false"

<equality> ::= <set equality> | <label equality>

<set equality> ::= <delta-term> "=" <delta-term>

<label equality> ::=

<wildcard label> ::=

["*"] ( <label variable> | <label constant> ) ["*"] | "’" ["*"] <identifier> ["*"] "’"

<label relationship> ::= <label> "<" <label> | <label> ">" <label> | <label> "<=" <label> | <label> ">=" <label>

<membership> ::= <labelled term> ("in"|"<-") <delta-term>

<boolean query call> ::= "call" <boolean query name> "(" <parameters> ")"

<if-else formula> ::= "if" <delta-formula> "then" <delta-formula> "else" <delta-formula> "fi"

<delta-formula with declarations> ::=

"let" <declarations> "in" <delta-formula> "endlet"

<conjunction> ::= <delta-formula> ( "and" <delta-formula> )*

<disjunction> ::= <delta-formula> ( "or" <delta-formula> )*

<quasi-implication> ::= <delta-formula> ( <quasi-implication connective> <delta-formula> )*

<quasi-implication connective> ::=

"<=" | "=>" | "implies" | "iff" | "<=>"

<quantified formula> ::= <forall> <delta-formula> | <exists> <delta-formula> |

<forall> ::=

"forall" <variable pair> ("in"|"<-") <delta-term> [ "." ]

<exists> ::=

"exists" <variable pair> ("in"|"<-") <delta-term> [ "." ]

Variables, constants, literals etc.

<label> ::= <label variable> | <label value> | <label constant>

<label variable> ::= <identifier>

<label constant> ::= <identifier>

<label value> ::= "’" <identifier> "’"

<set variable> ::= <identifier>

<set constant> ::= <identifier>

<labelled terms> ::= <labelled term> ( "," <labelled term> )*

<labelled term> ::= <label> ":" <delta-term>

<variable pair> ::= <variable pair label> ":" <variable pair term>

<variable pair label> ::= <label variable> | <label value>

<variable pair term> ::= <set variable>