
In this chapter we will investigate the impact of the list constructor on perhaps the most common class of relational dependencies. Functional dependencies are introduced in the context of null, flat, record-valued and list-valued attributes. The goal of this chapter is to extend the solutions to problems such as axiomatisation, implication, normalisation and its justification, and decomposition algorithms for the class of FDs from the RDM to the presence of lists.

It turns out that Armstrong's original axioms, when generalised to the presence of lists, are still sound. We will prove that these generalised Armstrong axioms are also complete. Based on this axiomatisation we investigate the implication problem for the class of FDs in the presence of lists. To this end, an alternative view of FDs is proposed first, which is based on the representation theorem for Brouwerian algebras. The main result is a linear-time, provably correct algorithm for deciding implication.

Finally, we turn to normalisation issues in the presence of lists. The Nested List normal form (NLNF) is proposed as a normal form for nested attributes and the class of FDs in the presence of lists. NLNF is strictly weaker than a simple extension of Boyce-Codd normal form. The key reason is an argument that non-maximal join-irreducible subattributes do not cause redundancies. NLNF is characterised and thereby justified in many ways, for instance by the absence of redundancies, simpler integrity checking and the absence of many forms of update anomalies. The results generalise well-known results from the relational data model, thus giving a formal semantic justification for the normal form proposal.

To conclude, a lossless decomposition of nested attributes into subattributes in NLNF is considered. A provably-correct algorithm is proposed that works, in general, in exponential time in the size of the underlying nested attribute and the number of FDs given.

Some results of this chapter can be found in the literature. The axiomatisation of FDs appears in [146], and the Nested List normal form proposal and its semantic justification are published in [143].

3.1. AXIOMATISATION Sebastian Link

3.1 Axiomatisation

We define FDs, introduce a generalisation of Armstrong's axioms and prove that these rules are sound and complete for the implication of FDs in the presence of lists.

3.1.1 Definition of FDs

Given that a nested attribute $N$ generalises the notion of a relation schema, the subattribute relationship extends the subset relationship and the projection function subsumes the restriction of tuples, we obtain the following definition.

Definition 3.1. Let $N \in \mathcal{NA}$ be a nested attribute. A functional dependency on $N$ is an expression of the form $X \to Y$ where $X, Y \in \mathit{Sub}(N)$. A set $r \subseteq \mathit{dom}(N)$ satisfies the functional dependency $X \to Y$ on $N$, denoted by $\models_r X \to Y$, if and only if $\pi^N_Y(t_1) = \pi^N_Y(t_2)$ whenever $\pi^N_X(t_1) = \pi^N_X(t_2)$ for any $t_1, t_2 \in r$ holds. □

Given the relation schema $R = \{A, B, C\}$, the FD $A \to B$ on $R$ can be read as the FD $R(A, \lambda, \lambda) \to R(\lambda, B, \lambda)$ on the record-valued attribute $R(A, B, C)$. We consider the following examples to become more acquainted with FDs defined on nested attributes.

EXAMPLE 3.1. Suppose the nested attribute Pubcrawl(Person,Visit[Drink(Beer,Pub)]) is used to store sequences of beers consumed by a person together with the pub in which the person had that beer. A snapshot $r$ of such a database may look as follows:

{ (Sven, [(Lübzer, Deanos), (Kindl, Highflyers)]),
  (Sven, [(Kindl, Deanos), (Lübzer, Highflyers)]),
  (Klaus-Dieter, [(Guiness, Irish Pub), (Speights, 3Bar), (Guiness, Irish Pub)]),
  (Klaus-Dieter, [(Kolsch, Irish Pub), (Bonnsch, 3Bar), (Guiness, Irish Pub)]),
  (Sebastian, [ ]) }

It is then obvious that $\models_r$ Pubcrawl(Person) $\to$ Pubcrawl(Visit[Drink(Pub)]) holds. This means that in this particular snapshot, the same person always visits the same pubs in the same order. □
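The satisfaction condition of Definition 3.1 can be checked mechanically on the snapshot of Example 3.1. The following is a minimal sketch, not part of the original text: lists are modelled as Python tuples, and the projection helpers `proj_person`, `proj_pubs` and `proj_beers` as well as the function `satisfies` are hypothetical names introduced only for illustration.

```python
from itertools import combinations

# Snapshot r of Pubcrawl(Person, Visit[Drink(Beer, Pub)]) from Example 3.1.
# Lists are modelled as tuples so that projected values can be compared.
r = [
    ("Sven", (("Lübzer", "Deanos"), ("Kindl", "Highflyers"))),
    ("Sven", (("Kindl", "Deanos"), ("Lübzer", "Highflyers"))),
    ("Klaus-Dieter", (("Guiness", "Irish Pub"), ("Speights", "3Bar"), ("Guiness", "Irish Pub"))),
    ("Klaus-Dieter", (("Kolsch", "Irish Pub"), ("Bonnsch", "3Bar"), ("Guiness", "Irish Pub"))),
    ("Sebastian", ()),
]

def proj_person(t):
    """Projection onto Pubcrawl(Person)."""
    return t[0]

def proj_pubs(t):
    """Projection onto Pubcrawl(Visit[Drink(Pub)]): keep the pub of every visit, in order."""
    return tuple(pub for _beer, pub in t[1])

def proj_beers(t):
    """Projection onto Pubcrawl(Visit[Drink(Beer)]): keep the beer of every visit, in order."""
    return tuple(beer for beer, _pub in t[1])

def satisfies(r, proj_x, proj_y):
    """True iff any two tuples that agree under proj_x also agree under proj_y."""
    return all(
        proj_y(t1) == proj_y(t2)
        for t1, t2 in combinations(r, 2)
        if proj_x(t1) == proj_x(t2)
    )

print(satisfies(r, proj_person, proj_pubs))   # Person -> Visit[Drink(Pub)]
print(satisfies(r, proj_person, proj_beers))  # Person -> Visit[Drink(Beer)]
```

The second check fails on this snapshot because the two tuples for Sven list the same beers in a different order, which illustrates that the order of list elements matters for satisfaction.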

EXAMPLE 3.2. Suppose we have a database for storing the prime factorisation of positive integers $n$. The nested attribute

Factor(Integer,Prime[Number],Exponent[Number])

would be suitable where Prime[Number] is the list of different prime factors of $n$ in increasing order, and Exponent[Number] the list of exponents for each corresponding prime factor in Prime[Number]. A small snapshot of the database could be

{ (12, [2,3], [2,1]), (35, [5,7], [1,1]), (37, [37], [1]), ... }

The fundamental theorem of number theory states that every positive integer has a unique prime factorisation. This means that the nested attribute Factor(Integer,Prime[Number],Exponent[Number]) carries the following semantic information in terms of functional dependencies:

- Factor(Integer) $\to$ Factor(Prime[Number],Exponent[Number]),

- Factor(Prime[Number],Exponent[Number]) $\to$ Factor(Integer).

Further examples of FDs which every snapshot of this database satisfies are

- Factor(Prime[$\lambda$]) $\to$ Factor(Exponent[$\lambda$]) and

- Factor(Exponent[$\lambda$]) $\to$ Factor(Prime[$\lambda$]).

Informally, they state that the number of different prime factors determines the number of exponents, and vice versa. □
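The four FDs of Example 3.2 can be tested on a generated snapshot. The sketch below is an illustration under stated assumptions: the projection onto a list-valued subattribute of the form L[$\lambda$] is modelled by the length of the list, following the informal reading given above, and `factorise`, `satisfies` and the projection lambdas are hypothetical helpers that do not appear in the original text.

```python
def factorise(n):
    """Return (primes, exponents) of n >= 2 by trial division, primes in increasing order."""
    primes, exponents = [], []
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            primes.append(p)
            exponents.append(e)
        p += 1
    if n > 1:  # the remaining factor is itself prime
        primes.append(n)
        exponents.append(1)
    return tuple(primes), tuple(exponents)

# Snapshot of Factor(Integer, Prime[Number], Exponent[Number]) for n = 2, ..., 199.
r = [(n, *factorise(n)) for n in range(2, 200)]

def satisfies(r, proj_x, proj_y):
    """True iff tuples with equal proj_x value always have equal proj_y value."""
    seen = {}
    for t in r:
        if seen.setdefault(proj_x(t), proj_y(t)) != proj_y(t):
            return False
    return True

integer = lambda t: t[0]               # Factor(Integer)
primes_exps = lambda t: (t[1], t[2])   # Factor(Prime[Number], Exponent[Number])
prime_shape = lambda t: len(t[1])      # Factor(Prime[λ]): only the list length survives
exp_shape = lambda t: len(t[2])        # Factor(Exponent[λ])

print(satisfies(r, integer, primes_exps))
print(satisfies(r, primes_exps, integer))
print(satisfies(r, prime_shape, exp_shape))
print(satisfies(r, exp_shape, prime_shape))
```

By contrast, Factor(Prime[$\lambda$]) does not determine Factor(Exponent[Number]) on this snapshot: 2 and 4 both have a single prime factor but different exponent lists.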

EXAMPLE 3.3. Consider the nested attribute $N$ for the GenBank in Example 2.2. The set $\Sigma$ of FDs that were informally described in Section 1.2.2 can now be specified formally as follows:

1. DNA(Origin[Base]) $\to$ DNA(Count(A,C,G,T)),
2. DNA(Count(A,C,G,T)) $\to$ DNA(Origin[$\lambda$]),
3. DNA(Origin[$\lambda$],Count(A,C,G)) $\to$ DNA(Count(T)),
   DNA(Origin[$\lambda$],Count(A,C,T)) $\to$ DNA(Count(G)),
   DNA(Origin[$\lambda$],Count(A,G,T)) $\to$ DNA(Count(C)),
   DNA(Origin[$\lambda$],Count(C,G,T)) $\to$ DNA(Count(A)),
4. DNA(Origin[Base],Gene(Start,End)) $\to$ DNA(Gene(Sub[Nucleo])),
5. DNA(Gene(Sub[Nucleo])) $\to$ DNA(Gene(Translation[Amino])),
6. DNA(Gene(Sub[$\lambda$])) $\to$ DNA(Gene(Translation[$\lambda$])),
   DNA(Gene(Translation[$\lambda$])) $\to$ DNA(Gene(Sub[$\lambda$])),
7. DNA(Gene(Start,Sub[$\lambda$])) $\to$ DNA(Gene(End)),
8. DNA(Gene(End,Sub[$\lambda$])) $\to$ DNA(Gene(Start)),
9. DNA(Gene(Start,End)) $\to$ DNA(Gene(Sub[$\lambda$])). □

3.1.2 Implication and Derivation

We will use this section to introduce the notions of semantic implication and syntactical derivation for classes of data dependencies (and with respect to a given set of inference rules). In what follows, $\mathcal{C}$ denotes a certain class of dependencies, for example functional dependencies in the presence of null, flat, record- and list-valued attributes.

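Definition 3.2. Let $N$ be some nested attribute, $\Sigma$ be a finite set of dependencies in $\mathcal{C}$ whose elements are all defined on $N$, and $\tau$ a dependency of class $\mathcal{C}$ on $N$. We say that $\tau$ is implied by $\Sigma$ ($\Sigma$ implies $\tau$, or $\tau$ follows from $\Sigma$), denoted by $\Sigma \models \tau$, if and only if every $r \subseteq \mathit{dom}(N)$ satisfying all $\sigma \in \Sigma$ also satisfies $\tau$. We say that $\tau$ is implied by $\Sigma$ ($\Sigma$ implies $\tau$, or $\tau$ follows from $\Sigma$) in the finite sense, denoted by $\Sigma \models_f \tau$, if and only if every finite $r \subseteq \mathit{dom}(N)$ satisfying all $\sigma \in \Sigma$ also satisfies $\tau$. The (finite) semantic hull $\Sigma^*_{\mathcal{C}}$ ($\Sigma^*_{\mathrm{fin},\mathcal{C}}$) of $\Sigma$ in $\mathcal{C}$ on $N$ is the set of all dependencies $\tau$ in $\mathcal{C}$ on $N$ implied by $\Sigma$ (in the finite sense), i.e. $\Sigma^*_{\mathcal{C}} = \{\tau \in \mathcal{C} \mid \Sigma \models \tau\}$ ($\Sigma^*_{\mathrm{fin},\mathcal{C}} = \{\tau \in \mathcal{C} \mid \Sigma \models_f \tau\}$). □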

In order to capture the semantic notion of (finite) implication syntactically, one is interested in the notion of inference using certain inference rules.

Definition 3.3. An inference rule consists of a finite set $\mathfrak{P} = \{\varphi_1, \ldots, \varphi_n\}$ of parameterised dependencies, another non-empty, finite set $\mathfrak{C} = \{\psi_1, \ldots, \psi_m\}$ of parameterised dependencies and a finite set $Con = \{c_1, \ldots, c_k\}$ of constraints on the parameters in $\mathfrak{P}$ and $\mathfrak{C}$. The $\varphi_i$ ($i = 1, \ldots, n$) are called the premises of the rule; the $\psi_i$ ($i = 1, \ldots, m$) are called the conclusions of the rule. An inference rule with no premises ($\mathfrak{P} = \emptyset$) is called an axiom. The notation

\[ \frac{\varphi_1, \ldots, \varphi_n}{\psi_1, \ldots, \psi_m}\; c_1, \ldots, c_k \]

is used to denote inference rules. □

Given some inference rule

\[ \frac{\varphi'_1, \ldots, \varphi'_n}{\psi'_1, \ldots, \psi'_m}\; c_1, \ldots, c_k \]

the intention is to formalise derivation. Whenever we have dependencies $\varphi_1, \ldots, \varphi_n$ arising from the premises $\varphi'_1, \ldots, \varphi'_n$ by substituting the parameters, then we can derive the dependencies $\psi_1, \ldots, \psi_m$ which result from the conclusions $\psi'_1, \ldots, \psi'_m$ by the same substitution provided all the conditions $c_1, \ldots, c_k$ are satisfied. In such a case we speak of an instantiation $\frac{\varphi_1, \ldots, \varphi_n}{\psi_1, \ldots, \psi_m}$ of this inference rule.

Definition 3.4. Let $\mathfrak{R}$ be a set of inference rules and let $\Sigma$ be a set of dependencies in $\mathcal{C}$ whose elements are all defined on the nested attribute $N$. A derivation tree over $\mathfrak{R}$ and $\Sigma$ is a directed tree satisfying the following conditions:

- each node in the tree has an attached dependency in $\mathcal{C}$ that is defined on $N$;

- whenever a node with attached dependency $\psi$ has successor nodes with attached dependencies $\varphi_1, \ldots, \varphi_n$, then

  • either there exists an instantiation $\frac{\varphi_1, \ldots, \varphi_n}{\psi_1, \ldots, \psi_m}$ of a rule in $\mathfrak{R}$ such that $\psi_i = \psi$ holds for some $i$

  • or the node is a leaf and $\psi \in \Sigma$ holds. □

Examples of derivation trees can be found in the proof of Lemma 3.11 on page 58. Please note that these derivation trees really "grow" from bottom to top. Instead of drawing edges between the parent node and each of its successors, we will draw one single horizontal line between all the successor nodes and the parent node.
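As a small illustration of this drawing convention, consider a hypothetical set $\Sigma = \{A \to B,\ B \to C\}$ of relational FDs and assume that the transitivity rule familiar from the RDM belongs to the rule set. Neither the set nor the choice of rule is taken from the text above; they only serve to show the layout. The derivation tree with root $A \to C$ then has two leaves, both in $\Sigma$, drawn above a single horizontal line:

```latex
\[
\frac{A \to B \qquad B \to C}{A \to C}
\]
```

The root $A \to C$ sits below the line, and the two successor nodes $A \to B$ and $B \to C$ sit above it, so the tree is read from the leaves at the top down to the derived dependency at the bottom.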

We are now prepared to define the derivation of dependencies from a given set E of dependencies using a particular underlying set of inference rules.

Definition 3.5. Let $\mathfrak{R}$ be a set of inference rules and let $\Sigma$ be a set of dependencies in $\mathcal{C}$ whose elements are defined on the nested attribute $N$. A dependency $\tau$ is derivable from $\Sigma$ using $\mathfrak{R}$ if and only if there exists a derivation tree over $\mathfrak{R}$ and $\Sigma$ with root $\tau$ (notation: $\Sigma \vdash_{\mathfrak{R}} \tau$). The syntactic hull $\Sigma^+_{\mathfrak{R}}$ of $\Sigma$ under $\mathfrak{R}$ is the set of all dependencies that are derivable from $\Sigma$ using $\mathfrak{R}$, i.e.,

\[ \Sigma^+_{\mathfrak{R}} = \{\tau \mid \Sigma \vdash_{\mathfrak{R}} \tau\}. \]

□

One is interested in meaningful sets $\mathfrak{R}$ of inference rules for deriving dependencies. That is, every dependency that is derivable from $\Sigma$ using $\mathfrak{R}$ should also be implied by $\Sigma$. In order to capture the semantic notion of implication by the syntactical notion of inference the set $\mathfrak{R}$ must have a further property. Every dependency implied by $\Sigma$ must also be derivable from $\Sigma$ by using only inference rules from $\mathfrak{R}$.

Definition 3.6. A set $\mathfrak{R}$ of inference rules is called sound for the (finite) implication of dependencies in $\mathcal{C}$ if and only if for every nested attribute $N$ and for every set $\Sigma$ of dependencies in $\mathcal{C}$ on $N$ we have $\Sigma^+_{\mathfrak{R}} \subseteq \Sigma^*_{\mathcal{C}}$ ($\Sigma^+_{\mathfrak{R}} \subseteq \Sigma^*_{\mathrm{fin},\mathcal{C}}$). A set $\mathfrak{R}$ of inference rules is called complete for the (finite) implication of dependencies in $\mathcal{C}$ if and only if for every nested attribute $N$ and for every set $\Sigma$ of dependencies in $\mathcal{C}$ on $N$ we have $\Sigma^*_{\mathcal{C}} \subseteq \Sigma^+_{\mathfrak{R}}$ ($\Sigma^*_{\mathrm{fin},\mathcal{C}} \subseteq \Sigma^+_{\mathfrak{R}}$). The class $\mathcal{C}$ is called (finitely) axiomatisable if and only if there is a (finite) sound and complete set of inference rules for the implication of dependencies in $\mathcal{C}$. □

Sets of inference rules that are sound or complete for the implication of dependencies in $\mathcal{C}$ are sometimes called $\mathcal{C}$-sound or $\mathcal{C}$-complete, respectively. A further interesting question deals with the independence of inference rules.

Definition 3.7. Let $\mathfrak{R}$ denote some set of inference rules. An inference rule $R$ is $\mathcal{C}$-independent from $\mathfrak{R}$ if and only if there is a nested attribute $N$ and a set $\Sigma$ of dependencies in $\mathcal{C}$ on $N$ as well as some dependency $\sigma$ with $\sigma \notin \Sigma^+_{\mathfrak{R}}$ but $\sigma \in \Sigma^+_{\mathfrak{R} \cup \{R\}}$. A $\mathcal{C}$-sound and $\mathcal{C}$-complete set $\mathfrak{R}$ of inference rules is called minimal for the implication of dependencies in $\mathcal{C}$ if and only if every $R \in \mathfrak{R}$ is $\mathcal{C}$-independent from $\mathfrak{R} - \{R\}$, i.e., there is no $\mathfrak{R}' \subset \mathfrak{R}$ which is $\mathcal{C}$-complete as well. □

Strictly speaking, the notion of minimality should also take the set $Con$ of constraints of every inference rule into consideration. It may well be that all the rules are independent from one another but some constraint in $Con$ can still be weakened.

EXAMPLE 3.4. The following set of inference rules is sound and complete for the implication of FDs in the RDM.

\[ \frac{}{X \to Y}\; Y \subseteq X, \qquad \frac{X \to Y}{XW \to YV}\; V \subseteq W, \qquad \frac{X \to Y,\ Y \to Z}{X \to Z} \]

None of the rules can be omitted without losing completeness. However, the reflexivity axiom $\frac{}{X \to Y}\; Y \subseteq X$ can be replaced by the weaker axiom $\emptyset \to \emptyset$ as the second inference [...]

However, in this thesis we will focus on the notion of minimality as introduced in Definition 3.7, and study the stricter version in the future.
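The remark in Example 3.4 about weakening the reflexivity axiom can be checked with a one-step derivation. The following is a sketch of one possible reading, assuming that the second rule of Example 3.4 is instantiated with the premise $\emptyset \to \emptyset$ and the parameters $W := X$ and $V := Y$:

```latex
\[
\frac{\emptyset \to \emptyset}{\emptyset W \to \emptyset V}\; V \subseteq W
\qquad \text{with } W := X,\ V := Y
\qquad \text{yields} \qquad
X \to Y \ \text{ whenever } Y \subseteq X .
\]
```

Under this reading the constraint $V \subseteq W$ of the second rule becomes exactly the constraint $Y \subseteq X$ of the reflexivity axiom, so every instance of reflexivity is derivable from $\emptyset \to \emptyset$.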

Usually, once a class $\mathcal{C}$ has been fixed, the index $\mathcal{C}$ is dropped from $\Sigma^+_{\mathcal{C}}$, $\Sigma^*_{\mathcal{C}}$ and $\Sigma^*_{\mathrm{fin},\mathcal{C}}$. In this chapter, we will consider the class $\mathcal{C}$ of functional dependencies in the presence of null, flat, record- and list-valued attributes. The index $\mathfrak{R}$ is dropped from $\Sigma^+_{\mathfrak{R}}$ once the set of inference rules has been fixed.
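The notions of syntactic and semantic hull introduced in this section can be compared on a tiny instance. The sketch below is an illustration only and makes several assumptions that go beyond the text: it works in the RDM rather than with nested attributes, it uses a hypothetical schema $\{A, B, C\}$ with $\Sigma = \{A \to B,\ B \to C\}$, it takes the three relational rules of Example 3.4 as the rule set, and it restricts the semantic check to relations over the two-element domain $\{0, 1\}$, which is sufficient for relational FDs. All function names are hypothetical.

```python
from itertools import combinations, product

ATTRS = "ABC"  # hypothetical flat relation schema R = {A, B, C}

def subsets(attrs):
    """All subsets of attrs as frozensets."""
    return [frozenset(c) for k in range(len(attrs) + 1)
            for c in combinations(attrs, k)]

ALL = subsets(ATTRS)
ALL_FDS = {(x, y) for x in ALL for y in ALL}        # every FD X -> Y on R
EXT = [(w, v) for w in ALL for v in ALL if v <= w]  # parameter pairs with V ⊆ W

def syntactic_hull(sigma):
    """Forward-chain the three rules of Example 3.4 until nothing new is derived."""
    derived = set(sigma) | {(x, y) for (x, y) in ALL_FDS if y <= x}  # reflexivity
    while True:
        new = set()
        for x, y in derived:
            new.update((x | w, y | v) for w, v in EXT)                # extension
            new.update((x, z) for y2, z in derived if y2 == y)        # transitivity
        if new <= derived:
            return derived
        derived |= new

def satisfies(r, fd):
    """r satisfies X -> Y: tuples agreeing on X also agree on Y."""
    x, y = fd
    return all(any(t1[a] != t2[a] for a in x) or all(t1[a] == t2[a] for a in y)
               for t1, t2 in combinations(r, 2))

# Assumption: relations over the two-element domain {0, 1} are enough here.
TUPLES = [dict(zip(ATTRS, vals)) for vals in product((0, 1), repeat=len(ATTRS))]

def semantic_hull(sigma):
    """FDs satisfied by every relation over {0, 1} that satisfies sigma."""
    hull = set(ALL_FDS)
    for k in range(len(TUPLES) + 1):
        for r in combinations(TUPLES, k):
            if all(satisfies(r, fd) for fd in sigma):
                hull = {fd for fd in hull if satisfies(r, fd)}
    return hull

def fd(lhs, rhs):
    """Build the FD lhs -> rhs from two attribute strings."""
    return (frozenset(lhs), frozenset(rhs))

sigma = {fd("A", "B"), fd("B", "C")}
print(syntactic_hull(sigma) == semantic_hull(sigma))
```

On this instance the two hulls coincide, which is what soundness (syntactic hull contained in semantic hull) and completeness (the converse inclusion) together require. The brute-force enumeration is exponential and is meant purely as a sanity check on small schemas.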

3.1.3 The generalised Armstrong Axioms

Let $\Sigma$ be a set of FDs, and $\sigma$ an FD, all defined on some nested attribute $N$. Real-life databases are inherently finite. Therefore, our attention should first be directed towards the finite implication problem $\Sigma \models_f \sigma$. However, in the case of FDs the finite implication problem coincides with the unrestricted implication problem $\Sigma \models \sigma$. Interpreting $\models_{(f)}$ as relations, it is immediate that $\models \, \subseteq \, \models_f$ holds. If there is an infinite $r \subseteq \mathit{dom}(N)$ with $\models_r \Sigma$ and $\not\models_r \sigma$, i.e., $(\Sigma, \sigma) \notin \, \models$, then there are $t_1, t_2 \in r$ with $\not\models_{\{t_1,t_2\}} \sigma$. However, $\models_{\{t_1,t_2\}} \Sigma$ follows directly from $\models_r \Sigma$, and thus $(\Sigma, \sigma) \notin \, \models_f$. This shows that also $\models_f \, \subseteq \, \models$ holds, i.e., unrestricted and finite implication coincide for the class $\mathcal{C}$ of FDs. Consequently, $\Sigma^* = \Sigma^*_{\mathrm{fin}}$ for any set $\Sigma$ of FDs defined on any nested attribute. Next, we introduce extensions of Armstrong's axioms to the presence of records and lists.
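The key step of this argument, that a violated FD is always witnessed by two tuples which still satisfy $\Sigma$, can be illustrated on a finite relation. The sketch below is not part of the original text: the data, the attribute layout (person, beer, pub) and the helpers `satisfies` and `violating_pair` are hypothetical and only mirror the reasoning of the paragraph above.

```python
from itertools import combinations

def satisfies(r, proj_x, proj_y):
    """True iff any two tuples that agree under proj_x also agree under proj_y."""
    return all(proj_y(t1) == proj_y(t2)
               for t1, t2 in combinations(r, 2)
               if proj_x(t1) == proj_x(t2))

def violating_pair(r, proj_x, proj_y):
    """Return a two-element subset of r that violates X -> Y, or None if r satisfies it."""
    for t1, t2 in combinations(r, 2):
        if proj_x(t1) == proj_x(t2) and proj_y(t1) != proj_y(t2):
            return [t1, t2]
    return None

# Hypothetical flat tuples (person, beer, pub).
r = [
    ("Sven", "Kindl", "Deanos"),
    ("Sven", "Lübzer", "Deanos"),
    ("Klaus-Dieter", "Guiness", "Irish Pub"),
    ("Sebastian", "Kindl", "3Bar"),
]
person = lambda t: t[0]
beer = lambda t: t[1]
pub = lambda t: t[2]

# r satisfies Sigma = {Person -> Pub} but violates sigma = Person -> Beer.
pair = violating_pair(r, person, beer)
print(pair)
print(satisfies(pair, person, pub))   # the pair still satisfies Sigma
print(satisfies(pair, person, beer))  # and still violates sigma
```

Since satisfaction of an FD only ever compares two tuples at a time, every subset of a relation satisfying $\Sigma$ satisfies $\Sigma$ as well, which is why the two-tuple witness is itself a finite counterexample.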

Definition 3.8. The generalised Armstrong axioms for functional dependencies are

\[ \frac{}{X \to Y}\; Y \le X, \qquad \frac{X \to Y}{X \to X \sqcup Y}, \qquad \frac{X \to Y,\ Y \to Z}{X \to Z} \]

These rules are called the reflexivity axiom, the extension rule and the transitivity rule. □

In what follows $\mathfrak{R}$ denotes the generalised Armstrong axioms for FDs. Consider the following example as an illustration for some inferences of FDs.

EXAMPLE 3.5. Consider Example 3.2 again. The reflexivity axiom tells us that Factor(Integer,Prime[Number],Exponent[Number]) carries FDs such as Factor(Prime[Number]) $\to$ Factor(Prime[$\lambda$]). The extension rule allows us to infer the FD

Factor(Integer) $\to$ Factor(Integer,Prime[Number],Exponent[Number])

from

Factor(Integer) $\to$ Factor(Prime[Number],Exponent[Number]).

From

Factor(Integer) $\to$ Factor(Prime[Number])

and