Sometimes the atomic closure is large, although it is easy to see that a faithful BCNF decomposition exists.

Example 2.12. Consider the set of FDs

Σ = {A1 → B1, . . . , An → Bn, B1 . . . Bn → C}

Clearly Σ is uncritical, but Σ∗a contains 2^n + n elements. The 2^n − 1 elements of Σ∗a not already contained in Σ can be obtained by substituting any number of attributes Bi in B1 . . . Bn → C with the corresponding Ai.
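The count in Example 2.12 can be checked with a short sketch. It does not compute an atomic closure in general; it only enumerates the FDs described in the example, with attribute names such as "A1" and the function name `example_2_12_closure` chosen here purely for illustration.

```python
from itertools import product


def example_2_12_closure(n):
    """Enumerate the FDs that Example 2.12 lists as the atomic closure.

    An FD is represented as (frozenset of LHS attributes, RHS attribute).
    """
    # The n FDs Ai -> Bi.
    fds = {(frozenset({f"A{i}"}), f"B{i}") for i in range(1, n + 1)}
    # For every position i, keep Bi or substitute it with Ai.
    # The choice "all B" is the original FD B1...Bn -> C.
    for choice in product("BA", repeat=n):
        lhs = frozenset(f"{c}{i}" for i, c in enumerate(choice, start=1))
        fds.add((lhs, "C"))
    return fds


for n in range(1, 6):
    # The two printed numbers should agree: |closure| and 2^n + n.
    print(n, len(example_2_12_closure(n)), 2**n + n)
```

Even for moderate n the exponential term dominates, which is why a test that avoids enumerating all of Σ∗a is attractive.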

In such cases, it would be nice if we could quickly decide that Σ is already uncritical and not compute the atomic closure. At other times we would not really need all of Σ∗a in order to decide whether a faithful BCNF decomposition exists.

Example 2.13. Let Σ be as in Example 2.12, and Σ′ be any set of FDs on attributes not appearing in Σ. Then Σ ∪ Σ′ is critical iff Σ′ is critical, and Σ ∪ Σ′ has an uncritical cover iff Σ′ has one.

In the following we will develop some criteria that sometimes allow us to avoid computing all of Σ∗a. The basic idea is the following: If an atomic FD X → A is critical, it must contain some other atomic FD Y → B, with A ∈ Y ⊆ XA, B ∈ X. Thus there is a cycle between A and B in the sense that A is used to determine B, and vice versa. If an atomic FD participates in no such cycle, then it is never contained in another atomic FD, and thus not needed for checking criticality. We will show that it will not be needed to substitute a critical FD either, so that we can avoid computing FDs with this property entirely.

A similar approach was taken by Majster-Cederbaum in [34], where the complete absence of cycles was used as a sufficient but not necessary condition for the existence of a faithful BCNF decomposition. This condition can be checked in polynomial time.

Definition 2.28. Let Σ be a set of FDs and A, B be attributes. We say that A partially determines B, written A →p B, if Σ contains a FD X → Y with A ∈ X, B ∈ Y. We denote the “partially determines” relation w.r.t. Σ by RΣ.

As relations can be regarded as directed graphs, we use terminology from graph theory. For us a cycle is a directed path which starts and ends at the same vertex. An important point to note is that we allow vertices and arcs to appear multiple times in a cycle. The reason for using such a general notion of cycle may not be obvious, since critical atomic FDs always participate in a cycle which visits vertices and arcs only once (even stronger: a cycle of length two). However, it turns out that it is easier to check whether two vertices participate in a general cycle, because this property is transitive: If A, B lie in a cycle, and B, C lie in a different cycle, then A, C also lie in some cycle.
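As a minimal sketch of Definition 2.28 and of the general notion of cycle, the following code builds the arcs of RΣ and tests whether two attributes lie on a common cycle by checking mutual reachability. The FD set used here is hypothetical (it does not come from the text), attributes are assumed to be single letters, and all function names are illustrative.

```python
def partial_determines(fds):
    """Arcs (A, B) such that some FD X -> Y in fds has A in X and B in Y."""
    return {(a, b) for lhs, rhs in fds for a in lhs for b in rhs}


def reachable(arcs, start):
    """All vertices reachable from start via one or more arcs."""
    seen, stack = set(), [start]
    while stack:
        v = stack.pop()
        for u, w in arcs:
            if u == v and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def in_common_cycle(arcs, a, b):
    """True if a and b lie on a general cycle (vertices may repeat)."""
    return b in reachable(arcs, a) and a in reachable(arcs, b)


# Hypothetical FD set: AB -> C, C -> A, C -> D, D -> C.
sigma = [("AB", "C"), ("C", "A"), ("C", "D"), ("D", "C")]
arcs = partial_determines(sigma)

print(in_common_cycle(arcs, "A", "C"))  # cycle A -> C -> A
print(in_common_cycle(arcs, "C", "D"))  # cycle C -> D -> C
print(in_common_cycle(arcs, "A", "D"))  # A -> C -> D -> C -> A visits C twice
print(in_common_cycle(arcs, "B", "C"))  # B has no incoming arc
```

The third query shows why repeated vertices are allowed: A and D are only connected through a cycle that passes through C twice, which is exactly the transitivity argument made above.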

Lemma 2.29. Let Σ be a set of atomic FDs, and Y → B, X → A ∈ Σ be different. If YB ⊆ XA, then A and B participate in a cycle in RΣ.

Proof. Assume B ∉ X and thus B = A. Then Y ⊆ X, and since Y → B and X → A are different, Y ⊊ X. But this is not possible since Y → B and X → A are both atomic.

Assume now A ∉ Y, and thus Y ⊆ X \ B. Together with Y → B this would imply (X \ B) → A ∈ Σ∗, which is a contradiction since X → A is atomic.

We therefore have B ∈ X and A ∈ Y, and thus B →p A and A →p B, which is a cycle in RΣ.

In Lemma 2.29 we just show that A and B partially determine each other w.r.t. Σ, and it depends on the choice of cover whether A partially determines B or not, even if we only consider atomic covers.

Example 2.14. Consider the set of FDs Σ = {A → B, B → C}. Then A does not partially determine C w.r.t. Σ, but it does w.r.t. Σ∗a = Σ ∪ {A → C}.

However, the transitive closure of the “partially determines” relation →p is independent of the choice of atomic cover, as we show next.

Theorem 2.30. Let Σ be a set of atomic FDs, and let RΣ and RΣ∗a be the “partially determines” relations w.r.t. Σ and Σ∗a. Then the transitive closures R+Σ and R+Σ∗a of RΣ and RΣ∗a are identical.

Proof. By Theorem 2.5 every atomic FD in Σ∗a can be derived from Σ using the resolution rule (2.1):

X → A    AY → B
───────────────
    XY → B

Let G be the set of all FDs derivable this way. Since Σ ⊆ Σ∗a ⊆ G and thus

R+Σ ⊆ R+Σ∗a ⊆ R+G

it suffices to show that R+G is included in R+Σ. For this we only need to show that the derived FD XY → B does not add to R+G. The only partial determinations that are added by XY → B though are of the form C →p B with C ∈ XY. If C ∈ X then C →p A, A →p B ∈ RG and thus C →p B was already contained in R+G. If C ∈ Y then C →p B was already contained in RG.

Note that, since all covers of Σ have the same atomic closure, Theorem 2.30 shows that all atomic covers of Σ generate the same “partially determines” relation after taking the transitive closure. This does not hold for non-atomic covers Σ′, which can add arbitrary attributes to the LHSs of FDs and thus generate arbitrary extra arcs in RΣ′.
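Theorem 2.30 can be checked on the small instance of Example 2.14. The sketch below uses a naive fixpoint computation of the transitive closure, which is adequate only for tiny relations; attributes are assumed to be single letters and the function names are illustrative.

```python
def partial_determines(fds):
    """Arcs (A, B) such that some FD X -> Y in fds has A in X and B in Y."""
    return {(a, b) for lhs, rhs in fds for a in lhs for b in rhs}


def transitive_closure(arcs):
    """Naive fixpoint: add (a, d) whenever (a, b) and (b, d) are present."""
    closure = set(arcs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


sigma = [("A", "B"), ("B", "C")]     # Σ from Example 2.14
sigma_a = sigma + [("A", "C")]       # Σ∗a = Σ ∪ {A -> C}

r_sigma = partial_determines(sigma)
r_sigma_a = partial_determines(sigma_a)

print(r_sigma == r_sigma_a)                                          # relations differ
print(transitive_closure(r_sigma) == transitive_closure(r_sigma_a))  # closures agree
```

The relations themselves differ in the arc (A, C), but that arc is already present in the transitive closure of RΣ, as the theorem states.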

Most importantly, Theorem 2.30 assures us that, instead of testing whether an attribute participates in any cycle in RΣ∗a, we only need to test whether it participates in a cycle in RΣ. This is because taking the transitive closure does not affect whether an attribute participates in a cycle.

Definition 2.31. In the following we call attributes that participate in some cycle in RΣ cyclic. We call a FD cyclic if at least one of its LHS attributes participates in a cycle with one of its RHS attributes. An attribute or FD that is not cyclic is called acyclic.

We use the atomic closure for two purposes. The first use is to test whether a FD is critical. We have seen that we only need cyclic FDs for this task, since acyclic atomic FDs are never contained in other atomic FDs. The second use is to replace critical FDs. To avoid computing acyclic FDs, we need to show that we can replace critical FDs without them.

Definition 2.32. The cyclic closure Σ∗c of a set Σ of FDs is defined as

Σ∗c := {X → A ∈ Σ∗a | X → A is cyclic}.

Note that taking the cyclic closure is not a closure operation in the mathematical sense. We chose the name since it contains the cyclic FDs in the atomic closure, which in turn is not actually a closure either, but rather contains the atomic elements in the closure.

Lemma 2.33. Let X → A ∈ Σ∗a be critical and implied by a minimal set S ⊆ Σ∗a. Then every FD Y → B ∈ S with X∗ = Y∗ is cyclic.

Proof. Since S is minimal, B is used in deriving X → A, and thus B precedes A in R+Σ. By Lemma 2.29 every critical FD is cyclic, in particular X → A, so there exists an attribute C ∈ X which is preceded by A, and thus by B. Either C ∈ Y, or, as X → Y is LHS-minimal, C is used to derive some attribute C′ ∈ Y and thus precedes it. In the latter case B precedes C′. Since Y → B is atomic, C or C′ precedes B as well, so B participates in a cycle with an attribute of Y, i.e., Y → B is cyclic.

The above lemma does not allow us to replace critical FDs, which are implied by an uncritical subset of Σ∗a, with uncritical FDs in Σ∗c. This is indeed not always possible:

Example 2.15. Consider the set of atomic FDs

Σ = {ABC → D, D → C, ABC → F, EF → D, AB → G, G → B, G → E}

with the atomic closure

Σ∗a = Σ ∪ {ABF → C, ABF → D, ABD → F, AGC → D, AGC → F, AGD → F, AB → E, EF → C, GF → C, GF → D}

Note that Σ does not possess an uncritical cover: This is because the critical FD AB → G cannot be replaced. While this means that we cannot get a faithful BCNF decomposition, we could still try to reduce the number of schemas which violate BCNF by replacing the FD ABC → D, which is also critical.

It is easy to see that ABC →D is implied by the uncritical FDs

S = {ABC → F, AB → E, EF → D} ⊆ Σ∗a

Indeed all the FDs in S are needed, i.e., every uncritical set S′ ⊆ Σ∗a that implies ABC → D is a superset of S. However, the FD AB → E is not contained in

Σ∗c = Σ∗a \ {G → E, AB → E}

which we can find by computing the strongly connected components of RΣ:

SCC(RΣ) = {A, BG, CDF, E}

This means that ABC → D is not implied by any uncritical subset of Σ ∪ Σ∗c. However, ABC → D is implied by, and can be replaced with

{ABC → F, AB → G, G → E, EF → D} ⊂ Σ ∪ Σ∗c

in which only AB → G is critical. Since we need to keep AB → G anyway, this replacement is just as good as a replacement with uncritical FDs.
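The claims of Example 2.15 that do not depend on criticality can be re-checked mechanically. The sketch below takes the listed Σ and Σ∗a as given, computes the strongly connected components of RΣ by a simple reachability test (not a linear-time method), identifies the acyclic FDs, and verifies the two implications of ABC → D with a standard attribute-closure loop. The string encoding "XY->A" and all function names are assumptions made for illustration.

```python
SIGMA = ["ABC->D", "D->C", "ABC->F", "EF->D", "AB->G", "G->B", "G->E"]
EXTRA = ["ABF->C", "ABF->D", "ABD->F", "AGC->D", "AGC->F", "AGD->F",
         "AB->E", "EF->C", "GF->C", "GF->D"]
SIGMA_A = SIGMA + EXTRA  # atomic closure as listed in Example 2.15


def parse(fd):
    """Split 'XY->A' into (frozenset of LHS attributes, RHS attribute)."""
    lhs, rhs = fd.split("->")
    return frozenset(lhs), rhs


def reach(arcs, start):
    """Attributes reachable from start via one or more arcs."""
    seen, stack = set(), [start]
    while stack:
        for w in arcs.get(stack.pop(), ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def components(fds):
    """Map each attribute to its strongly connected component in R_fds."""
    arcs, attrs = {}, set()
    for lhs, rhs in map(parse, fds):
        attrs |= lhs | {rhs}
        for a in lhs:
            arcs.setdefault(a, set()).add(rhs)
    reachable = {a: reach(arcs, a) for a in attrs}
    return {a: frozenset({a} | {b for b in reachable[a] if a in reachable[b]})
            for a in attrs}


def implies(fds, fd):
    """True if fd follows from fds, using the attribute closure of its LHS."""
    lhs, rhs = parse(fd)
    closure, changed = set(lhs), True
    while changed:
        changed = False
        for x, a in map(parse, fds):
            if x <= closure and a not in closure:
                closure.add(a)
                changed = True
    return rhs in closure


COMP = components(SIGMA)
SCCS = set(COMP.values())
# An atomic FD is acyclic if no LHS attribute shares a component with its RHS.
ACYCLIC = [fd for fd in SIGMA_A
           if all(COMP[a] != COMP[parse(fd)[1]] for a in parse(fd)[0])]

print(sorted("".join(sorted(c)) for c in SCCS))
print(ACYCLIC)
print(implies(["ABC->F", "AB->E", "EF->D"], "ABC->D"))
print(implies(["ABC->F", "AB->G", "G->E", "EF->D"], "ABC->D"))
```

Which FDs are critical is not tested here, since that requires the definition of criticality given earlier in the text.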

We generalize this idea next.

Definition 2.34. For a set G ⊆ Σ∗a of FDs we denote by

crit(G) := {X → A ∈ G | X → A is critical}

the set of all critical FDs in G. Similarly

uncrit(G) := G \ crit(G)

Note again that critical FDs are always cyclic by Lemma 2.29.

Lemma 2.35. Let Σ be a minimal set of FDs such that Σ ⊨ X → Y. Then for all S → T ∈ Σ we have Σ ⊨ X → S.

Proof. We have Σ ⊨ X → Y iff Y ⊆ X∗, and X∗ can be constructed using only FDs S → T ∈ Σ with S ⊆ X∗. Since Σ is minimal, all FDs in Σ must be used in the construction. Clearly S ⊆ X∗ is equivalent to Σ ⊨ X → S.

Theorem 2.36. Let Σ be a set of atomic FDs and G ⊆ Σ∗a be a cover for Σ. Then there exists a cover H ⊆ Σ ∪ Σ∗c of Σ with crit(G) = crit(H).

Proof. Choose H maximal as

H := uncrit(Σ ∪ Σ∗c) ∪ crit(G)

By Lemma 2.29 all critical FDs are cyclic, and thus H ⊆ Σ ∪ Σ∗c. The equality crit(G) = crit(H) holds by definition, so we only need to show that H is a cover for Σ.

Clearly Σ ⊨ H and H ⊨ uncrit(Σ), so let X → A be a minimal (w.r.t. LHS-implication) critical FD in Σ which is not implied by H. Since G is a cover of Σ, there exists a minimal subset S ⊆ G with S ⊨ X → A. We show that H implies S.

By Lemma 2.33 every FD Y → B ∈ S is either cyclic or, since X → Y ∈ Σ∗ by Lemma 2.35, smaller than X → A w.r.t. LHS-implication. In the former case, Y → B is contained in H and thus implied. In the latter case it is implied by a minimal subset T ⊆ Σ which, again by Lemma 2.35, contains only FDs smaller than X → A. Since we chose X → A ∈ Σ minimal, all FDs in T are implied by H.

If we take the set of critical FDs as a measure of how “good” a cover is for BCNF decomposition, the above theorem assures us that we can always find an equally good cover by restricting our search to Σ ∪ Σ∗c rather than Σ∗a. In particular we get the following:

Corollary 2.37. Σ has an uncritical cover (and thus a faithful BCNF decomposition) iff it has an uncritical cover in Σ ∪ Σ∗c.

The question remaining is whether we can compute the set Σ∗c efficiently without computing all of Σ∗a. This is possible.

Lemma 2.38. Let X → A, AY → B ∈ Σ∗a. If AY → B is acyclic then so is the FD XY → B ∈ Σ∗, derived by the resolution rule (2.1):

X → A    AY → B
───────────────
    XY → B

Proof. Let XY → B be cyclic. Then B →p C lies in R+Σ for some C ∈ XY. If C ∈ Y then AY → B is clearly cyclic. If C ∈ X then C →p A and A →p B, and again AY → B is cyclic.

Since LHS-minimization does not make an acyclic FD cyclic, all FDs derived from an acyclic FD by linear resolution (with LHS-minimization) will be acyclic as well. Thus we need not use any acyclic FD as base FD and can safely discard it instead (note though that acyclic FDs in Σ are still needed as substituting FDs).
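A single resolution step and the statement of Lemma 2.38 can be illustrated on the FDs of Example 2.15. The sketch below is not the “linear resolution” algorithm itself: it applies rule (2.1) once, performs no LHS-minimization, and hard-codes the components from SCC(RΣ) = {A, BG, CDF, E}. The helper names `fd`, `resolve` and `is_cyclic` are illustrative.

```python
# Component index of each attribute, taken from SCC(R_Σ) = {A, BG, CDF, E}.
COMPONENT = {"A": 0, "B": 1, "G": 1, "C": 2, "D": 2, "F": 2, "E": 3}


def fd(lhs, rhs):
    """Build an FD as (frozenset of LHS attributes, RHS attribute)."""
    return (frozenset(lhs), rhs)


def resolve(substituting, base):
    """Rule (2.1): from X -> A and AY -> B derive XY -> B.

    Returns None if A does not occur in the LHS of the base FD.
    """
    x, a = substituting
    ay, b = base
    if a not in ay:
        return None
    return (x | (ay - {a}), b)


def is_cyclic(f):
    """Cyclic iff some LHS attribute shares a component with the RHS."""
    lhs, rhs = f
    return any(COMPONENT[a] == COMPONENT[rhs] for a in lhs)


# Acyclic base FD G -> E, substituting FD AB -> G: yields AB -> E.
derived_acyclic = resolve(fd("AB", "G"), fd("G", "E"))
print(derived_acyclic, is_cyclic(fd("G", "E")), is_cyclic(derived_acyclic))

# Cyclic base FD ABC -> D, substituting FD G -> B: yields AGC -> D.
derived_cyclic = resolve(fd("G", "B"), fd("ABC", "D"))
print(derived_cyclic, is_cyclic(fd("ABC", "D")), is_cyclic(derived_cyclic))
```

In the first case both the base FD and the derived FD are acyclic, as the lemma predicts, so a pruned computation would never need to use G → E as a base FD.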

Testing whether a FD is cyclic can be done efficiently by pre-computing the maximal strongly connected components (MSCs) of RΣ, which only requires linear time [41]. The time complexity for computing Σ∗c with an adapted “linear resolution” algorithm is thus

O(fc · k²n²)

where fc is the number of FDs in Σ∗c. Similarly, the time complexity for the “least critical cover synthesis” algorithm using Σ∗c instead of Σ∗a becomes

O(fc · k²n² + fc² · k)

In conclusion we observe that restricting our computations to cyclic FDs can lead to great speed improvements in cases where Σ∗c is much smaller than Σ∗a, while generating only a small computational overhead (for testing cyclicity) when all or most FDs in Σ∗a are cyclic.
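The cyclicity test that causes this overhead can be sketched as follows. The code uses an iterative Kosaraju-style two-pass search, which is one standard linear-time way to compute strongly connected components and is not necessarily the method of [41]. FDs are encoded as pairs of strings with single-letter attributes, and the function names are illustrative.

```python
from collections import defaultdict


def strongly_connected_components(arcs):
    """Map every vertex to a representative of its component."""
    graph, reverse, vertices = defaultdict(list), defaultdict(list), set()
    for u, v in arcs:
        graph[u].append(v)
        reverse[v].append(u)
        vertices.update((u, v))

    # Pass 1: record vertices in order of depth-first completion.
    order, visited = [], set()
    for root in vertices:
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, iter(graph[root]))]
        while stack:
            v, successors = stack[-1]
            for w in successors:
                if w not in visited:
                    visited.add(w)
                    stack.append((w, iter(graph[w])))
                    break
            else:
                order.append(v)
                stack.pop()

    # Pass 2: search the reversed graph in decreasing completion order.
    component = {}
    for root in reversed(order):
        if root in component:
            continue
        component[root] = root
        stack = [root]
        while stack:
            v = stack.pop()
            for w in reverse[v]:
                if w not in component:
                    component[w] = root
                    stack.append(w)
    return component


def is_cyclic(fd, component):
    """An atomic FD is cyclic iff an LHS attribute shares the RHS component."""
    lhs, rhs = fd
    return any(component[a] == component[rhs] for a in lhs)


# R_Σ for the FD set of Example 2.15.
sigma = [("ABC", "D"), ("D", "C"), ("ABC", "F"), ("EF", "D"),
         ("AB", "G"), ("G", "B"), ("G", "E")]
arcs = [(a, rhs) for lhs, rhs in sigma for a in lhs]
comp = strongly_connected_components(arcs)

print(len(set(comp.values())))
print([f for f in sigma if not is_cyclic(f, comp)])
```

Once the component map has been computed, each cyclicity test only costs one lookup per LHS attribute, which is why the test adds little overhead even when most FDs turn out to be cyclic.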