evitar la modificación de la información Contable.

In this section, we discuss some further connections between provenance annotations and bag semantics.

We note that by Theorem 4.5, N[X]-containment of UCQs implies bag-containment. Since

the former is decidable and the latter is not, it follows that there exist UCQs for which bag- containment holds butN[X]-containment does not. This justifies the “⇒

_´

” betweenN[X] andN in Figure4.2(d). Also, we can show that:

Proposition4.25. For containment of UCQs, we have

1. N6⇒B[X]andB[X]6⇒N 2. N6⇒Why(X)andWhy(X)6⇒N 3. N⇒

_´

Lin(X)

This justifies the “⇒

_´

” betweenNand lineage in Figure4.2(c) and shows thatNis incomparable

there withB[X]andWhy(X).

Next, the “⇔” betweenNandN[X]in Figure4.2(d) follows from the following result: Theorem4.26. For UCQsP, ¯¯ Q we haveP¯≡_NQ iff¯ P¯≡_N_[_X_] Q¯

Proof. N[X] ⇒Nfollows from Theorem4.5. We prove N⇒N[X]by contrapositive. Suppose

P6≡_N_[_X_] _{Q. Then for some}¯ _N[X]-instanceIand tuplet, we have ¯P(I)(t) =Aand ¯Q(I)(t) =Band A6=B. Since AandBare non-identical polynomials, one can always find a valuationν :X→N

such thatEvalν(A)6=Evalν(B). By Proposition2.5, we have ¯P(ν(I)(t))6=Q¯(ν(I)(t)). Since ν(I)

is anN-instance, it follows that ¯P6≡_NQ. Therefore¯ N[X]6⇒N, as required.

By Theorem4.23it follows from the above that bag equivalence of UCQs is also the same as

isomorphism. Prior to receiving the reviews of the paper upon which this chapter is based, it seemed to us that the community considers the decidability of equivalence of UCQs under bag semantics an open problem. However, one of the referees pointed out (as related work on bag-set semantics) the papers [30, 31]. [30] stated the result that bag-set equivalence of UCQs (called

disjunctive queriesthere) is the same as isomorphism, and added as an observation that this also holds for bag semantics. The outline of the proof of the bag-set semantics result is provided in [27]

and although bag semantics is not discussed further there we have observed that, in fact, results on bag-set semantics do correspond to results on bag semantics via the followingtransfer lemma:

Lemma4.27. There exists a mapping ϕ :CQ→CQ (which we extend to UCQs by applying it compo-

nentwise on CQs), a mapping f from bag instances to set instances, and a mapping g from set instances to bag instances, such that for any UCQQ, bag instance I, and set instance J, we have¯

1. JQ¯K I ₌ Jϕ_Q¯K f(I) 2. JϕQ¯K J₌ JP¯K g(J)

whereϕQ¯ is the result of ϕapplied toQ.¯

Proof. We first define ϕ: CQ→CQ, as follows. LetQbe a CQ over schemaΣ, and letΣ0 be the

schema obtained fromΣby replacing eachn-ary relational predicateRwith ann+1-ary relational predicateR0. Then ϕmaps Qto the CQ ϕQover schema Σ0 obtained from Qby replacing each join atomR(x1, . . . ,xk)with an atomR0(x1, . . . ,xk,u)whereuis a fresh variable. For example, if Qis the CQ

Q(x,y) :- R(x,y),R(y,z)

then ϕQis the CQ

ϕQ(x,y) :- R(x,y,u),R(y,z,v)

We extendϕto map a UCQ ¯Qto a UCQ ϕ_Q¯ by applying it componentwise on CQs.

Next we define an encoding f of bag-instances over schemaΣto bag-set instances over schema

• f maps a tuple t in relation R with multiplicity k to kdistinct tuples t0₁, . . . ,t0_k in R0 each obtained fromtby appending a fresh constant in the last column

• gassigns to tupletinRthe multiplicitykwherekis the number of tuplest0inR0 such that tandt0agree on all columns oft

It is straightforward to verify thatϕ, f, andgsatisfy the conditions required by the Lemma.

Lemma4.27implies that bag-containment (bag-equivalence) of CQs/UCQs is polynomial time

reducible to bag-set containment (bag-set equivalence). Moreover, for UCQs ¯P, ¯Q, the transforma- tion ϕ defined in the proof of Lemma4.27satisfies ¯P ∼= Q¯ iff _ϕ(P¯) ∼= _ϕ(Q¯). Thus Lemma 4.27

transfers to bag semantics the isomorphism results for equivalence under bag-set semantics.

Trio

For CQs,Trio(X)-containment turns out to coincide withWhy(X)-containment:

Theorem4.28. For CQs P,Q we have Pv_Trio₍_X₎Q iff Pv_Why₍_X₎Q.

Therefore, Theorem4.11(Theorem4.12) applies toTrio(X)-containment (equivalence) as well,

and we have a “⇔” betweenTrio(X)andWhy(X)in Figure4.2(a) and Figure4.2(b).

To establish the decidability ofTrio(X)-equivalence of UCQs, we note thatTrio(X)contains an embedded copy ofN, henceTrio(X)⇒Nfor UCQs. Combined with Theorem4.26this implies: Theorem4.29. For UCQsP, ¯¯ Q we haveP¯≡_Trio₍_X₎Q iff¯ P¯∼=Q.¯

This justifies the “⇔” betweenTrio(X)andNin Figure4.2(d).

To establish decidability in pspace of Trio(X)-containment of UCQs, we follow a similar ar-

gument as used forN[X]in Section4.4. However, this time the argument is complicated by the

fact that unlikeN[X],Trio(X)cannot easily “simulate its own computations” by factoring them through calculations involving abstractly-tagged databases. As a simple example, consider the CQs

Q1(x,y):-R(x,z),R(z,y) Q2(x,y):-R(x,y)

and consider the followingTrio(X)-relationRand its abstractly-tagged version:

R def= a a 2 b c pq ab_Trio₍_X₎(R) = a a r b c s Note that JQ1K R₍_{a, a}_{) =} _{4 while} JQ2K

R₍_{a, a}_{) =} _{2. On the other hand,}

JQ1K

abTrio(X)(R)₍_{a, a}_{) =}

JQ2K

abTrio(X)(R)₍_{a, a}_{) =}_p_{(recall that}_Trio₍_X₎_{“drops exponents” in its computations). Hence we do}

To overcome this limitation, we introduce variation of abstractly-tagged instances called k- abstractly tagged instances. The tags in these instances aresumsofkfresh variables fromX. As we shall see, these turn out to suffice for Trio(X) to “simulate its own computations” for UCQs of degree≤k.

To make this precise, we order the variables ofX = {x1,x2, . . .}, and we define the set Sk ⊆

Trio(X)of sums ofkfresh variables fromX:

def

= {(x1+· · ·+xk),(xk+1+· · ·+x2k), . . .}

The k-abstractly tagged version ab_k(I) of a Trio(X)-instance I is the Trio(X)-instance obtained by replacing each tuple’s annotation inIwith a fresh element ofSk. For example, considering again theTrio(X)-relationRfrom earlier, we have

abk(R) =

a a r+s b c u+v

Now, we defineTriok(X)⊆Trio(X)to be the set of annotations fromTrio(X)obtained via calculations of degree≤kinvolving elements ofSk:

Trio_k(X)def= {Eval_ν(a):a∈N[X],ν:X→Sks.t.ahas degree≤kandνis injective} Proposition4.30. Suppose valuationν :X →Sk is injective. Then for any a,b∈ N[X]of degree ≤k, a=b iffEval_ν(a) = Eval_ν(b). As a consequence, any element c∈ Trio_k(X)decomposes uniquely into a sum of products of elements of Sk.

To illustrate, consider the previous example, and note thatJQ1K

abk(R)₍_{a, a}_{) =}_r₊_2rs₊_s_which

decomposes uniquely to the expression (r+s)(r+s) involving only tags fromabk(R). On the other hand,JQ2K

abk(R)₍_{a, a}_{) =}_r₊_s_{and now we have enough information to recover the results}

ofJQ1K

R_and

JQ2K

R_{(by replacing}_r₊_s_{with 2 in the calculations).} We are now ready to state the analog of Lemma4.20forTrio(X):

Lemma 4.31. Suppose P, ¯¯ Q ∈ UCQ have degree ≤ k, and suppose I is a Trio(X)-instance such that

JP¯K I _6≤ Trio(X)JQ¯K I_{. Then} JP¯K ab_k(I)_6≤ Trio(X)JQ¯K ab_k(I)_.

Proof. (Sketch) Let I0 = abk(I)and letν : Sk →Trio(X) be the valuation which records how the annotations from Iwere replaced by elements ofSk. (Thusν(I0) = I.) We claim thatνextends to

a mappinghν :Triok(X) →Trio(X) that behaves like a semiring homomorphism, so long as the computations stay withinTrio_k(X):

1. h

2. for alla,b∈Trio_k(X),h

ν(a+b) =hν(a) +hν(b)

3. for alla,b∈Trio_k(X), ifa·b∈Trio_k(X)thenh

ν(a·b) =hν(a)·hν(b)

Using Lemma4.31, we can show thath

νexists and is uniquely determined by the above properties.

Since ¯P, ¯Q have degree ≤ k, it is not hard to see that all annotations in JP¯K

I0 _and

JQ¯K

I0 _{are in}

Triok(X). Also, it is not hard to show thathν enjoys two relevant properties of proper semiring

homomorphisms with respect to query evaluation:

• hv is compatible with the natural order: ∀a,b∈ Triok(X), if a≤Trio(X) bthen hν(a) ≤Trio(X) hν(b)

• Query evaluation commutes withhν:hν(JP¯K

I0_{) =} JP¯K hν(I0)= JP¯K I_and_h ν(JQ¯K I0_{) =} JQ¯K h(I0)₌ JQ¯K I

As a consequence, we have thatJP¯K

I0_≤ Trio(X)JQ¯K I0_implies JP¯K I_≤ Trio(X)JQ¯K I_.

Thus, we have established a bound on the size of annotations of the counterexamples that need to be considered. The last step is to additionally bound the number of tuples and establish our “small counterexample property”:

Theorem4.32. For UCQsP, ¯¯ Q of degree≤k, ifP¯6v_Trio₍_X₎_{Q, then}¯ _J_P¯_KI_6≤

Trio(X)JQ¯K

I_{for a k-abstractly} taggedTrio(X)-instance I containing at most`tuples, where`is the maximum number of atoms occurring in a CQ inQ.¯

Proof. (Sketch) Very similar to the proof for Theorem4.21.

Corollary4.33. Trio(X)-containment of UCQs is decidable inptime.

Finally, we note that one can find examples of UCQs showing that N[X]

´

⇒ Trio(X) and Trio(X)

´

⇒N, as indicated in Figure4.2(d).

4.5 Datalog

We have seen that for UCQs, containment under various provenance semantics turns out to be decidable, even for models such asN[X]andTrio(X)which are closely related to bag semantics. Is it possible that these decidability results extend to Datalog? We leave this question open for most of the provenance models considered in this dissertation, but give two results in this section suggesting that the answer is likely negative, even for the easier question of the two, equivalence.

Under set semantics, Shmueli [128] has shown that equivalence (and therefore also contain-

ment) of Datalog queries is undecidable. This result immediately extends toK-containment and K-equivalence of Datalog programs forKa distributive lattice by dint of the following observation:

Theorem 4.34. For any distributive lattice K and Datalog programs Q1,Q2, we have Q1 vK Q2 iff

Q1vBQ2.

This generalizes an earlier result [56] (later rediscovered in [62]) from UCQs to Datalog pro-

grams. Since PosBool(X) is a distributive lattice, this settles the question negatively for that provenance model.

For the other provenance models, it is possible that undecidability of Datalog-equivalence may be established by reduction from the same problem for set semantics, along the lines of the following:

Theorem4.35. N∞-equivalence of Datalog programs is undecidable.

Proof. By reduction from the same problem for set semantics. IfQis a Datalog program, letQ0be the Datalog obtained fromQby renaming the output predicate toQ0, and adding one more rule:

Q0(x1, . . . ,xn):-Q0(x1, . . . ,xn)

where nis the arity of the output predicate. This rule effectively “amplifies” the multiplicity of any output tuple in the query result to∞. Now given Datalog programsQ1,Q2, letQ01 and Q20

be the corresponding Datalog programs derived from them as above. We claim thatQ1≡BQ2iff

Q1≡N∞ Q2. To see this, consider an arbitrary inputN∞-instance I, and let J be the set instance

corresponding to its support. For any output tuplet, observe that

JQiK J₍_t_{) =}₀ _⇐⇒ JQ 0 iK I₍_t_{) =}₀ JQiK J₍_t_{) =}₁ _⇐⇒ JQ 0 iK I₍_t_{) =}_∞

Since any set instance J corresponds to the support of someN∞-instance I, this establishes the claim.

Appendix

Definition4.36(Semiring homomorphism). Let K1,K2be semirings. A mappingh : K1 →K2is

called asemiring homomorphism if h(0) = 0, h(1) = 1, and for all a,b ∈ K1, we have h(a+b) =

Proposition4.37. Let K1,K2be naturally-ordered commutative semirings. If h:K1→K2is a semiring

homomorphism then for all a,b ∈ K1, a≤K1 b =⇒ h(a) ≤K2 h(b). If h is also surjective, then for all

a,b∈K1, a≤K1 b ⇐⇒ h(a)≤K2 h(b).

Proof. Straightforward calculation.

Given a semiringKdefine † :K→Bas follows: †(0) def= false

†(a) def= truewhena6=0

Proposition4.38. The following are equivalent:

1. †is a semiring homomorphism

2. K satisfies

a) 06=1

b) a+b=0implies a=0or b=0 c) ab=0implies a=0or b=0

A semiring K is calledpositive if it satisfies either of the (equivalent) statements in Proposi- tion4.38.

Definition4.39(Congruence relation). If Kis a semiring and≈is an equivalence relation on K, then we say that≈ is acongruence relationonK if a≈ a0 and b ≈ b0 impliesa+b ≈ a0+b0 and a·b≈a0·b0.

Definition4.40(Quotient semiring). LetK be a semiring and let≈be a congruence relation on K. If a ∈ K then denote the equivalence class ofa in≈ by a/≈. Then the quotient of K by ≈is the semiring whose domain is the set K/≈ of equivalence classes of≈, 0def= 0K/≈, 1

def

= 1K/≈,

Chapter

5

In document Elaboración del manual de politicas contables NIIF para Pymes del Hotel La Bohemia Ltda (página 68-78)