2.6. Summary
In order to present and prove correct the STAP provenance model, this chapter formally defines the system model that the provenance system will be targeting and deployed on. We model general distributed systems as state transition systems, in which the execution logic (or behavior) of each individual node is captured by a set of derivation rules. Based on the semantics of the defined state transition systems, we further present the definition of execution traces, which capture the dynamism of distributed system executions. In the subsequent section, we define the formal provenance model and show its close connection to the trace representation of system executions.
Chapter 3
Provenance Model
We use the system model described in the previous chapter as a basis for formalizing STAP. In this chapter, we present a (slightly) simplified version of STAP, called Time-aware Provenance (TAP), that assumes a trusted environment. We defer the discussion of STAP’s security enhancement for untrusted environments to Chapter 6.
Given a distributed system, TAP is used to provide an explanation as to why a given tuple τ or update event is located on node Ni at time t. Tuple τ can be viewed as a materialization point that applies a sequence of the update events on τ. Intuitively, the answer for a provenance query on the existence of τ on node Ni at time t can be formulated as a sequence of query results for the update events (up to time t) on τ. Hence, we focus our discussion on the provenance of update events.
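The relationship between a tuple and its update events can be illustrated with a minimal sketch. It assumes a simple event log and treats a tuple as present when its insertions outnumber its deletions; the `Update` record, the function names, and the example log are hypothetical and are not part of TAP itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    time: int   # logical timestamp of the update event
    node: str   # node on which the update occurred
    tup: str    # the tuple being updated
    sign: int   # +1 for an insertion, -1 for a deletion


def updates_up_to(log, node, tup, t):
    """Return the update events on `tup` at `node` with time <= t, in time order."""
    matching = (u for u in log if u.node == node and u.tup == tup and u.time <= t)
    return sorted(matching, key=lambda u: u.time)


def exists_at(log, node, tup, t):
    """Materialize the tuple (assumption: present iff net update count is positive)."""
    return sum(u.sign for u in updates_up_to(log, node, tup, t)) > 0


# Hypothetical log: tau is inserted on n1 at time 1 and deleted at time 3.
log = [
    Update(3, "n1", "tau", -1),
    Update(1, "n1", "tau", +1),
    Update(2, "n2", "tau", +1),
]
history = updates_up_to(log, "n1", "tau", 2)
```

Under these assumptions, a query about the existence of `tau` on `n1` at time 2 reduces to the single insertion event in `history`, which is why the remainder of the discussion concentrates on update events.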
3.1 TAP Provenance Model
TAP encodes the provenance for a trace E in a graph G(E) = (V, E) in which each vertex v ∈ V represents an event in E, and each edge (v1, v2) ∈ E represents a direct dependency between two such events. TAP’s provenance graph can contain the following six types of vertices:
• INSERT(t, n, τ) and DELETE(t, n, τ): Tuple τ was inserted (deleted) on node n at time t.
• DERIVE(t, n, R, τ) and UNDERIVE(t, n, R, τ): Tuple τ was derived (underived) via rule R on node n at time t.
• SEND(t, n, Δτ, n′) and RECEIVE(t, n′, Δτ, n): An update Δτ was sent (received) on node n at time t to (from) node n′.
The last two vertices are needed because a derivation on one node can involve tuples on another; the corresponding messages are represented explicitly in G. The vertices are generated and connected according to the following rules:
• When a base tuple is inserted, an INSERT vertex is added.
• If a node Ni derives a tuple τ via rule r, a DERIVE vertex is added, which has incoming edges from all of r’s preconditions, as well as from the triggering event, i.e., the INSERT that caused r to fire. The DERIVE vertex is then connected to a new INSERT vertex (if τ is local to Ni) or a new SEND vertex (if τ is sent to another node).
• When a message is received from another node, a RECEIVE vertex is added, with an incoming edge from the corresponding SEND vertex. This vertex is then connected to a new INSERT vertex.
• Whenever an INSERT vertex is added for a tuple τ that already has at least one derivation, an incoming edge is added to τ’s most recent INSERT vertex (recall that tuples can have more than one derivation).
• When a tuple τ1 replaces another tuple τ2 due to a primary-key or aggregation constraint, an update edge is added from τ1’s INSERT vertex to τ2’s DELETE vertex.
The guidelines for deletions and underivations are analogous. Note that the graph is acyclic because edges are always added between an existing vertex and a new vertex, but never between two existing vertices. It is also monotonic because, as the execution continues, new vertices and edges are added but never removed.
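The first three construction rules can be sketched as an append-only graph. This is an illustrative sketch only: the `Vertex` and `ProvenanceGraph` classes, the helper names, and the example tuples are hypothetical, and the rules for repeated derivations, update edges, deletions, and underivations are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertex:
    kind: str        # INSERT, DELETE, DERIVE, UNDERIVE, SEND, or RECEIVE
    time: int
    node: str
    tup: str
    rule: str = ""   # set for DERIVE / UNDERIVE only
    peer: str = ""   # set for SEND / RECEIVE only


class ProvenanceGraph:
    def __init__(self):
        # vertex -> list of vertices with an edge into it; never shrinks
        self.preds = {}

    def add(self, vertex, causes=()):
        """Add a new vertex whose incoming edges come from existing vertices."""
        if vertex in self.preds:
            raise ValueError(f"vertex already present: {vertex}")
        missing = [c for c in causes if c not in self.preds]
        if missing:
            raise KeyError(f"unknown cause(s): {missing}")
        self.preds[vertex] = list(causes)
        return vertex


def insert_base(g, t, n, tup):
    """Rule 1: a base tuple insertion adds an INSERT vertex."""
    return g.add(Vertex("INSERT", t, n, tup))


def derive(g, t, n, rule, tup, trigger, preconditions, dest=None, t_recv=None):
    """Rules 2 and 3: DERIVE, then INSERT locally or SEND/RECEIVE/INSERT remotely."""
    d = g.add(Vertex("DERIVE", t, n, tup, rule=rule), [trigger, *preconditions])
    if dest is None or dest == n:
        return g.add(Vertex("INSERT", t, n, tup), [d])
    s = g.add(Vertex("SEND", t, n, tup, peer=dest), [d])
    r = g.add(Vertex("RECEIVE", t_recv, dest, tup, peer=n), [s])
    return g.add(Vertex("INSERT", t_recv, dest, tup), [r])


# Hypothetical run: rule r1 on n1 derives r(@n2) from p and q, triggered by q.
g = ProvenanceGraph()
p = insert_base(g, 0, "n1", "p(@n1)")
q = insert_base(g, 1, "n1", "q(@n1)")
r_at_n2 = derive(g, 1, "n1", "r1", "r(@n2)", trigger=q, preconditions=[p],
                 dest="n2", t_recv=2)
```

Because `add` only accepts edges from vertices that are already present into a vertex that is new, every edge in this sketch points from an older vertex to a newer one, which mirrors the acyclicity argument; the absence of any removal method mirrors monotonicity.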
Given the instantiated provenance graph G(E), the provenance G(Δτ, E) of an update event Δτ on node Ni at time t is simply the subtree of G(E) that is rooted at the corresponding INSERT(t, Ni, τ) (or DELETE(t, Ni, τ)) vertex.
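A query of this kind can be sketched as a backward traversal over incoming edges. The sketch assumes the graph is stored as a map from each vertex to its predecessors and that vertices are plain tuples of the form (kind, time, node, tuple); both the representation and the example graph are hypothetical.

```python
def provenance(preds, root):
    """Collect every vertex reachable from `root` by following incoming edges backwards."""
    seen, stack = set(), [root]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(preds.get(v, ()))
    return seen


# Hypothetical graph: r is derived on n1 from base tuples p and q; s is unrelated.
base_p = ("INSERT", 0, "n1", "p")
base_q = ("INSERT", 1, "n1", "q")
derive_r = ("DERIVE", 1, "n1", "r")
insert_r = ("INSERT", 1, "n1", "r")
insert_s = ("INSERT", 2, "n1", "s")
preds = {
    base_p: [],
    base_q: [],
    derive_r: [base_q, base_p],
    insert_r: [derive_r],
    insert_s: [],
}
result = provenance(preds, insert_r)
```

Here `result` contains the INSERT vertex for r, its DERIVE vertex, and both base insertions, while the unrelated insertion of s is excluded.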
When G(Δτ, E) includes a DERIVE vertex for some rule α :- α1, α2, ..., αk, it includes the provenance of each αi, not just the provenance of the tuple (say, α1) that triggered the rule. This is helpful when the provenance is used to explain the existence of τ, since α is a (direct or indirect) precondition for τ and each αi is equally responsible for α’s existence. However, when provenance is used to explain a state change, i.e., the appearance or disappearance of τ, only the provenance of the triggering tuple (here α1) is relevant; the others merely clutter the graph. Because of this, TAP can optionally replace each subtree for a non-triggering αi with a single EXIST vertex and a snapshot summarizing the current state at a particular node, as it was computed by applying all events up to the current time. State snapshots are discussed in more detail in Section 3.2.
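The optional pruning can be sketched as follows. The sketch assumes the same predecessor-map representation with vertices as (kind, time, node, tuple), plus a separate map recording which predecessor triggered each DERIVE vertex. The field layout chosen for the EXIST vertex is hypothetical, and the accompanying state snapshot is not modeled.

```python
def prune_for_state_change(preds, trigger, root):
    """Return a predecessor map for root's provenance in which each
    non-triggering precondition of a DERIVE vertex becomes an EXIST leaf."""
    pruned = {}
    stack = [root]
    while stack:
        v = stack.pop()
        if v in pruned:
            continue
        kept = []
        for p in preds.get(v, ()):
            if v in trigger and p != trigger[v]:
                # Hypothetical layout: tuple p[3] exists on node p[2] at v's time.
                exist = ("EXIST", v[1], p[2], p[3])
                pruned[exist] = []
                kept.append(exist)
            else:
                kept.append(p)
                stack.append(p)
        pruned[v] = kept
    return pruned


# Hypothetical graph: w is derived from z (the trigger) and y; y is derived from x.
x = ("INSERT", 1, "n", "x")
dy = ("DERIVE", 2, "n", "y")
y = ("INSERT", 2, "n", "y")
z = ("INSERT", 3, "n", "z")
dw = ("DERIVE", 3, "n", "w")
w = ("INSERT", 3, "n", "w")
preds = {x: [], dy: [x], y: [dy], z: [], dw: [z, y], w: [dw]}
trigger = {dy: x, dw: z}
pruned = prune_for_state_change(preds, trigger, w)
```

In this example the whole subtree explaining y (its DERIVE vertex and the insertion of x) disappears from `pruned` and is replaced by one EXIST leaf, while the path through the triggering tuple z is kept intact.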
Example: MINCOST Routing. Let us revisit our running example from the previous sections. Figure 3.1 shows a piece of the TAP graph that explains the deletion of the tuple mincost(@c,a,5) on node c at time t3 that resulted from the new link a-c that was inserted at time t0. Specifically, the edge at the DELETE vertex of mincost(@c,a,5) (indicated by a dotted line) corresponds to an aggregation constraint; that is, the minimal cost changed because a lower-cost path to node a became available. The updated lowest cost (cost(@c,a,4)) was derived on node b at time t2 (and subsequently sent to node c) because a) a link b-c with cost
3.2. Derivations and System Snapshots