We present below a suite of algorithms, A, for fast read-only transactions. To comply with Theorem 7, we restrict all transactions to be read-only and require updates to take place outside transactions (or, equivalently, to be treated as single-object write transactions). The goal of A is to shed light on Theorem 8, which shows that fast read-only transactions are visible. The intuition behind Theorem 8 is that after a fast read-only transaction T, servers may need to communicate information about T among themselves. However, it is not clear when such communication occurs. The COPS-SNOW [44] algorithm shows that the communication can take place during a client's write request. A shows that the communication can in fact take place outside any client write request, and asynchronously. Unlike COPS-SNOW, where a written value is visible immediately after the write, A guarantees only eventual visibility.
Algorithm 8 Client-side read/write algorithms
local variables
    lc, logical clock
    ctx, context
end local variables
function WRITE(obj, val)
    Identify server S by obj
    ctx_S, lc_S ← S.write(lc, ctx, obj, val)
    update_lc(lc_S)
    update_context(obj, lc_S, ctx_S)
    return OK
end function
function READ(objs)
    txID ← generate_txID()
    fixedCtx ← ctx
    for obj in objs do
        val, ver, ctx_S, lc_S ← S.read(lc, fixedCtx, obj, txID)
        save val to vals
        update_lc(lc_S)
        update_context(obj, ver, ctx_S)
    end for
    return vals
end function

Protocol
We first describe the data structures that each process maintains. Every process maintains a local logical timestamp and advances it whenever it finds that its local timestamp lags behind; it also moves its timestamp forward whenever it communicates with another process. (In A this is the function call update_lc, whose details are omitted for simplicity of presentation.) Every client additionally maintains the causal dependencies of its current transaction (i.e., the transactions that causally precede the current one). Causal dependencies can be maintained in the same way as in COPS [35] and COPS-SNOW [44]. (A maintains causal dependencies in the variable ctx through the function calls update_context and ctx.update. The details are the same as in COPS [35] and COPS-SNOW [44] and are thus omitted.) Every client can generate transaction identifiers (through the function call generate_txID in A). Every server stores the causal dependencies that a client passes as an argument during its write. Every server additionally maintains a data structure called oldTx for each object stored.

Algorithm 9 Server-side read/write algorithms
1: local variables
2:     lc, logical clock
3:     vis, visible versions in tuples <obj, ver>
4:     oldTx and currTx, storage of tuples <obj, ver, ctx, txID> for each object
5: end local variables
6: function WRITE(lc_C, ctx_C, obj, val)
7:     update_lc(lc_C)
8:     ctx ← the context of obj with the highest version in the storage
9:     ctx.update(ctx_C)
10:     update_storage(obj, val, lc, ctx)
11:     return ctx, lc
12: end function
13: function READ(lc_C, ctx_C, obj, txID)
14:     update_lc(lc_C)
15:     if txID ∈ oldTx then
16:         ver ← the version identified by txID in oldTx
17:     else
18:         v_vis ← the highest version of obj in vis
19:         if <obj, v> is in ctx_C and v > v_vis then
20:             ver ← v
21:         else
22:             ver ← v_vis
23:         end if
24:     end if
25:     save <obj, ver, ctx_C, txID> to currTx
26:     val ← the value identified by ver of obj in the storage
27:     ctx ← the context identified by ver of obj in the storage
28:     return val, ver, ctx, lc
29: end function
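The text leaves update_lc unspecified. As a minimal sketch, assuming a standard Lamport-style clock (the class name LogicalClock and the "max plus one" rule are assumptions, not taken from the source), the timestamp maintenance described above could look as follows:

```python
class LogicalClock:
    """Lamport-style logical clock; one assumed realization of update_lc."""

    def __init__(self) -> None:
        self.lc = 0

    def update_lc(self, remote_lc: int) -> int:
        # Catch up if the local clock lags behind the remote one, then
        # advance by one so the communication itself gets a fresh timestamp.
        self.lc = max(self.lc, remote_lc) + 1
        return self.lc


# A client that lags behind a server catches up on contact.
client, server = LogicalClock(), LogicalClock()
server.lc = 5
client.update_lc(server.lc)
```

Under this rule, a server that calls update_lc on every write request obtains a clock value strictly greater than any value it used before, which is the property the version numbers rely on.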
We next sketch how writes and read-only transactions are handled. The full algorithms are shown in Algorithm 8 and Algorithm 9.
• Every client sends its logical timestamp as well as its causal dependencies when requesting a write of object obj. The server uses its updated logical timestamp as the version ver of the value val written, stores the version and the value along with the causal dependencies ctx (through the function call update_storage(obj, val, ver, ctx) in Algorithm 9), and returns the version number to the client.
• Every client C sends its logical timestamp when requesting a read-only transaction tx. A server first searches for tx in oldTx and, if tx ∈ oldTx, returns the pre-computed value recorded in the entry for tx. Otherwise, the server returns some value previously observed by C or some value marked as “visible”.

Algorithm 10 Server-side asynchronous check
1: local variables
2:     Same as in Algorithm 9
3: end local variables
4: when all versions of obj below ver are in vis, invoke async_check
5: procedure ASYNC_CHECK(obj, ver)
6:     identify ctx by obj, ver in the storage
7:     for obj_d, ver_d in ctx do
8:         identify server D by obj_d
9:         oldTx_D, lc_D ← D.async_checkVis(obj_d, ver_d, lc)
10:         update_lc(lc_D)
11:         save oldTx_D to oldTx as follows:
12:         for txID in oldTx_D do
13:             if txID ∉ oldTx then
14:                 get tuple <obj_d, *, ctx_d, txID> from oldTx_D
15:                 identify version v_prev as the highest version below ver of obj in the storage
16:                 if <obj, v> is in ctx_d and v > v_prev then
17:                     save tuple <obj, v, −, txID> into oldTx
18:                 else
19:                     save tuple <obj, v_prev, −, txID> into oldTx
20:                 end if
21:             end if
22:         end for
23:     end for
24:     for txID in currTx do
25:         if <obj, v, *, txID> is in currTx and v < ver then
26:             move the tuple identified by txID from currTx to oldTx
27:         end if
28:     end for
29:     save <obj, ver> into vis
30: end procedure
31: function ASYNC_CHECKVIS(obj_d, ver_d, lc_S)
32:     update_lc(lc_S)
33:     when <obj_d, ver_d> is in vis, return oldTx, lc
34: end function
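The version choice a server makes for one read can be sketched in Python. This is a simplified illustration of the three branches of the server-side read, not the authors' implementation: the function name choose_version and the flat dictionaries (old_tx mapping a transaction identifier to its pre-computed version for this object, vis mapping an object to its highest visible version, ctx_c mapping an object to the version the client depends on) are assumptions made for brevity.

```python
from typing import Dict


def choose_version(obj: str,
                   tx_id: str,
                   ctx_c: Dict[str, int],
                   old_tx: Dict[str, int],
                   vis: Dict[str, int]) -> int:
    """Pick the version a server returns for one read of obj."""
    # Case 1: the transaction is already known; use the pre-computed version.
    if tx_id in old_tx:
        return old_tx[tx_id]
    v_vis = vis[obj]            # highest version of obj marked visible
    v = ctx_c.get(obj)          # version the client already observed, if any
    # Case 2: the client depends on a version newer than any visible one.
    if v is not None and v > v_vis:
        return v
    # Case 3: fall back to the highest visible version.
    return v_vis
```

For instance, with vis = {"o": 5}, a client whose context holds version 7 of o is served version 7, a client whose context holds version 3 is served version 5, and a transaction already recorded in old_tx is served its recorded version regardless of the context.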
We now sketch how oldTx is maintained and communicated (during asynchronous propagation). The full algorithm is shown in Algorithm 10.
• After a server S responds to a client's write request of value w for some object o, S sends a request to every server that stores some value v such that the write of v causally precedes the write of w. Any server responds to such a request with its local oldTx once v is marked as “visible”.
• After S receives a response from every server that stores some value that causally precedes w, S merges their oldTx sets into its local one, chooses a value w* which is written before w⁹
Any read-only transaction is stored and marked as “current” at a server during its execution there. A “current” transaction T is moved into oldTx when some value w becomes “visible” and T has returned a value of the same object written before w.
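The last step, moving “current” transactions into oldTx once a newer version becomes visible, can be sketched as follows. The function name mark_visible and the per-object dictionaries (transaction identifier mapped to the version it was served) are illustrative assumptions; the sketch covers only this local bookkeeping, not the exchange between servers.

```python
from typing import Dict, Set, Tuple


def mark_visible(obj: str, ver: int,
                 curr_tx: Dict[str, int],
                 old_tx: Dict[str, int],
                 vis: Set[Tuple[str, int]]) -> None:
    """Mark <obj, ver> visible and retire transactions that read older versions."""
    for tx_id in list(curr_tx):
        # A transaction that was served a version below ver must keep
        # seeing that older version, so it is recorded in old_tx.
        if curr_tx[tx_id] < ver:
            old_tx[tx_id] = curr_tx.pop(tx_id)
    vis.add((obj, ver))


curr_tx = {"T1": 3, "T2": 5}
old_tx: Dict[str, int] = {}
vis: Set[Tuple[str, int]] = set()
mark_visible("o", 5, curr_tx, old_tx, vis)
```

Here T1 read version 3, which is older than the newly visible version 5, so it moves to old_tx; T2 read version 5 itself and stays current.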
Proof of correctness
Our suite of algorithms A above provides fast read-only transactions. As every message eventually arrives at its destination (and therefore asynchronous propagation eventually ends), A satisfies progress. As asynchronous propagation carries transaction identifiers, A is visible. In what follows, we show that A satisfies causal consistency.
In Algorithm 9, when a server stores a value, the server chooses a version number strictly greater than those of all values of the same object previously written. Therefore, in addition to the causality relation, we also enforce an ordering on all writes of the same object by their version numbers. In what follows, we say that w_1 → w_2 for two writes w_1 and w_2 if (1) w_1 has a lower version number than w_2 and w_1, w_2 write the same object; or (2) w_1 causally precedes w_2; or (3) there exists some write w_3 such that w_1 → w_3 and w_3 → w_2. We first show a property of any read-only transaction in Lemma 9. We then prove the correctness of A based on Lemma 9.
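The relation → is the transitive closure of two kinds of direct edges: version order on the same object and causality. A small reachability check makes the three clauses concrete. This is a sketch under assumptions: writes are modeled as (object, version) pairs, the set causal of direct causality edges is given as input, and the function name precedes is hypothetical.

```python
from typing import List, Set, Tuple

Write = Tuple[str, int]  # (object, version)


def precedes(w1: Write, w2: Write,
             writes: List[Write],
             causal: Set[Tuple[Write, Write]]) -> bool:
    """Return True if w1 -> w2 under clauses (1)-(3)."""

    def direct(a: Write, b: Write) -> bool:
        same_object_older = a[0] == b[0] and a[1] < b[1]   # clause (1)
        return same_object_older or (a, b) in causal       # clause (2)

    # Clause (3): depth-first search over chains of direct edges.
    stack, seen = [w1], {w1}
    while stack:
        cur = stack.pop()
        for nxt in writes:
            if nxt not in seen and direct(cur, nxt):
                if nxt == w2:
                    return True
                seen.add(nxt)
                stack.append(nxt)
    return False


writes = [("a", 1), ("a", 2), ("o", 3)]
causal = {(("a", 2), ("o", 3))}
```

With these inputs, ("a", 1) → ("o", 3) holds through the chain ("a", 1) → ("a", 2) → ("o", 3), while ("o", 3) → ("a", 1) does not.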
Lemma 9 (A correct snapshot for visible fast read-only transactions). Let T be any transaction that contains at least two reads. Given any two reads r(a)u, r(o)v* ∈ R_T, if there exists a write w(a)u* such that w(a)u has a lower version number than w(a)u*, then w(a)u* → w(o)v* does not hold.
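Read as a predicate on the values returned by T, the lemma can be checked mechanically. The sketch below assumes that writes are (object, version) pairs, that reads maps each object read by T to the version returned, and that arrow already contains every pair in the relation → (transitively closed); the function name violates_lemma9 is hypothetical.

```python
from typing import Dict, List, Set, Tuple

Write = Tuple[str, int]  # (object, version)


def violates_lemma9(reads: Dict[str, int],
                    writes: List[Write],
                    arrow: Set[Tuple[Write, Write]]) -> bool:
    """True if T read u of a and v* of o although a newer u* of a precedes v*."""
    for a, u in reads.items():
        for o, v_star in reads.items():
            if o == a:
                continue
            for obj, ver in writes:
                newer_write_of_a = obj == a and ver > u
                if newer_write_of_a and ((obj, ver), (o, v_star)) in arrow:
                    return True
    return False


writes = [("a", 1), ("a", 2), ("o", 3)]
arrow = {(("a", 1), ("a", 2)), (("a", 2), ("o", 3)), (("a", 1), ("o", 3))}
```

A transaction that returns version 1 of a together with version 3 of o violates the property, because version 2 of a precedes version 3 of o; returning version 2 of a instead does not.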
Proof of Lemma 9. By contradiction. Suppose that r(a)u, r(o)v* ∈ R_T and w(a)u* → w(o)v* holds. According to Algorithm 9, there are three possibilities when the server P_o that stores object o returns val = v* for txID = T:
1. txID ∈ oldTx;
2. txID ∉ oldTx, but for object o, ctx_C specifies a version v higher than the highest version v_vis in vis of the same object;
3. txID ∉ oldTx, and for object o, ctx_C does not specify a version or any specified version v is lower than v_vis.
Let us examine each possibility. First, we look at the second possibility. Then <o, v> ∈ ctx_C, v corresponds to val = v* at P_o, and v > v_vis. The variable ctx maintains the precedences of a transaction (a single-object write transaction or a read-only transaction) according to relation →. We sometimes also say that a write is in ctx if the pair of the corresponding object and version number is in ctx. By the maintenance of ctx_C, since w(a)u* → w(o)v*, we have w(a)u* ∈ ctx_C. However, according to Line 16 of Algorithm 10 and Line 19 of Algorithm 9, if w(a)u* ∈ ctx_C, then the server P_a which stores object a cannot return val = u, whose version is lower than that of u*, for txID = T.

⁹ In order to choose a value correctly, in the algorithm, S actually sends a request after all values written before w (of the same object) are marked as “visible”. Also, S does not choose a value for some tx which S has chosen
Next, we look at the third possibility. Then txID ∉ oldTx. In addition, for object o, ctx_C does not specify a version or any specified version v is lower than v_vis; in either case, v_vis corresponds to val = v* at P_o. According to Line 4 and Line 33 of Algorithm 10, when T reads o at P_o, u and u* are visible (i.e., in vis) at P_a. Clearly, if T reads a at P_a before P_a replies to async_checkVis(a, u*, *), then P_a sends T to P_o during async_checkVis(a, u*, *) and P_o would have T ∈ oldTx when T reads o at P_o, which gives a contradiction. Therefore, T must read a after P_a replies to async_checkVis(a, u*, *), i.e., after u* is visible. Thus according to Line 19 of Algorithm 9, P_a must find T ∈ oldTx when T reads a. Similarly, due to P_a's reply to P_o's call of async_checkVis(a, u*, *), the first time P_a receives T must also be after u* is visible (while P_a invokes async_check(a, u_1) for some version u_1 after the version of u*). Then according to Line 16 of Algorithm 10, P_a pre-determines a version no smaller than the version of u* for T, which contradicts the return value val = u of P_a.
Finally, we look at the first possibility: txID ∈ oldTx. Since P_o pre-determines val = v* for T, either ctx_C specifies v* for object o or v* is visible the first time P_o receives T. The two cases are similar to the second and third possibilities, and lead to contradictions with the return value of P_a. As a result, we conclude that if w(a)u* → w(o)v* holds, then T cannot have both r(a)u and r(o)v*, which is equivalent to Lemma 9.
Proof of causal consistency. By contradiction. Suppose that some execution E violates causal consistency. Then in E, some client C's local history cannot be totally ordered to satisfy Definition 7. Clearly, without any read-only transaction, we can order all writes in a way that respects the relation → defined previously (which includes the causality relation between any two writes). Therefore C performs at least one read-only transaction. In order to incorporate C's read-only transactions, we extend the relation → defined previously. Consider the set TX of transactions that consists of all writes in E and all of C's read-only transactions. For any two transactions tx_1 and tx_2, we say that tx_1 → tx_2 if (1) tx_1 and tx_2 are two writes of the same object and tx_1 has a lower version number than tx_2; or (2) tx_1 causally precedes tx_2; or (3) there exists some tx_3 ∈ TX such that tx_1 → tx_3 and tx_3 → tx_2.
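Any total order that respects the relation → is a topological order of the graph whose edges are the direct → pairs. As an illustration only (the proof does not depend on how such an order is obtained), one can be computed with Python's standard graphlib module; the function name some_ordering and the string labels for transactions are assumptions.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def some_ordering(preds: Dict[str, Set[str]]) -> List[str]:
    """Return one total order in which every transaction follows its predecessors.

    preds maps each transaction to the set of transactions that directly
    precede it under the relation ->.
    """
    return list(TopologicalSorter(preds).static_order())


# w(a)1 -> w(a)2 by version order, w(a)2 -> w(o)3 by causality.
preds = {"w(a)2": {"w(a)1"}, "w(o)3": {"w(a)2"}}
order = some_ordering(preds)
```

Several orders may respect → when transactions are unrelated; the argument that follows works with the whole set of such orders rather than a particular one.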
Let to_w be any ordering that respects relation →. We then add C's read-only transactions to to_w one by one. Since we suppose that E violates causal consistency, we let T be the first read-only transaction such that some to_w exists which can include C's read-only transactions before T but, for any to_w, C's read-only transactions up to and including T cannot be placed.
Let A be the set of such orderings to_w that can include C's read-only transactions before T, and let to_1 be any ordering in A. We first show that T must read at least two objects; the proof is by contradiction. Suppose otherwise that R_T = {r(a)u}. Let α be the last transaction (which can be a read-only transaction or a write) done by C before T. Let β be the first write done by C after T. Then in any to_1 where all of C's read-only transactions before T are included, either (1) w(a)u is before α, or (2) β is before w(a)u, or (3) w(a)u is between α and β. In the third case, we put T immediately after w(a)u. In the second case, β → w(a)u does not hold. (Suppose otherwise that β → w(a)u holds. Then the logical timestamp l_1 which the client of w(a)u receives from P_a during w(a)u is higher than the logical timestamp l_2 which C receives from the server that stores the object written by β during β. However, when T reads a, the logical timestamp which C receives from P_a is at least l_1, and as a result, l_2 ≥ l_1, a contradiction.) We move β and its successors under relation → after w(a)u. The resulting ordering is still in A. We then put T immediately after w(a)u. In the first case, there are two possibilities: (i) between w(a)u and α, there is some write w(a)u*; (ii) between w(a)u and α, there is no write w(a)u*. For the latter, we put T immediately after α. For the former, let w(a)u* be the first write of object a after w(a)u in to_1. Then w(a)u* → α does not hold. (Suppose otherwise that w(a)u* → α holds. Then w(a)u* is in the variable ctx maintained by C before T starts. As a result, when T reads a, P_a sees w(a)u* ∈ ctx_C and thus returns a value with a version number no smaller than that of u*, a contradiction.) We move w(a)u* and its successors under relation → after α. The resulting ordering is still in A. We then put T immediately after α.
Now we continue with the case where T reads at least two different objects. We consider Lemma 9 as a property of any read-only transaction. Based on Lemma 9 and to_1, we construct another ordering to_2 ∈ A as follows. For any r(a)u ∈ R_T, let W_u be the set of writes w(o)v* such that (1) in to_1, some write w(a)u* is after w(a)u and w(o)v* is after w(a)u*, and (2) r(o)v* ∈ R_T. If W_u = ∅, then we do nothing for r(a)u; otherwise, we let w(a)u* be the first write of a after w(a)u in to_1. We then augment W_u by adding the predecessors of each element according to relation →, and we repeat this until no more writes after w(a)u* in to_1 can be added. Let ss be the subsequence of to_1 which contains all writes in W_u. We move ss immediately before w(a)u*.
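The move itself is a plain list operation: the elements of ss keep their relative order and are re-inserted as a block immediately before w(a)u*. A minimal sketch, assuming transactions are represented by string labels and that the target is not itself in ss (the function name move_before is hypothetical):

```python
from typing import List


def move_before(order: List[str], ss: List[str], target: str) -> List[str]:
    """Move the elements of ss, in their existing order, immediately before target."""
    members = set(ss)
    rest = [x for x in order if x not in members]    # ordering without ss
    moved = [x for x in order if x in members]       # ss in its original order
    i = rest.index(target)
    return rest[:i] + moved + rest[i:]


to_1 = ["w(a)u", "w(a)u*", "w(b)x", "w(o)v*"]
to_u = move_before(to_1, ["w(o)v*"], "w(a)u*")
```

In this example w(o)v* ends up between w(a)u and w(a)u*, so a transaction that reads u from a and v* from o can be placed after w(o)v* and before w(a)u*.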
Below we verify that the resulting ordering to_u (after the construction for r(a)u) falls in A. By the construction based on relation →, to_u still respects relation →. Thus we only need to verify that C's read-only transactions before T can be placed in to_u. We know that in to_1, all of C's read-only transactions before T can be placed. While moving ss, we may move some of C's read-only transactions as well. Namely, for any to_1, given a way to place all of C's read-only transactions before T so that they are legal, we include in W_u the last read-only transaction rtx_last done by C before T that is placed after w(a)u*; we then again augment W_u by adding the predecessors of each element according to relation →, and stop when no more writes or read-only transactions after w(a)u* in to_1 can be added. Now consider ss as the subsequence of to_1 which contains all writes and read-only transactions in W_u. The resulting to_u respects relation →. Thus if ss includes any read-only transaction, then in to_u, the position of that read-only transaction is still legal. In addition, C's read-only transactions that are placed before w(a)u* remain unchanged. Therefore, to_u admits a placement of all of C's read-only transactions before T and falls in A.
Since ss is only a subsequence of to_1, moving ss creates no new pair w(a)u and w(o)v* such that r(a)u, r(o)v* ∈ R_T, w(o)v* is after w(a)u*, and w(a)u* is after w(a)u for some w(a)u*. Then after a finite number of moves, we can construct an ordering to_2 ∈ A such that for any r(a)u ∈ R_T, W_u = ∅. We now turn to the placement of T in to_2. Let α be C's last transaction before T. Let β be C's first write after T. Let w_last be the last write in to_2 that corresponds to some read in T. Since during the construction of to_2 we also move the positions of some read-only transactions, after the construction of to_2 we have also constructed