
We present below a suite of algorithms, A, for fast read-only transactions. To comply with Theorem 7, we restrict all transactions to be read-only and require updates to occur outside transactions (or, equivalently, to be treated as single-object write transactions). The goal of A is to shed light on Theorem 8, which shows that fast read-only transactions are visible. The intuition behind Theorem 8 is that after a fast read-only transaction T, servers may need to communicate information about T among themselves. However, it is not clear when such communication occurs. The COPS-SNOW [44] algorithm shows that the communication can take place during a client's write request. Algorithm A below shows that the communication can instead take place outside any client write request, asynchronously. Unlike COPS-SNOW, where a written value is visible immediately after the write, A guarantees only eventual visibility.

Algorithm 8 Client-side read/write algorithms

1: local variables
2:   lc, logical clock
3:   ctx, context
4: end local variables

5: function WRITE(obj, val)
6:   Identify server S by obj
7:   ctx_S, lc_S ← S.write(lc, ctx, obj, val)
8:   update_lc(lc_S)
9:   update_context(obj, lc_S, ctx_S)
10:   return OK
11: end function

12: function READ(objs)
13:   txID ← generate_txID()
14:   fixedCtx ← ctx
15:   for obj in objs do
16:     val, ver, ctx_S, lc_S ← S.read(lc, fixedCtx, obj, txID)
17:     save val to vals
18:     update_lc(lc_S)
19:     update_context(obj, ver, ctx_S)
20:   end for
21:   return vals
22: end function

Protocol

We first describe the data structures that each process maintains. Every process maintains a local logical timestamp and advances it whenever it finds that the local timestamp lags behind one it receives. Processes also move their logical timestamps forward whenever they communicate with other processes. (In A, this is the function call update_lc, whose details are omitted for simplicity of presentation.) Every client additionally maintains the causal dependencies of the current transaction (i.e., the transactions that causally precede it). Causal dependencies can be maintained in the same way as in COPS [35] and COPS-SNOW [44]. (A maintains causal dependencies in variable ctx through the function calls update_context and ctx.update; the details are the same as in COPS [35] and COPS-SNOW [44] and are thus omitted.) Every client is able to generate transaction identifiers (by the function call generate_txID in A). Every server stores the causal dependencies that a client passes as an argument during its write. Every server additionally maintains a data structure called oldTx for each object stored.

Algorithm 9 Server-side read/write algorithms

1: local variables
2:   lc, logical clock
3:   vis, visible versions in tuples <obj, ver>
4:   oldTx and currTx, storage of tuples <obj, ver, ctx, txID> for each object
5: end local variables

6: function WRITE(lc_C, ctx_C, obj, val)
7:   update_lc(lc_C)
8:   ctx ← the context of obj with the highest version in the storage
9:   ctx.update(ctx_C)
10:   update_storage(obj, val, lc, ctx)
11:   return ctx, lc
12: end function

13: function READ(lc_C, ctx_C, obj, txID)
14:   update_lc(lc_C)
15:   if txID ∈ oldTx then
16:     ver ← the version identified by txID in oldTx
17:   else
18:     v_vis ← the highest version of obj in vis
19:     if <obj, v> is in ctx_C and v > v_vis then
20:       ver ← v
21:     else
22:       ver ← v_vis
23:     end if
24:   end if
25:   save <obj, ver, ctx_C, txID> to currTx
26:   val ← the value identified by ver of obj in the storage
27:   ctx ← the context identified by ver of obj in the storage
28:   return val, ver, ctx, lc
29: end function
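The text leaves the details of update_lc out. A minimal sketch, assuming a Lamport-style rule (take the maximum of the local and the received timestamp, then advance by one), may help fix ideas. The class name `LogicalClock` and the increment of one are illustrative assumptions and are not taken from the algorithm suite.

```python
class LogicalClock:
    """Lamport-style logical clock; a hypothetical stand-in for lc and update_lc."""

    def __init__(self) -> None:
        self.value = 0

    def update_lc(self, remote: int) -> int:
        # Catch up if the local clock lags behind the received timestamp,
        # then advance by one to record the communication step (assumed rule).
        self.value = max(self.value, remote) + 1
        return self.value


# A client whose clock lags behind a server's clock catches up on contact.
client, server = LogicalClock(), LogicalClock()
server.value = 7
received = client.update_lc(server.value)
```

With this rule a process never moves its clock backwards, which is the only property the surrounding description relies on.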

We next sketch how writes and read-only transactions are handled. The full algorithms are shown in Algorithm 8 and Algorithm 9.

• Every client sends its logical timestamp as well as its causal dependencies when requesting a write of object obj. The server uses its updated logical timestamp as the version ver of the written value val, stores the version and the value along with the causal dependencies ctx (by the function call update_storage(obj, val, ver, ctx) in Algorithm 9, where ver is the server's updated lc), and returns the version number to the client.

• Every client C sends its logical timestamp when requesting a read-only transaction tx. A server first searches for tx in oldTx and, if tx ∈ oldTx, returns the value pre-computed for entry tx in oldTx. Otherwise, the server returns some value previously observed by C or some value marked as “visible”.

Algorithm 10 Server-side asynchronous check

1: local variables
2:   Same as in Algorithm 9
3: end local variables

4: when all versions of obj below ver are in vis, invoke async_check
5: procedure ASYNC_CHECK(obj, ver)
6:   identify ctx by obj, ver in the storage
7:   for obj_d, ver_d in ctx do
8:     identify server D by obj_d
9:     oldTx_D, lc_D ← D.async_checkVis(obj_d, ver_d, lc)
10:     update_lc(lc_D)
11:     save oldTx_D to oldTx as follows:
12:     for txID in oldTx_D do
13:       if txID ∉ oldTx then
14:         get tuple <obj_d, ∗, ctx_d, txID> from oldTx_D
15:         identify version v_prev as the highest version below ver of obj in the storage
16:         if <obj, v> is in ctx_d and v > v_prev then
17:           save tuple <obj, v, −, txID> into oldTx
18:         else
19:           save tuple <obj, v_prev, −, txID> into oldTx
20:         end if
21:       end if
22:     end for
23:   end for
24:   for txID in currTx do
25:     if <obj, v, ∗, txID> is in currTx and v < ver then
26:       move the tuple identified by txID from currTx to oldTx
27:     end if
28:   end for
29:   save <obj, ver> into vis
30: end procedure

31: function ASYNC_CHECKVIS(obj_d, ver_d, lc_S)
32:   update_lc(lc_S)
33:   when <obj_d, ver_d> is in vis, return oldTx, lc
34: end function
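The version choice a server makes on a read (pre-computed entry first, then the client's context, then the highest visible version) can be sketched as follows. This is only an illustration under assumptions: the function name `choose_version`, the dictionaries keyed by (object, transaction identifier), and the default version 0 when nothing is visible are hypothetical simplifications of the per-object tuple storage.

```python
from typing import Dict, Set, Tuple


def choose_version(
    obj: str,
    tx_id: str,
    ctx_c: Dict[str, int],
    old_tx: Dict[Tuple[str, str], int],
    vis: Set[Tuple[str, int]],
) -> int:
    """Pick the version of obj that a server returns to transaction tx_id."""
    # Case 1: a version was pre-computed for this transaction.
    if (obj, tx_id) in old_tx:
        return old_tx[(obj, tx_id)]
    # Highest version of obj already marked visible (0 if none: an assumption).
    v_vis = max((ver for o, ver in vis if o == obj), default=0)
    # Case 2: the client's context names a version newer than v_vis.
    v_ctx = ctx_c.get(obj)
    if v_ctx is not None and v_ctx > v_vis:
        return v_ctx
    # Case 3: fall back to the highest visible version.
    return v_vis


vis = {("x", 1), ("x", 3), ("y", 2)}
from_visible = choose_version("x", "t1", {}, {}, vis)
from_context = choose_version("x", "t1", {"x": 5}, {}, vis)
from_old_tx = choose_version("x", "t1", {"x": 5}, {("x", "t1"): 1}, vis)
```

The three results correspond to the three cases that the correctness proof later distinguishes.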

We now sketch how oldTx is maintained and communicated during asynchronous propagation. The full algorithm is shown in Algorithm 10.

• After a server S responds to a client’s write request of value w for some object o, S sends a request to every server that stores some value v whose write causally precedes w(o)w. Each such server responds to the request with its local oldTx once v is marked as “visible”.

• After S receives a response from all servers that store some value that causally precedes w, S stores their oldTx sets into its local one, chooses a value w* which is written before w (see footnote 9)

Any read-only transaction is stored and marked as “current” during its execution at any server. A “current” transaction T is moved into oldTx when some value w becomes “visible” and T has returned a value of the same object written before w.
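The move from “current” to oldTx described above can be sketched as follows. The function name `mark_visible` and the dictionaries mapping (object, transaction identifier) to the version read are hypothetical; the sketch covers only the local bookkeeping, not the exchange of oldTx between servers.

```python
from typing import Dict, Set, Tuple

Key = Tuple[str, str]  # (object, transaction identifier)


def mark_visible(
    obj: str,
    ver: int,
    curr_tx: Dict[Key, int],
    old_tx: Dict[Key, int],
    vis: Set[Tuple[str, int]],
) -> None:
    """Retire current transactions that read an older version, then mark ver visible."""
    # Collect the keys first so the dictionary is not mutated while iterating.
    stale = [k for k, v in curr_tx.items() if k[0] == obj and v < ver]
    for key in stale:
        old_tx[key] = curr_tx.pop(key)
    vis.add((obj, ver))


curr_tx = {("x", "t1"): 1, ("x", "t2"): 4, ("y", "t1"): 1}
old_tx: Dict[Key, int] = {}
vis: Set[Tuple[str, int]] = set()
mark_visible("x", 3, curr_tx, old_tx, vis)
```

Only transaction t1's read of x is retired here, because it returned version 1, which is below the newly visible version 3.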

Proof of correctness

Our suite of algorithms A above provides fast read-only transactions. As every message eventually arrives at its destination (and therefore asynchronous propagation eventually ends), A satisfies progress. As asynchronous propagation carries transaction identifiers, A is visible. In what follows, we show that A satisfies causal consistency.

In Algorithm 9, when a server stores a value, it chooses a version number strictly greater than those of all values of the same object previously written. Therefore, in addition to the causality relation, we also enforce an ordering on all writes of the same object by their version numbers. In what follows, we say that w_1 → w_2 for two writes w_1 and w_2 if (1) w_1 has a lower version number than w_2 and w_1, w_2 write the same object; or (2) w_1 causally precedes w_2; or (3) there exists some write w_3 such that w_1 → w_3 and w_3 → w_2. We first show a property of any read-only transaction in Lemma 9. We then prove the correctness of A based on Lemma 9.
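The three rules that define relation → amount to a transitive closure over version order and causality. A small sketch, assuming writes are represented as (object, version) pairs and causality is given as an explicit set of pairs; the name `arrow_relation` and this representation are illustrative assumptions.

```python
from itertools import product
from typing import Set, Tuple

Write = Tuple[str, int]  # (object, version number)
Pair = Tuple[Write, Write]


def arrow_relation(writes: Set[Write], causal: Set[Pair]) -> Set[Pair]:
    """Return all pairs (w1, w2) with w1 -> w2 under rules (1), (2) and (3)."""
    # Rule (1): same object, lower version number.
    rel = {(a, b) for a, b in product(writes, repeat=2)
           if a[0] == b[0] and a[1] < b[1]}
    # Rule (2): causality between writes.
    rel |= set(causal)
    # Rule (3): close the relation under transitivity until nothing changes.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(rel), repeat=2):
            if b == c and (a, d) not in rel:
                rel.add((a, d))
                changed = True
    return rel


writes = {("a", 1), ("a", 2), ("o", 1)}
causal = {(("a", 2), ("o", 1))}
rel = arrow_relation(writes, causal)
```

In this example the pair from version 1 of a to version 1 of o is obtained only through rule (3).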

Lemma 9 (A correct snapshot for visible fast read-only transactions). Let T be any transaction that contains at least two reads. Given any two reads r(a)u, r(o)v* ∈ R_T, if there exists a write w(a)u* such that w(a)u has a lower version number than w(a)u*, then w(a)u* → w(o)v* does not hold.

Proof of Lemma 9. By contradiction. Suppose that r(a)u, r(o)v* ∈ R_T and w(a)u* → w(o)v* holds. According to Algorithm 9, there are three possibilities when the server P_o that stores object o returns val = v* at txID = T:

1. txID ∈ oldTx;

2. txID ∉ oldTx, but for object o, ctx_C specifies a version v higher than the highest version v_vis in vis of the same object;

3. txID ∉ oldTx, and for object o, ctx_C does not specify a version, or any specified version v is lower than v_vis.

Let us examine each possibility. First, we look at the second possibility. Then <o, v> ∈ ctx_C, v corresponds to val = v* at P_o, and v > v_vis. The variable ctx is maintained so as to record the precedences of a transaction (a single-object write transaction or a read-only transaction) according to relation →. We sometimes also say that a write is in ctx if the pair of the corresponding object and version number is in ctx. By the maintenance of ctx_C, since w(a)u* → w(o)v*, we have w(a)u* ∈ ctx_C. However, according to Line 16 of Algorithm 10 and Line 19 of Algorithm 9, if w(a)u* ∈ ctx_C, then the server P_a that stores object a is unable to return val = u, whose version is lower than that of u*, at txID = T.

Footnote 9: In order to choose a value correctly, in the algorithm, S actually sends a request after all values written before w (of the same object) are marked as “visible”. Also, S does not choose a value for some tx which S has chosen

Next, we look at the third possibility. Then txID ∉ oldTx. In addition, for object o, ctx_C does not specify a version, or any specified version v is lower than v_vis; in either case, v_vis corresponds to val = v* at P_o. According to Line 4 and Line 33 of Algorithm 10, when T reads o at P_o, u and u* are visible (i.e., in vis) at P_a. Clearly, if T reads a at P_a before P_a replies to async_checkVis(a, u*, ∗), then P_a sends T to P_o during async_checkVis(a, u*, ∗), and P_o could have T ∈ oldTx when T reads o at P_o, which gives a contradiction. Therefore, T must read a after P_a replies to async_checkVis(a, u*, ∗), i.e., after u* is visible. Thus, according to Line 19 of Algorithm 9, P_a must find T ∈ oldTx when T reads a. Similarly, due to P_a’s reply to P_o’s call of async_checkVis(a, u*, ∗), the first time P_a receives T must also be after u* is visible (while P_a invokes async_check(a, u_1) for some version u_1 after the version of u*). Then, according to Line 16 of Algorithm 10, P_a pre-determines a version no smaller than the version of u* for T, which contradicts the return value val = u of P_a.

Finally, we look at the first possibility: txID ∈ oldTx. Since P_o pre-determines val = v* for T, either ctx_C specifies v* for object o, or v* is visible the first time P_o receives T. The two cases are similar to the second and third possibilities, leading to contradictions with the return value of P_a. As a result, we conclude that if w(a)u* → w(o)v* holds, then T cannot have both r(a)u and r(o)v*, which is equivalent to Lemma 9.
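The snapshot property of Lemma 9 can be checked mechanically on a small example. The sketch below assumes that a transaction's reads are given as a map from object to the version returned, that writes are (object, version) pairs, and that relation → is supplied as a set of pairs; the function name `violates_lemma9` and these representations are hypothetical.

```python
from typing import Dict, Set, Tuple

Write = Tuple[str, int]  # (object, version number)


def violates_lemma9(
    reads: Dict[str, int],
    writes: Set[Write],
    rel: Set[Tuple[Write, Write]],
) -> bool:
    """True if some newer write of a read object precedes another write that T read."""
    for a, u in reads.items():
        for o, v_star in reads.items():
            if o == a:
                continue
            for obj, u_star in writes:
                # w(a)u* has a higher version than the w(a)u that T read,
                # yet w(a)u* -> w(o)v* holds: the snapshot is inconsistent.
                if obj == a and u_star > u and ((a, u_star), (o, v_star)) in rel:
                    return True
    return False


writes = {("a", 1), ("a", 2), ("o", 1)}
rel = {(("a", 1), ("a", 2)), (("a", 2), ("o", 1)), (("a", 1), ("o", 1))}
stale_snapshot = violates_lemma9({"a": 1, "o": 1}, writes, rel)
fresh_snapshot = violates_lemma9({"a": 2, "o": 1}, writes, rel)
```

Reading version 1 of a together with version 1 of o is rejected because version 2 of a precedes the write of o that T observed.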

Proof of causal consistency. By contradiction. Suppose that some execution E violates causal consistency. Then in E, some client C’s local history cannot be totally ordered to satisfy Definition 7. Clearly, without any read-only transaction, we can order all writes in a way that respects relation → defined previously (which includes the causality relation between any two writes). Therefore, C performs at least one read-only transaction. In order to incorporate C’s read-only transactions, we extend the relation → defined previously. Consider the set TX of transactions that consists of all writes in E and all of C’s read-only transactions. For any two transactions tx_1 and tx_2, we say that tx_1 → tx_2 if (1) tx_1 and tx_2 are two writes, tx_1 has a lower version number than tx_2, and tx_1, tx_2 write the same object; or (2) tx_1 causally precedes tx_2; or (3) there exists some tx_3 ∈ TX such that tx_1 → tx_3 and tx_3 → tx_2.
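A total order that respects relation → is a topological order of the graph whose edges are the pairs in →. A minimal sketch using Python's standard graphlib module (Python 3.9 or later); the function name `respecting_order` and the string transaction labels are illustrative assumptions.

```python
from graphlib import TopologicalSorter
from typing import Iterable, List, Set, Tuple


def respecting_order(
    transactions: Iterable[str],
    rel: Set[Tuple[str, str]],
) -> List[str]:
    """Return one total order in which tx1 appears before tx2 whenever tx1 -> tx2."""
    # Register every transaction, including those with no predecessor.
    sorter = TopologicalSorter({t: set() for t in transactions})
    for before, after in rel:
        # `before` must be placed ahead of `after`.
        sorter.add(after, before)
    # static_order raises graphlib.CycleError if rel contains a cycle.
    return list(sorter.static_order())


order = respecting_order(["w1", "w2", "w3"], {("w1", "w2"), ("w2", "w3"), ("w1", "w3")})
```

Many such orders may exist in general; the proof only needs that at least one of them can accommodate the client's read-only transactions.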

Let to_w be any ordering that respects relation →. We then add C’s read-only transactions to to_w one by one. Since we suppose that E violates causal consistency, we let T be the first read-only transaction such that some to_w exists which can include C’s read-only transactions before T, but for any to_w, C’s read-only transactions up to and including T cannot be placed.

Let A be the set of such orderings to_w that can include C’s read-only transactions before T, and let to_1 be any ordering in A. We first show that T must read at least two objects; the proof is by contradiction. Suppose otherwise that R_T = {r(a)u}. Let α be the last transaction (which can be a read-only transaction or a write) done by C before T, and let β be the first write done by C after T. Then in any to_1 where all of C’s read-only transactions before T are included, either (1) w(a)u is before α, or (2) β is before w(a)u, or (3) w(a)u is between α and β. In the third case, we put T immediately after w(a)u. In the second case, β → w(a)u does not hold. (Suppose otherwise that β → w(a)u holds. Then the logical timestamp l_1 which the client of w(a)u receives from P_a during w(a)u is higher than the logical timestamp l_2 which C receives from the server that stores the object written by β during β. However, when T reads a, the logical timestamp which C receives from P_a is at least l_1, and as a result, l_2 ≥ l_1, a contradiction.) We move β and its successors under relation → after w(a)u. The resulting ordering is still in A. We then put T immediately after w(a)u. In the first case, there are two possibilities: (i) between w(a)u and α, there is some write w(a)u*; (ii) between w(a)u and α, there is no write w(a)u*. For the latter, we put T immediately after α. For the former, let w(a)u* be the first write of object a after w(a)u in to_1. Then w(a)u* → α does not hold. (Suppose otherwise that w(a)u* → α holds. Then w(a)u* is in the variable ctx maintained by C before T starts. As a result, when T reads a, P_a sees w(a)u* ∈ ctx_C and thus returns a value with a version number no smaller than that of u*, a contradiction.) We move w(a)u* and its successors under relation → after α. The resulting ordering is still in A. We then put T immediately after α.

Now we continue with the case where T reads at least two different objects. We consider Lemma 9 as a property of any read-only transaction. Based on Lemma 9 and to_1, we construct another ordering to_2 ∈ A as follows. For any r(a)u ∈ R_T, let W_u be the set of writes w(o)v* such that (1) in to_1, some write w(a)u* is after w(a)u and w(o)v* is after w(a)u*, and (2) r(o)v* ∈ R_T. If W_u = ∅, then we do nothing for r(a)u; otherwise, we let w(a)u* be the first write of a after w(a)u in to_1. We then augment W_u by adding the precedence of each element according to relation →, and we do this until no more writes after w(a)u* in to_1 can be added. Let ss be the subsequence of to_1 which contains all writes in W_u. We move ss immediately before w(a)u*.

Below we verify that the resulting ordering to_u (after the construction for r(a)u) falls in A. By the construction based on relation →, to_u still respects relation →. Thus we only need to verify that C’s read-only transactions before T can be placed in to_u. We know that in to_1, all of C’s read-only transactions before T can be placed. While moving ss, we may move some of C’s read-only transactions as well. Namely, for any to_1, given a way to put all of C’s read-only transactions before T so that they are legal, we include in W_u the last read-only transaction rtx_last done by C before T that is put after w(a)u*; then we still augment W_u by adding the precedence of each element according to relation → and stop the addition when no more writes or read-only transactions after w(a)u* in to_1 can be added. Now consider ss as the subsequence of to_1 which contains all writes and read-only transactions in W_u. The resulting to_u respects relation →; thus, if ss includes any read-only transaction, then in to_u, the position of the read-only transaction is still legal. In addition, C’s read-only transactions that are put before w(a)u* remain unchanged. Therefore, to_u finds a way to place all of C’s read-only transactions before T and falls in A.

Since ss is only a subsequence of to_1, the move of ss creates no new pair w(a)u and w(o)v* such that r(a)u, r(o)v* ∈ R_T, w(o)v* is after w(a)u*, and w(a)u* is after w(a)u for some w(a)u*. Then after a finite number of moves, we can construct an ordering to_2 ∈ A such that for any r(a)u ∈ R_T, W_u = ∅. We now turn to the placement of T in to_2. Let α be C’s last transaction before T. Let β be C’s first write after T. Let w_last be the last write in to_2 that corresponds to some read in T. Since during the construction of to_2 we move the positions of some read-only transactions as well, after the construction of to_2, we have also constructed
