Maintaining disconnected transactions costs client local resources in two main areas: persistent memory space for recording transaction history and disk space for storing shadow cache files. To support long-lasting disconnected operation sessions involving a large number of disconnected transactions, it is necessary for the transaction system to reduce such client resource costs. Our strategy is to cancel those redundant transactions that no longer have any impact on the file system state. Note that transaction cancellation is only an optimization in realizing the IOT model in Coda, which does not change the overall IOT consistency model.
An important phenomenon in disconnected operation is that many file access operations cancel the effect of previous ones. For example, in the typical “edit → compile → debug” cycles during software development, a store operation often overwrites a previous one and a remove operation is likely to offset an earlier create operation. This is exploited by the non-IOT Venus (i.e., the Coda Venus without the IOT extension) to cancel unnecessary records from the CML during disconnected operation, and it results in significant resource savings [26]. The IOT implementation extends the same principle to a larger granularity to cancel those pending transactions that no longer have any effect on local client state. Such transaction cancellation has two important benefits. First, it frees up client resources such as the persistent storage space used by the internal representation of transactions and the disk space occupied by shadow cache files. Second, it reduces the amount of server/client communication traffic as well as the server computation time needed for transaction validation and commitment. Evaluation results presented in Chapter 9 will show that these benefits are substantial.
Basic Mutation Cancellation Behaviors To understand how a pending transaction can be cancelled, let us first consider two kinds of basic mutation cancellation behaviors. The first kind of mutation cancellation is called overwriting, which means that the effect of a mutation operation is eliminated or made obsolete by a later mutation operation. For example, the effect of “store foo” can be overwritten by a later “store foo” or “remove foo”. The second kind of mutation cancellation is called offsetting, which means that a pair of mutation operations offset each other so that their combined result produces no effect at all, i.e., a no-op. For example, a “create foo” can be offset by a later “remove foo” and a “rename foo bar” can offset a previous “rename bar foo”. The basic principle behind mutation cancellation is that when the effect of a disconnected mutation operation op1 is eliminated by a later operation op2, there is no point in preserving op1 in the mutation log and propagating its effect via reintegration. This is because after op1 is replayed on the corresponding server during reintegration, the subsequent replay of op2 by the same reintegration process will immediately nullify op1’s effect.
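The two behaviors can be illustrated with a minimal sketch. It is not Venus code: the `Mutation` record and the functions `overwrites`, `offsets`, and `append_with_cancellation` are hypothetical names introduced here, and the rules cover only the store, create, remove, and rename examples given above. In particular, the sketch assumes that no intervening operation makes a rename pair non-cancellable.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mutation:
    op: str      # "store", "create", "remove", or "rename"
    args: tuple  # object names the operation refers to

def overwrites(new, old):
    """True if the later `new` makes the effect of the earlier `old` obsolete."""
    return (old.op == "store" and new.op in ("store", "remove")
            and new.args == old.args)

def offsets(new, old):
    """True if `old` followed by `new` amounts to a no-op."""
    if old.op == "create" and new.op == "remove":
        return new.args == old.args
    if old.op == "rename" and new.op == "rename":
        return new.args == old.args[::-1]
    return False

def append_with_cancellation(log, new):
    """Append `new` to `log`, dropping earlier records it overwrites or offsets."""
    keep_new = True
    for i in range(len(log) - 1, -1, -1):  # scan newest record first
        old = log[i]
        if offsets(new, old):
            del log[i]        # the pair is a no-op, so drop both records
            keep_new = False
            break
        if overwrites(new, old):
            del log[i]        # only the older record becomes redundant
    if keep_new:
        log.append(new)
    return log
```

For instance, logging “create foo”, “store foo”, “store bar”, “store bar”, and “remove foo” in that order leaves a single “store bar” record: the second store overwrites the first, and the remove both overwrites the store of foo and offsets its creation.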
Transaction Cancellation Criteria Intuitively, cancelling a pending transaction must preserve certain correctness conditions, and we adopt the following three criteria to decide whether a pending transaction can be cancelled or not.
The first criterion requires that a pending transaction T must be obsolete before it can be cancelled. T is an obsolete transaction if none of its effects are visible on the client local state due to subsequent transactions. For example, a make transaction compiling a file work.c, i.e., Tmake = {R(work.c), W(work.o)}, can be made obsolete by another make transaction performing the same compilation. As another example, a pair of transactions Tcreate = {create foo, create bar} and Tremove = {remove foo, remove bar} are both made obsolete by each other. In essence, this criterion ensures that the cancellation of T does not affect the final outcome when all the disconnected transactions are committed to the servers.
The second criterion requires any cancellable transaction to be covered. It means that the removal of a transaction T from the local transaction history will not affect the validation outcome of other disconnected transactions. Although an obsolete transaction T does not leave behind any visible effects, it is still capable of influencing the validation outcome of other pending transactions. Consider the following example. A transaction T1 = {R(work.c), R(work.h), W(work.o)} compiled an object file work.o that is later used by another transaction T2 = {R(work.o), W(work)} to build an executable file work. A third transaction T3 = {R(work.c), R(work.h), W(work.o)} re-compiled work.o after some updates were made to work.c and rendered T1 obsolete. However, cancelling transaction T1 will remove the indirect dependency between transaction T2 and files work.c and work.h from the local transaction history, thus possibly affecting the validation outcome of T2. If work.h is updated on the servers during the disconnection, T2 could then pass validation even though its result indirectly depends on the old version of work.h. Preserving T1 in the local transaction history is necessary to allow the transaction validation process to follow the dependencies and appropriately invalidate T1, and thereafter T2.
Even if a transaction T is both obsolete and covered, it does not necessarily mean that we can cancel it. Suppose that T contains just one operation, mkdir home/foo, and it is made obsolete by another transaction T’ containing rmdir home/foo. Both T and T’ must be cancelled together because cancelling T alone will cause a failure in propagating T’. Formally, when one of T’s mutation operations offsets another belonging to transaction T’, we say that the two transactions have an offsetting relation between them and the two must be cancelled together. This is because cancelling either of them while leaving the other behind would result in a non-equivalent global state after the remaining transactions are validated and committed to the servers. Therefore, the third cancellation criterion for T requires that all the transactions that have an offsetting relation with T can also be cancelled together with T.
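The way the three criteria combine can be sketched as follows. This is an illustration under stated assumptions rather than the actual Venus logic: the function `cancellable_groups` and its arguments are hypothetical, and the sets of obsolete and covered transactions are assumed to have been computed already. Transactions linked by the offsetting relation are grouped with a small union-find structure, and a group is reported as cancellable only if every member is both obsolete and covered.

```python
def cancellable_groups(txns, offsetting_pairs, obsolete, covered):
    """Return the groups of transactions that may be cancelled together.

    txns:             iterable of transaction names
    offsetting_pairs: iterable of (a, b) pairs that offset each other
    obsolete/covered: sets of names satisfying the first two criteria
    """
    parent = {t: t for t in txns}

    def find(t):
        # Follow parent links to the group representative (path halving).
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in offsetting_pairs:
        parent[find(a)] = find(b)   # merge the two groups

    groups = {}
    for t in parent:
        groups.setdefault(find(t), set()).add(t)

    # A group can go only if all of its members can go.
    return [g for g in groups.values()
            if all(t in obsolete and t in covered for t in g)]
```

A transaction with no offsetting partner forms a group of its own, so the first two criteria alone decide its fate.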
Note that it is quite possible that some of the cancelled transactions would have been invalidated had they remained in the transaction history. For example, consider the following two disconnected, offsetting transactions T1 = {mkdir home/foo} and T2 = {rmdir home/foo}. Suppose that a new object home/foo has been created on the servers during the disconnection. If T1 and T2 are not cancelled, they will be invalidated because of the update/update conflict on home/foo. Thus, transaction cancellation increases the chances for the complete history of disconnected transactions to pass validation and commit their results to the servers. Formally, if the local transaction history for a disconnected operation session is H_d and the set of cancelled transactions is P, then it is possible that (H_d − P) can be validated while H_d cannot.
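The last statement can be made concrete with a toy example. The `validates` function below is a deliberately crude stand-in, not the IOT validation procedure: it assumes a history fails as soon as any of its transactions touches an object that changed on the servers during the disconnection. All names, including the third transaction T3 and the file home/notes.txt, are hypothetical.

```python
def validates(history, changed_on_server):
    """Crude stand-in for validation: the history passes only if no
    transaction accessed an object that changed on the servers."""
    return all(not (objs & changed_on_server) for objs in history.values())

# Local history H_d: transaction name -> objects it accessed.
history = {
    "T1": {"home/foo"},        # mkdir home/foo
    "T2": {"home/foo"},        # rmdir home/foo
    "T3": {"home/notes.txt"},  # unrelated update (hypothetical)
}
cancelled = {"T1", "T2"}       # the set P of cancelled transactions
remaining = {t: objs for t, objs in history.items() if t not in cancelled}

changed = {"home/foo"}         # created on the servers meanwhile
print(validates(history, changed))    # full history H_d
print(validates(remaining, changed))  # reduced history H_d - P
```

Under these assumptions the full history fails because of the conflict on home/foo, while the reduced history passes.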
Checking the Cancellation Criteria The automatic checking of the first cancellation criterion can be implemented using Coda’s original mechanism for cancelling inferred transactions. For each non-transactional, disconnected mutation operation, Venus iterates through the entire CML of the corresponding volume in reverse chronological order to search for and remove any log record that is overwritten or offset by the new mutation. In the case of offsetting, the new mutation operation itself will also be removed from the CML [26]. Similarly, for every disconnected mutation operation op_T performed by a transaction T, Venus searches the corresponding CML and marks all the log records that are either overwritten or offset by op_T, including op_T itself in the case of offsetting. If all the records in TML(T) are marked, it means that the result of T is no longer visible on the client local state and Venus then marks T as obsolete.
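The marking scheme can be sketched as below. This is a simplified model, not the Venus data structures: log records are plain dictionaries, only single-object store, create, and remove operations are handled, and the names `cancels`, `log_mutation`, and `obsolete_transactions` are hypothetical.

```python
def cancels(new, old):
    """Classify how the later record `new` affects the earlier record `old`."""
    if new["obj"] != old["obj"]:
        return None
    if old["op"] == "store" and new["op"] in ("store", "remove"):
        return "overwrite"
    if old["op"] == "create" and new["op"] == "remove":
        return "offset"
    return None

def log_mutation(cml, tid, op, obj):
    """Append a record for transaction `tid`, marking the records it cancels."""
    new = {"tid": tid, "op": op, "obj": obj, "marked": False}
    for old in reversed(cml):          # reverse chronological order
        if old["marked"]:
            continue
        kind = cancels(new, old)
        if kind is not None:
            old["marked"] = True
        if kind == "offset":
            new["marked"] = True       # an offsetting pair marks both records
            break
    cml.append(new)

def obsolete_transactions(cml):
    """Transactions all of whose log records are marked."""
    tids = {r["tid"] for r in cml}
    return {t for t in tids
            if all(r["marked"] for r in cml if r["tid"] == t)}
```

Replaying the earlier examples, Tcreate = {create foo, create bar} followed by Tremove = {remove foo, remove bar} leaves every record of both transactions marked, so both are reported as obsolete. Of two successive make transactions that store work.o, only the first is.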
Validating the second cancellation criterion for a pending transaction T can be performed by checking all the live transactions that read from T. We use read-from(T) to denote the set {T’ | the execution of T’ reads an object whose value is written by T}. T can be marked as a covered transaction if every transaction T’ ∈ read-from(T) satisfies the condition that R(T’) is a super-set of R(T). Because the readset of a transaction is always a super-set of its writeset, this is sufficient to guarantee that the removal of T from the local transaction history will not affect the GC validation outcome of any other pending transactions.
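The covered check can be sketched directly from this definition. The `Txn` record and the helpers `make_txn`, `read_from`, and `is_covered` are hypothetical names introduced for illustration. The sketch assumes that the history is given in execution order and that the writer of an object's value is the most recent earlier transaction that wrote it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Txn:
    name: str
    reads: frozenset   # readset R(T), a super-set of the writeset
    writes: frozenset  # writeset W(T)

def make_txn(name, reads, writes):
    """Build a transaction whose readset includes its writeset."""
    return Txn(name, frozenset(reads) | frozenset(writes), frozenset(writes))

def read_from(history):
    """Map each transaction name to the later transactions that read a
    value it wrote; `history` must be in execution order."""
    last_writer = {}
    readers = {t.name: set() for t in history}
    for t in history:
        for obj in t.reads:
            writer = last_writer.get(obj)
            if writer is not None and writer != t.name:
                readers[writer].add(t.name)
        for obj in t.writes:
            last_writer[obj] = t.name
    return readers

def is_covered(t, history):
    """T is covered if R(T') is a super-set of R(T) for every T' in read-from(T)."""
    by_name = {x.name: x for x in history}
    return all(by_name[r].reads >= t.reads
               for r in read_from(history)[t.name])
```

Applied to the compilation example, T1 is not covered when T2 is in the history, because T2 reads work.o written by T1 but its readset contains neither work.c nor work.h. Without T2, the only reader of T1 is T3, whose readset equals that of T1, so T1 is covered.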
The third criterion is more difficult to validate because it requires a group of transactions to satisfy a certain relationship. A key observation is that the offsetting relation induces an equivalence relation among all the disconnected transactions. Pending transactions belonging to the same partition of the equivalence relation must be cancelled together when all of them are marked as obsolete and covered. Thus, this problem can be solved by representing the offsetting relation with a simple graph and identifying fully connected components. Although detecting cancellable isolation-only transactions is a more complicated process than that of inferred transactions, the benefits of transaction cancellation far outweigh the detection cost.

Intra-Transaction Optimization Typical transactions such as make usually perform a lengthy computation accessing a large number of objects. Such long transactions often create temporary objects and later remove them. The same cancellation mechanism used for inferred transactions by Venus is applied within the scope of a single transaction to remove the unnecessary records from the TML.
4.3 Merging Local State with Global State
This section discusses how to realize the central part of the IOT consistency model, ensuring global transaction isolation during transaction propagation. Our discussion concentrates on the actions performed by the transaction system when a disconnected client is able to re-establish communication with relevant servers. We first outline a general framework of synchronizing the local client state with the global server state, establishing a broader perspective for the discussion of the transaction propagation process. We then describe how the results of disconnected transactions are incrementally propagated to the servers, i.e., validated and committed or resolved one at a time. Finally, details about transaction validation and transaction commitment are presented. Due to its complexity, the discussion of transaction resolution is deferred to the next two chapters.