In this section I discuss what I will call state-management systems: systems whose purpose is to replicate state among components by extracting state from one component and reproducing it in a replica.
Roussev’s Programming Patterns [Rou03] focuses on the issue of abstraction flexibility, the ability to share a wide range of programmer-defined abstractions. He demonstrates that the logical structure of a program can be dynamically determined and the state of its objects accessed by identifying common method signature patterns. A small set of patterns can often be used to identify the methods for reading, writing, and modifying the state of a large number of objects. Programmer-defined handlers can then be provided to map from generic operations understood by the collaborative infrastructure (e.g., to read, write, or compare state) to the corresponding pattern-specific operations. These generic operations can then be used to access and communicate (e.g., replicate) the state of the objects. An event mechanism is required to determine when and how objects have changed state. If the objects already supply such an event mechanism, it can be mapped to the infrastructure’s generic event mechanism. Otherwise, through minimal code changes to the application, events can be added that essentially encode state-changing method calls so that they can be communicated to the infrastructure.
Roussev’s work also provides the infrastructure for Suite-like coupling of state among related objects (e.g., replicas). Thus, coupling specifications can be used to determine when state changes are communicated among objects.[7] In Roussev’s work, however, the objects are arbitrary; they are not the automatically generated editor UIs of Suite. Thus, Roussev’s infrastructure can be used in the context of arbitrary user interfaces.
[7] Roussev’s thesis did not address concurrency problems that arise when objects are modified simultaneously by multiple users.
Roussev’s programming patterns were implemented in Java using its reflection capabilities, so they can be applied to compiled code, provided that adequate event mechanisms are already available in that code. His methodology could be applied in any programming language that supports reflection.
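The pattern-based access described above can be illustrated with a minimal sketch. It is not Roussev’s implementation: his work used Java reflection, whereas this sketch substitutes Python introspection, and the `get_X`/`set_X` naming pattern, the `Thermostat` class, and the helper names are all assumptions made for illustration.

```python
import inspect


class Thermostat:
    """Hypothetical application object that follows a get_X/set_X pattern."""

    def __init__(self):
        self._target = 20.0
        self._mode = "heat"

    def get_target(self):
        return self._target

    def set_target(self, value):
        self._target = value

    def get_mode(self):
        return self._mode

    def set_mode(self, value):
        self._mode = value


def discover_properties(obj):
    """Map each property name to its (getter, setter) pair.

    A property is recognized only when both get_<name> and set_<name>
    exist, which is the (assumed) signature pattern for this sketch.
    """
    methods = dict(inspect.getmembers(obj, predicate=inspect.ismethod))
    props = {}
    for name, getter in methods.items():
        if name.startswith("get_"):
            setter = methods.get("set_" + name[4:])
            if setter is not None:
                props[name[4:]] = (getter, setter)
    return props


def read_state(obj):
    """Generic 'read' operation: extract state through discovered getters."""
    return {p: getter() for p, (getter, _) in discover_properties(obj).items()}


def write_state(obj, state):
    """Generic 'write' operation: reproduce state through discovered setters."""
    props = discover_properties(obj)
    for prop, value in state.items():
        props[prop][1](value)


original = Thermostat()
original.set_target(23.5)

replica = Thermostat()
write_state(replica, read_state(original))  # replicate state without class-specific code
```

The generic `read_state` and `write_state` operations know nothing about `Thermostat`; supporting another class that follows the same pattern requires no new infrastructure code. An event mechanism for detecting changes is deliberately left out of the sketch.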
Chung’s log-based collaborative infrastructure [Chu02] takes a different approach to state management. Chung’s approach is to establish replica state by logging and replaying communications between the layers of a layered application. His infrastructure supports state establishment by both direct state transfer (and update) and command sequences.
Chung’s infrastructure can be used to replicate components of any of the layers of an application (e.g., model, view, window, or screen). It can therefore be used to implement centralized or replicated architectures or hybrid architectures with both centralized and replicated sub-architectures. It also supports dynamic transitions among these architectures, by which the collaborative infrastructure can adapt to dynamic changes in the structure, needs, and performance characteristics of a collaborative session.
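The two ways of establishing replica state mentioned above, command sequences and direct state transfer, can be sketched as follows. This is an illustrative toy under stated assumptions, not Chung’s infrastructure: the `Counter` component, the `Logger` class, and every method name are hypothetical, and the component is assumed to be fully deterministic.

```python
class Counter:
    """Hypothetical deterministic component (stands in for any layer)."""

    def __init__(self):
        self.value = 0

    def handle(self, command):
        op, arg = command
        if op == "add":
            self.value += arg
        elif op == "reset":
            self.value = 0

    def get_state(self):
        return self.value

    def set_state(self, state):
        self.value = state


class Logger:
    """Sits between two layers and records every command it forwards."""

    def __init__(self, component):
        self.component = component
        self.log = []
        self.checkpoint = None  # (state snapshot, log position) or None

    def send(self, command):
        self.log.append(command)
        self.component.handle(command)

    def take_checkpoint(self):
        """Direct state transfer: remember a snapshot and where it was taken."""
        self.checkpoint = (self.component.get_state(), len(self.log))

    def establish(self, replica):
        """Bring a latecomer up to date: snapshot first, then replay the rest."""
        start = 0
        if self.checkpoint is not None:
            state, start = self.checkpoint
            replica.set_state(state)
        for command in self.log[start:]:
            replica.handle(command)
        return replica


primary = Counter()
logger = Logger(primary)
logger.send(("add", 2))
logger.send(("add", 3))
late1 = logger.establish(Counter())  # pure command-sequence replay

logger.take_checkpoint()
logger.send(("add", 10))
late2 = logger.establish(Counter())  # snapshot plus one replayed command
```

The checkpoint bounds the length of the command sequence a latecomer must replay, at the cost of requiring the component to expose `get_state` and `set_state`.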
One of Chung’s goals was to be able to add collaborative capabilities to as wide a range of applications as possible, including existing single-user applications. To accomplish this, he requires inter-layer protocols to be mapped to an abstract I/O protocol specified by his infrastructure. Another of his goals was that his infrastructure be as unaware of the semantics of the application protocols as possible. This goal is in conflict with the need to manage non-determinism and non-idempotent outputs and to reduce command sequence lengths. Thus, for correctness and efficiency, the abstract I/O protocol contains facilities for tagging communications with semantic information. Chung uses heuristic techniques such as message counting to determine when a replica is in the desired state. However, given the typical scenario of existing applications with incompletely specified semantics and uncooperative protocols, there usually remains some uncertainty with respect to the ability to deterministically put components into the correct state without duplication of non-idempotent outputs, and to bound the command sequence length. This is especially true when replicating the model layer for a replicated architecture. Once a component is in an incorrect state, there is often no way to ensure that future states are correct. While satisfactory results can usually be obtained with enough experimentation and tweaking of the protocol mapping, this is an arduous process. Chung’s dissertation is a good exposition on what can go wrong, and how to address the issues when you have no other choice.
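To make the role of semantic tagging concrete, the sketch below shows how two tags could shorten a command sequence and suppress non-idempotent outputs during replay. The text does not specify the tags in Chung’s abstract I/O protocol, so the `overwrites` and `side_effect` fields, the `Message` and `Replica` classes, and the `compact` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    key: str                   # the piece of state the message affects
    value: int
    overwrites: bool = False   # hypothetical tag: earlier messages on key are obsolete
    side_effect: bool = False  # hypothetical tag: causes a non-idempotent output


def compact(log):
    """Drop messages made obsolete by a later 'overwrites' message on the same key."""
    kept = []
    superseded = set()
    for msg in reversed(log):
        if msg.key in superseded:
            continue
        kept.append(msg)
        if msg.overwrites:
            superseded.add(msg.key)
    kept.reverse()
    return kept


class Replica:
    def __init__(self):
        self.state = {}
        self.outputs = []  # stands in for external effects (e.g., sent e-mail)

    def handle(self, msg, replaying=False):
        if msg.overwrites:
            self.state[msg.key] = msg.value
        else:
            self.state[msg.key] = self.state.get(msg.key, 0) + msg.value
        # During replay the output was already emitted once; do not repeat it.
        if msg.side_effect and not replaying:
            self.outputs.append((msg.key, msg.value))


log = [
    Message("x", 1),
    Message("x", 2),
    Message("x", 10, overwrites=True),
    Message("x", 5),
    Message("y", 1, side_effect=True),
]

primary = Replica()
for m in log:
    primary.handle(m)

replica = Replica()
for m in compact(log):
    replica.handle(m, replaying=True)
```

Without the tags, the infrastructure would have to replay all five messages and could not know that the last one must not be re-emitted. The sketch also shows the fragility the paragraph describes: a single mis-tagged message would leave the replica in a wrong state with no indication of the error.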
By and large, it is much easier to set the state of a view than that of a model. Views have a specific, limited purpose: to define the user interface of an application in terms of some standard UI technology. The size of view state is bounded in practice by this limited purpose. There is also little reason for extraneous inputs, since the user interface should ideally be deterministically defined by the application. When such inputs exist (e.g., different font sets for different instantiations of a window system), they are usually inputs to the standard underlying UI technology, a well-defined sub-component for which work-arounds can most often be found. Such work-arounds can be reused from view to view. It is not the purpose of a view to affect its external environment, other than to produce transient user interfaces on a display used for user/machine communication, so there is little reason for a view state machine to emit non-idempotent outputs external to the display. Finally, because of its specific, limited purpose, view code can often be easily transported and executed in limited, replicated contexts.
In contrast, because the purpose of models varies from application to application, their state size is not bounded. They often have context-specific inputs (e.g., file systems and environment variables, represented by Resources in Figure 2.11) that may affect their state transitions in non-deterministic ways from context to context. Models are also often required to emit non-idempotent outputs to affect their environments in various ways. In contrast to view code, model code is generally more difficult to export and execute in a limited, replicated context, because it may have arbitrary dependencies on resources and infrastructure.
Figure 2.11: Centralized and Replicated Architectures
It is largely because of the state-management difficulties discussed in this section that I have chosen a centralized architecture and functional views for Concur. The centralized architecture eliminates the need to replicate semantic models. Functional views simplify state management and facilitate determinism and avoidance of non-idempotent outputs (side effects, in functional language terminology).
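The appeal of functional views can be seen in a minimal sketch, in which a view is a pure function from model state to a UI description. This is not Concur’s actual API; the `render` function, the dictionary-based UI description, and the shopping-cart model are assumptions chosen only to illustrate the idea.

```python
def render(model):
    """Pure function: the same model state always yields the same UI description.

    It reads nothing but its argument and performs no output, so it can be
    evaluated any number of times, at any site, without side effects.
    """
    children = tuple(
        {"type": "label", "text": f"{name}: {qty}"}
        for name, qty in sorted(model["cart"].items())
    )
    total = sum(model["cart"].values())
    return {
        "type": "window",
        "title": model["title"],
        "children": children,
        "footer": f"{total} items",
    }


# The model lives in one place (centralized); only views are reproduced.
central_model = {"title": "Cart", "cart": {"pears": 2, "apples": 3}}

view_a = render(central_model)  # view computed for one user
view_b = render(central_model)  # view computed for another user
```

Because `render` is deterministic and free of side effects, establishing the state of a view replica reduces to re-evaluating the function on the same model state; no logging, replay, or output suppression is needed.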
Chapter 3: Entity Taxonomy

3.1 Introduction
In this section I will first discuss the distinction between model and view state implied by the MVC paradigm. This discussion will have a unifying effect, suggesting that model and view state should be handled similarly. (Since I will be advocating treating view state as a model, I will use the terms user interface (UI) domain state for view state and application domain state for model state, to avoid confusion.) Then I will introduce a different classification of the unified state, based on properties such as entity roles, container and resource requirements, and desired location and migration characteristics. The end result will be an entity taxonomy that will point the way to enabling most state to be efficiently shared and a wide range of divergence possibilities to be exposed to users in an understandable manner.