5. THE RAG DOLL FORGOTTEN ON THE BRIDGE ANALYSIS OF THE
5.1. Deciphering the components of practical thinking
5.1.5. Knowledge, castles brick by brick
5.1.5.1. The teaching role
A list of effective operations is equivalent in effect to a list of editing scripts derived by text differentiation algorithms. Both transform a document from the same initial state to the same final state, but a list of effective operations is only one of many such lists, whereas the list derived by a text differentiation algorithm is the shortest one. The two also differ in that the former preserves the intentions of user-issued actions, while the latter attempts to reconstruct actions after the fact and has little chance of preserving those intentions. For the sake of comparison, both effective operations and editing scripts are represented here as character-based operations.
For the example in 3.12, the document initially contained the string xyz and was transformed into the string xc123 by a list of user-issued editing operations stored in the log L = [O1, O2, O3, O4, O5]. When L is compressed, Lc = COMET(L) = [$O_3^4$, $O_1^2$], where $O_3^4$ = Del[1, 2, yz] and $O_1^2$ = Ins[1, 4, c123] are effective operations. If effective operations are represented as character-based, then Lc = [EO1, EO2, EO3, EO4, EO5, EO6], where EO1 = Del[1, 1, y], EO2 = Del[1, 1, z], EO3 = Ins[1, 1, c], EO4 = Ins[2, 1, 1], EO5 = Ins[3, 1, 2], and EO6 = Ins[4, 1, 3]. The editing graph for transforming string xyz to string xc123 is shown in Figure 3.17.
(A) Shortest Editing Script: 2D, 3D, 3Ic, 3I1, 3I2, 3I3
(B) Another Editing Script: 1D, 1Ix, 2D, 3D, 3Ic, 3I1, 3I2, 3I3
Figure 3.17: Editing graph transforming string xyz to string xc123
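The relationship between the two compressed operations and their six character-based counterparts can be checked with a minimal sketch. It assumes that Del[p, n, s] removes the n characters s starting at zero-based position p and that Ins[p, n, s] inserts s at position p; this reading is inferred from the example and is not a definition taken from the text. The tuple layout and the helper names apply_op, apply_log and to_char_ops are hypothetical.

```python
from typing import List, Tuple

# Hypothetical encoding of an operation: (kind, position, length, text).
Op = Tuple[str, int, int, str]


def apply_op(doc: str, op: Op) -> str:
    """Apply one Del/Ins operation, assuming zero-based positions."""
    kind, pos, length, text = op
    if kind == "Del":
        # Sanity check: the text recorded in the operation must be what is removed.
        if doc[pos:pos + length] != text:
            raise ValueError(f"cannot delete {text!r} at {pos} in {doc!r}")
        return doc[:pos] + doc[pos + length:]
    if kind == "Ins":
        return doc[:pos] + text + doc[pos:]
    raise ValueError(f"unknown operation kind: {kind}")


def apply_log(doc: str, log: List[Op]) -> str:
    """Replay a log of operations in order."""
    for op in log:
        doc = apply_op(doc, op)
    return doc


def to_char_ops(op: Op) -> List[Op]:
    """Split one string operation into single-character operations."""
    kind, pos, _length, text = op
    if kind == "Del":
        # Each deletion shifts the remaining text left, so the position stays fixed.
        return [("Del", pos, 1, ch) for ch in text]
    # Each insertion lands one position to the right of the previous one.
    return [("Ins", pos + i, 1, ch) for i, ch in enumerate(text)]


compressed = [("Del", 1, 2, "yz"), ("Ins", 1, 4, "c123")]
char_based = [c for op in compressed for c in to_char_ops(op)]
```

Under these assumptions, char_based reproduces EO1 to EO6 as listed above, and replaying either list on xyz yields xc123.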
According to the text differentiation algorithm presented in [93], the shortest editing script for transforming string xyz to string xc123 contains the six editing operations shown in Figure 3.17(A): 2D (delete character y), 3D (delete character z), 3Ic (insert character c), 3I1 (insert character 1), 3I2 (insert character 2), and 3I3 (insert character 3). In this example, the shortest editing script derived by the text differentiation algorithm happens to coincide with the list of effective operations [EO1, EO2, EO3, EO4, EO5, EO6]. As pointed out, the list of effective operations and the shortest editing script are two of many alternative paths for transforming a document from its initial state to its final state. These two paths are not necessarily the same, because the user does not necessarily choose the shortest path. For example, in Figure 3.17(B), the user may choose a path different from the shortest path in Figure 3.17(A) to transform string xyz to string xc123. That path consists of the following list of effective operations: 1D (delete character x), 1Ix (insert character x), 2D (delete character y), 3D (delete character z), 3Ic (insert character c), 3I1 (insert character 1), 3I2 (insert character 2), and 3I3 (insert character 3).
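Both scripts in Figure 3.17 can be replayed with a small sketch. It assumes that a command kD deletes the k-th character of the source string (counting from 1) and that kIc inserts character c immediately after the k-th source character; this reading matches the parenthetical descriptions above but is otherwise an assumption. The function name apply_script is hypothetical.

```python
import re
from typing import Dict, List, Set


def apply_script(source: str, script: List[str]) -> str:
    """Apply commands such as '2D' or '3Ic', indexed against the source string."""
    deleted: Set[int] = set()
    # inserts[k] holds the characters to emit right after source position k;
    # k = 0 means "before the first character".
    inserts: Dict[int, List[str]] = {k: [] for k in range(len(source) + 1)}
    for cmd in script:
        match = re.fullmatch(r"(\d+)(?:D|I(.))", cmd)
        if match is None:
            raise ValueError(f"unrecognised command: {cmd!r}")
        k = int(match.group(1))
        if k > len(source):
            raise ValueError(f"position {k} is outside the source string")
        if match.group(2) is None:
            deleted.add(k)                    # kD
        else:
            inserts[k].append(match.group(2))  # kIc

    out = list(inserts[0])
    for k, ch in enumerate(source, start=1):
        if k not in deleted:
            out.append(ch)
        out.extend(inserts[k])
    return "".join(out)


script_a = ["2D", "3D", "3Ic", "3I1", "3I2", "3I3"]
script_b = ["1D", "1Ix", "2D", "3D", "3Ic", "3I1", "3I2", "3I3"]
```

With this reading, both script_a and script_b turn xyz into xc123; script_b simply takes a longer path by deleting and re-inserting x.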
Nevertheless, the scale of a list of effective operations is comparable to that of the shortest list of editing scripts derived by text differentiation algorithms: both the size of the list and the number of operations in it are O(m + n), where m is the size of the source string and n is the size of the destination string. For the example in Figure 3.17, the upper bound on the number of operations is 3 + 5 = 8, and the upper bound on the size is 3 + 5 = 8 bytes if the deletion or insertion of a character is represented by one byte. As stressed in Section 3.2, reducing the scale of logs to the same order as that of editing scripts derived by text differentiation algorithms is important for an operation-based merging process to outperform the corresponding state-based merging process.
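The arithmetic for the example can be reproduced with a short sketch. It relies on a standard fact about scripts made only of single-character insertions and deletions: the shortest such script has length m + n - 2·|LCS|, where LCS is a longest common subsequence of the two strings, and deleting all m source characters and then inserting all n destination characters gives the m + n upper bound. The function names are hypothetical, and the dynamic-programming LCS here is a generic textbook routine, not the algorithm of [93].

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def shortest_script_length(a: str, b: str) -> int:
    """Fewest single-character insertions and deletions that turn a into b."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


def upper_bound(a: str, b: str) -> int:
    """Delete every source character, then insert every destination character."""
    return len(a) + len(b)


print(shortest_script_length("xyz", "xc123"))  # 6
print(upper_bound("xyz", "xc123"))             # 8
```

For xyz and xc123 the only common character is x, so the shortest script has 3 + 5 - 2 = 6 operations, while the path in Figure 3.17(B) uses exactly the upper bound of 8.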
It should be highlighted that executing the log compression algorithm does not necessarily add to the running time of an operation-based merging process, because the compression algorithm can be executed progressively during editing. For instance, it can be executed in the background periodically, or whenever some predefined threshold is reached. By contrast, executing text differentiation algorithms adds directly to the running time of a state-based merging process, because they have to be executed at the time of merging in order to derive correct deltas for state-based textual merging algorithms to use.
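Threshold-triggered compression during editing might be organised as in the following sketch. Everything here is an assumption made for illustration: the class ProgressiveLog, its threshold parameter, the (kind, position, text) tuple layout, and in particular drop_undone_inserts, which is a toy stand-in that only removes an insertion immediately undone by a matching deletion. It is not the COMET algorithm.

```python
from typing import Callable, List, Tuple

# Hypothetical encoding of an operation: (kind, position, text).
Op = Tuple[str, int, str]


class ProgressiveLog:
    """Log that compresses itself every `threshold` newly recorded operations."""

    def __init__(self, compress: Callable[[List[Op]], List[Op]], threshold: int):
        self._compress = compress
        self._threshold = threshold
        self._pending = 0        # operations recorded since the last compression
        self.compress_runs = 0   # how many times compression has been run
        self.ops: List[Op] = []

    def record(self, op: Op) -> None:
        self.ops.append(op)
        self._pending += 1
        if self._pending >= self._threshold:
            self.compact()

    def compact(self) -> None:
        """Run the compressor now; also callable once more just before merging."""
        self.ops = self._compress(self.ops)
        self._pending = 0
        self.compress_runs += 1


def drop_undone_inserts(ops: List[Op]) -> List[Op]:
    """Toy compressor: cancel an insertion directly followed by its own deletion."""
    out: List[Op] = []
    for op in ops:
        if out and op[0] == "Del" and out[-1] == ("Ins", op[1], op[2]):
            out.pop()
        else:
            out.append(op)
    return out


log = ProgressiveLog(drop_undone_inserts, threshold=4)
for op in [("Ins", 0, "a"), ("Ins", 1, "b"), ("Del", 1, "b"), ("Ins", 1, "c")]:
    log.record(op)
```

In this sketch the compression work is paid for while the user edits, so only a final compact() call on the few remaining operations is left for merge time.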
Generally speaking, the COMET compression algorithm is able to compress a log more significantly if the operations in the log are more localized, because in this case operations are more likely to be overlapping or adjacent. In practice, if a user writes a chapter for a book, or a component for a software system, it is very likely that the editing operations performed by the user are localized. In this case, compression is essential because a massive number of editing operations may be generated and the log storing these operations can become very large. Furthermore, many operations in the log can be redundant in the sense that they do not contribute to the final state of the document at all. Many others can be partially redundant in the sense that their effects are only partially reflected in the final state of the document. This is because constructing a document from scratch requires a lot of trial and error and exploration of alternatives. As a result, the log can be compressed very significantly by the COMET algorithm in this case.

By contrast, if a user revises a chapter in a book or a component in a software system, it is less likely that the editing operations performed by the user are localized. In this case, compression is less essential because a relatively small number of editing operations will be generated during the revision process. These few operations are scattered across the document, and most of them contribute their effects, fully or partially, to the final document. As a result, the log will be compressed less significantly by the COMET algorithm in this case.
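The effect of locality can be illustrated with a deliberately simplified sketch. The function coalesce_inserts below is a hypothetical toy that merges an insertion into the previous one only when it starts exactly where the previous insertion ended; it handles insertions alone and is not the COMET algorithm. Operations are assumed to be (position, text) pairs with zero-based positions.

```python
from typing import List, Tuple

# Hypothetical encoding of an insertion: (position, text).
Insert = Tuple[int, str]


def coalesce_inserts(ops: List[Insert]) -> List[Insert]:
    """Merge each insertion into the previous one when the two are adjacent."""
    out: List[Insert] = []
    for pos, text in ops:
        if out and pos == out[-1][0] + len(out[-1][1]):
            # The new text starts right where the previous insertion ended.
            out[-1] = (out[-1][0], out[-1][1] + text)
        else:
            out.append((pos, text))
    return out


# Localized editing: typing "hello" one character after another.
localized = [(0, "h"), (1, "e"), (2, "l"), (3, "l"), (4, "o")]

# Scattered editing: five single-character fixes far apart in the document.
scattered = [(0, "a"), (10, "b"), (20, "c"), (30, "d"), (40, "e")]

print(len(coalesce_inserts(localized)))  # 1
print(len(coalesce_inserts(scattered)))  # 5
```

Even this toy rule reduces the five localized insertions to a single operation, while the five scattered insertions are left untouched.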