While the current specification of the ABI precisely defines some aspects of the interface, such as the input and output of functions or which code to use for exception handling, it does not define the required behavior under concurrent execution. For example, which accesses to shared state by a runtime library are allowed or required for a given sequence of calls to the ABI functions, and how does a compiler express language-level requirements in terms of ABI calls? As a result, it is unclear for both compiler and runtime library implementers how to use or implement the ABI correctly.
Looking at the C++ TM specification, the basic problem to be solved is that a runtime library has to pick a TSO (Transaction Synchronization Order) for transactions expressed via the ABI, but the TSO choice must be consistent with the rest of the happens–before order, including the relation between accesses in different transactions. Also, the compiler must express happens–before constraints by using the ABI, and how this can happen needs to be specified.
We want the runtime library to be able to choose TSO because we want to keep the implementation of concurrency control in the library. We also do not want to require the library to choose TSO early (i. e., when starting a transaction) because this would disallow optimistic concurrency control and significantly limit the TM algorithms that can be used (see footnote 3). Thus, a runtime library might have to speculatively execute transactions and abort and restart them when the tentative TSO choice changes.
TM and the C++ as–if rule. However, speculative execution must be harmless and invisible in terms of the allowed behavior of a program. For C++ programs, this requires equality to the behavior of an abstract machine executing the program according to what is summarized as the as–if rule in the C++ standard.
Code that is unsafe according to the TM specification (e. g., accesses to volatile variables, or I/O) often cannot be speculatively executed because it represents actions of the abstract machine and cannot be rolled back or made invisible under the as–if rule (see footnote 4). For such code, the runtime library then has to select a position in TSO pessimistically and let the transaction become irrevocable. Unsafe code is typically not instrumented by the compiler, so it is also hard to isolate it from other code that is running concurrently. Together, these are the reasons for the existence of the serial–irrevocable mode in the ABI.
Implementing the as–if rule correctly is partially specific to the implementation of the environment that is executing the program. For example, it depends on this environment which actions performed by a TM implementation actually count as visible side effects. In our case, it primarily restricts the possibilities for speculative execution: The fewer side effects the environment allows to be contained, the fewer options for speculative execution the TM implementation has, because otherwise side effects would become visible (i. e., according to what the environment defines to be visible) that would not occur when executing a C++ abstract machine.
To illustrate the difficulties associated with a practical definition of what constitutes a visible side effect in an environment, let me discuss one example in more detail: segmentation faults that occur due to misspeculation. Especially in STMs, validating that the memory accesses performed by a transaction represent a consistent, atomic snapshot can be quite costly. It could thus be beneficial to allow the TM runtime library to return speculative results of load operations to transactional code. However, if these results are pointers, dereferencing an inconsistent value (e. g., a null pointer) can lead to a segmentation fault, which is, on the operating systems that I consider, translated into a signal that is delivered to the signal handlers installed by the program. If the program did not install a signal handler, the program will terminate before it has finished execution, which would be incorrect behavior. If it did install a signal handler, then incorrect behavior would likely result as well because arbitrary signal handlers cannot be expected to be aware of speculation inside of the TM implementation and thus would not know how to handle this segmentation fault. Therefore, the TM would have to install its own signal handler and mask segmentation faults that might have occurred due to misspeculation. Furthermore, the TM would have to prevent the application or other libraries from installing a different handler, leading to potential conflicts with these other components. However, even when controlling the userspace signal handling, the segmentation faults would still be visible at the kernel level, for example as part of page fault statistics or perhaps to intrusion detection systems. Such observers are unlikely to know or care about TM misspeculation.
Footnote 3: We also cannot expect that all memory accesses of a transaction are known when starting the transaction because control flow and subsequent memory accesses might be data-dependent. Therefore, a TM cannot choose an optimal TSO early for such transactions.
Footnote 4: The compiler and runtime library can potentially defer the execution of these parts of code to the commit of the respective transaction. However, this only works in some scenarios and might require complex analysis of the transaction by the compiler (e. g., it can work if there is only output but no input and all output actions can be executed atomically).
This example shows that misspeculation is difficult to handle once side effects become visible to parts of the system that are no longer under the control of the TM runtime library or the compiler. Trying to contain the visibility of such side effects (and similar effects like nontermination or excessive resource usage) in C/C++ environments requires solutions [116, 21] that are rather invasive, complex, and tightly coupled with other components in the environment. Whereas this might not matter that much in managed environments (e. g., a Java virtual machine), it does not seem to be beneficial for C/C++ and first-generation TM support because it would make it harder to provide TMs that are practical on a wide variety of systems.
Constraining speculation. Therefore, it seems to be more beneficial to constrain speculation and thus limit the effects of misspeculation, at least in the case of first-generation C/C++ TM implementations. The TM runtime library still selects TSO dynamically at runtime, but with restrictions.
To specify the restrictions on the implementations, we have to look at the code that the compiler creates from the transactional source code. The compiler-generated code consists of TM-pure operations, unsafe operations, and calls to TM ABI functions. TM-pure operations are all code and instructions that are either annotated as transaction-pure or that the compiler can detect to be safe and not in need of transactional protection by the TM (e. g., control flow instructions or arithmetic operations on CPU registers). TM-pure operations can thus be speculatively executed (I will define what this means precisely in Section 4.2.3). Unsafe code is all the code that is neither TM-pure nor supported by the ABI, and is always preceded by an ABI call that requests the TM runtime library to switch to an execution mode that supports unsafe code (i. e., serial–irrevocable mode and pessimistically choosing TSO).
When ignoring unsafe code, an execution of the compiler-generated code for a transaction is thus a sequence of TM-pure operations interleaved with calls to TM ABI functions. Figure 4.5 shows a simplified example execution of the code that a compiler would generate for the transaction shown in Figure 4.4. The sequence always starts and stops with calls to the ABI begin and commit functions, respectively. Transactions can be aborted and restarted only within calls to ABI functions (see Section 7.2.4 for a discussion of the problems caused by aborting within TM-pure operations). On a transaction restart, control flow is modified so that _ITM_beginTransaction returns again, which will reexecute the transaction’s code from the beginning.

    // TM-pure code                        // ABI calls
    ret =                                  _ITM_beginTransaction(...);
    // ret = a_saveLiveVariables | ...;
    // Save stack slots (not shown)
    long l_cntr = (long)                   _ITM_RU8(&cntr);
                                           // Abort and restart.
                                           // _ITM_beginTransaction() returns a second time.
    // ret = a_restoreLiveVariables | ...;
    // Restore stack slots (not shown)
    long l_cntr = (long)                   _ITM_RU8(&cntr);
    l_cntr = l_cntr + 5;
                                           _ITM_WU8(&cntr, l_cntr);
                                           _ITM_commitTransaction();

Figure 4.5: An example execution of the transaction of Figure 4.4 with one abort in the transactional read. TM-pure operations are shown on the left and ABI calls on the right.
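The restart behavior in Figure 4.5 can be mimicked with a small mock, given here only as a sketch under several assumptions. The `mock_tm::Runtime` class, its member functions, and `run_counter_transaction` are hypothetical stand-ins and not the real ABI entry points. The mock models an abort with a C++ exception and a retry loop, whereas an actual runtime library restores a saved context so that the begin function returns a second time. It also buffers at most one write, which suffices for this transaction.

```cpp
#include <cassert>
#include <cstdint>

namespace mock_tm {

// Models the non-local control transfer performed by an abort (assumption:
// a real runtime library restores a saved context instead of throwing).
struct AbortAndRestart {};

// Hypothetical stand-in for a TM runtime library; not the real ABI.
class Runtime {
public:
    explicit Runtime(int aborts_to_inject) : aborts_left_(aborts_to_inject) {}

    // Stand-in for the begin function: counts returns, drops tentative state.
    void begin() { ++begin_returns_; has_write_ = false; }

    // Stand-in for a transactional 8-byte load; aborts happen only in here.
    std::uint64_t read_u8(const std::uint64_t* addr) {
        if (aborts_left_ > 0) { --aborts_left_; throw AbortAndRestart{}; }
        if (has_write_ && write_addr_ == addr) return write_value_;
        return *addr;
    }

    // Stand-in for a transactional 8-byte store; buffered until commit.
    void write_u8(std::uint64_t* addr, std::uint64_t value) {
        write_addr_ = addr; write_value_ = value; has_write_ = true;
    }

    // Stand-in for the commit function: publishes the buffered write.
    void commit() {
        if (has_write_) *write_addr_ = write_value_;
        has_write_ = false;
    }

    int begin_returns() const { return begin_returns_; }

private:
    int aborts_left_;
    int begin_returns_ = 0;
    bool has_write_ = false;
    std::uint64_t* write_addr_ = nullptr;
    std::uint64_t write_value_ = 0;
};

// Re-executes the TM-pure body from the beginning after every abort.
template <typename Body>
void run_transaction(Runtime& rt, Body body) {
    for (;;) {
        try {
            rt.begin();
            body(rt);
            rt.commit();
            return;
        } catch (const AbortAndRestart&) {
            // Tentative state is discarded by the next begin().
        }
    }
}

}  // namespace mock_tm

struct TraceResult { std::uint64_t final_value; int begin_returns; };

// The counter transaction of the figure: read, add 5, write, commit.
TraceResult run_counter_transaction(std::uint64_t initial, int aborts_to_inject) {
    std::uint64_t cntr = initial;
    mock_tm::Runtime rt(aborts_to_inject);
    mock_tm::run_transaction(rt, [&cntr](mock_tm::Runtime& tm) {
        long l_cntr = static_cast<long>(tm.read_u8(&cntr));
        l_cntr = l_cntr + 5;
        tm.write_u8(&cntr, static_cast<std::uint64_t>(l_cntr));
    });
    return TraceResult{cntr, rt.begin_returns()};
}
```

Calling `run_counter_transaction(10, 1)` reproduces the trace with one abort in the first transactional read. Because the TM-pure part only operates on thread-local state, executing it twice is harmless.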
TM-pure operations and the TM runtime library have to work together (via the ABI) to execute a transaction. To reduce coupling, we do not want to require them to be aware of the specifics of the as–if rule on both sides of the ABI. Instead, we want to enable the compiler and TM runtime library to separately reason about as–if on their respective side of the ABI.
Table 4.2 shows the high-level guarantees that enable separate reasoning about as–if by the compiler and the TM runtime library. In particular, these guarantees enable the compiler to reason about which code is TM-pure (or can be made TM-pure and how to implement this), without having to know how the TM runtime library picks TSO. In turn, the TM runtime library can reason about the as–if requirements for memory accesses executed by it without having to consider TM-pure code in detail. These guarantees are conservative choices that restrict speculation, but the gained decoupling makes the potential loss in performance worthwhile. If necessary, a higher level of coupling can always be introduced in a later revision of the ABI (e. g., by providing more information about TM-pure code to the library).
Let us now look at the guarantees in detail. Guarantee L1 in Table 4.2 is essential in that it requires the TM runtime library to stick to a valid TSO during the execution of a transaction. Values returned by the library (e. g., results of transactional loads) are input to TM-pure code, so the TM-pure code will see a valid execution that could have happened even with a sequential execution of all transactions.
Compiler
    C1  TM-pure must be independent of TSO.
    C2  Preserve sequenced–before/happens–before of memory accesses in race-free code.
TM runtime library
    L1  Pick a valid TSO dynamically and only return values consistent with TSO. Change TSO without abort only if change is transparent to TM-pure code.
    L2  TSO and memory accesses must be consistent with happens–before. No race conditions must be introduced.
Table 4.2: High-level guarantees provided by a compiler and a TM runtime library, respectively.

The counterpart to L1 is C1, which requires TM-pure code to be independent of a specific TSO choice. This allows the TM runtime library to change TSO during the execution of a transaction because the TM-pure code’s semantics or safety are guaranteed to not be affected. This is important because otherwise, transactions could not be executed optimistically; they would either have to abort if another transaction commits, or the TM would have to select a TSO a priori and would have to know about the tentative updates of previous transactions, which is not possible for all possible code. However, TSO is only allowed to change if the TM runtime library would have returned the same values from previous operations for the newly chosen TSO (i. e., the change must not be observable by TM-pure code).
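The condition under which TSO may change without an abort can be illustrated with a time-based validation scheme, given here as a minimal single-threaded sketch. The names `VersionedWord`, `Memory`, `Transaction`, and `try_extend` are hypothetical, and the use of a global version clock is an assumption made for illustration; the text does not prescribe this algorithm. A transaction moves to a later position in TSO only if every value it has already returned to TM-pure code would also have been returned at the new position.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A shared word tagged with the commit time of its last update.
struct VersionedWord {
    std::uint64_t value;
    std::uint64_t version;
};

// Shared state plus a global clock that orders committed updates.
struct Memory {
    std::vector<VersionedWord> words;
    std::uint64_t clock = 0;

    explicit Memory(std::size_t n) : words(n, VersionedWord{0, 0}) {}

    // Models another transaction committing a single update.
    void commit_write(std::size_t i, std::uint64_t v) {
        ++clock;
        words[i] = VersionedWord{v, clock};
    }
};

class Transaction {
public:
    explicit Transaction(const Memory& m) : mem_(m), snapshot_(m.clock) {}

    // Returns false if the transaction has to abort and restart.
    bool read(std::size_t i, std::uint64_t& out) {
        const VersionedWord& w = mem_.words[i];
        // The word is newer than the tentative TSO position: try to move later.
        if (w.version > snapshot_ && !try_extend()) return false;
        read_log_.push_back({i, w.version});
        out = w.value;
        return true;
    }

    // Changes the tentative TSO position only if all earlier reads would
    // have returned the same values at the new position.
    bool try_extend() {
        for (const Entry& e : read_log_) {
            if (mem_.words[e.index].version != e.version) return false;
        }
        snapshot_ = mem_.clock;
        return true;
    }

    std::uint64_t snapshot() const { return snapshot_; }

private:
    struct Entry {
        std::size_t index;
        std::uint64_t version;
    };

    const Memory& mem_;
    std::uint64_t snapshot_;
    std::vector<Entry> read_log_;
};
```

If another transaction updates only words that the running transaction has not read yet, `try_extend` succeeds and the change is invisible to TM-pure code. If a word that was already read has been overwritten, the extension fails and `read` reports that the transaction must abort.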
Furthermore, C1 also requires that TM-pure code is race-free if one ignores the TSO contributions to synchronizes–with. While this is obvious for code accessing no shared state (e. g., only accessing the thread’s stack), it requires other code accessing shared state to be properly synchronized and to not conflict with any synchronization internal to the TM runtime library.
The first part of L2 is a straightforward requirement that is also part of the language-level specification (Section 4.1). It restricts which TSO choices are valid. With the current ABI, the added restrictions are relatively strong because the compiler only instruments transactional code, it only communicates happens–before via the order of calls to the ABI functions, and because the TM runtime library is only active during the execution of transactions. This means in turn that the TM runtime library has to assume that nontransactional code before a transaction synchronized with other threads (and thus expanded happens–before with more relations than those resulting from TSO). This also applies to the nontransactional code executed after returning from a transaction’s commit function. Therefore, the TM runtime library has to ensure that all operations before the start of the transaction (including previously committed functions in other threads) are visible to transactional memory accesses and to TM-pure code (i. e., publication safety). Likewise, after returning from a commit function, the TM runtime library’s TSO choice must be final because subsequent nontransactional code could rely on this choice and could communicate it to other threads. If the library knew that nontransactional code was free of synchronization and side effects, it could choose TSO more freely. However, this information is not provided by the current ABI, so the choice has to be conservative.
Finally, together with the second part of L2, C2 ensures that the speculative execution of source code without race conditions is still race-free and consistent with happens–before. This is a joint responsibility of the compiler and the library because the former has to properly communicate the language-level memory accesses to the latter. Both have to ensure that no potential race conditions are introduced (e. g., by requesting or making accesses to data that would not be accessed by the abstract machine). This puts restrictions on the implementations of concurrency control algorithms in the library and the transformations by a compiler (e. g., reordering and prefetching).
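A well-known instance of an access that the abstract machine would not perform is a speculatively introduced store. The following sketch contrasts a conditional store with a transformed version that is equivalent for single-threaded code. `TracedInt` is a hypothetical helper that merely counts accesses, so that the extra accesses become observable without running several threads.

```cpp
#include <cassert>

// Hypothetical helper: an int that counts how often it is read or written.
struct TracedInt {
    int value = 0;
    int reads = 0;
    int writes = 0;

    int load() { ++reads; return value; }
    void store(int v) { ++writes; value = v; }
};

// What the abstract machine does: x is accessed only if cond holds.
void store_if(bool cond, TracedInt& x) {
    if (cond) x.store(1);
}

// Same result in a single thread, but x is read and written even if cond
// is false. If another thread accesses x concurrently in that case, these
// extra accesses introduce a race that the source program does not have.
void store_if_speculative(bool cond, TracedInt& x) {
    int old_value = x.load();
    x.store(1);
    if (!cond) x.store(old_value);
}
```

Both functions leave `x` with the same final value, yet only the first performs exactly the accesses of the source program when `cond` is false. The same reasoning applies to a library that requests or performs accesses on behalf of a transaction.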
I will provide more details about the high-level guarantees in Table 4.2 in what follows, but one can already see that together, they roughly ensure that a TM implementation adheres to the C++ TM specification: Active transactions execute as if in isolation, TSO and individual executions are consistent with happens–before, TM-pure code is not affected by a dynamic selection of TSO at runtime, and race-free code is executed in a race-free manner. Of course, this does not make the speculative execution completely transparent (e. g., because of a potentially higher resource usage than with sequential execution), but both the compiler and the library are allowed to separately make reasonable implementation choices that satisfy the as–if rule.