
Almost all major applications in use today execute some form of dynamically generated code. Web browsers aggressively use the latest just-in-time (JIT) compilation techniques to speed up JavaScript execution. Dynamic instrumentation and dynamic translation of code are also common in trace tools that inject performance analysis code to gather metrics from running applications. However, program flow analysis of such code using hardware traces presents many challenges. To explain the current limitations, we first define the scope of the problem, give some background on code execution, and then move towards an example where it manifests.

For a given target process P, Figure 7.1 illustrates the operating system's view of the process memory. The virtual address space of P holds its content in the form of pages, a number of which lie in executable Virtual Memory Areas (VMAs), typically the .text sections of the process, which contain the executable code of the program and of its shared libraries. This is shown as $VMA_1$, which contains executable file-backed pages that we name the code section $CS_p$.

Figure 7.1 Runtime and file-backed code section for a process P as observed by the OS (labels: process P with Text, Data, BSS and Heap regions; $VMA_0$ holding anonymous pages and $VMA_1$ holding file-backed pages, each delimited by vm_start and vm_end; code sections $CS_r$ and $CS_p$)

Consider that P now generates dynamically compiled code. Typically, the memory for such code is dynamically allocated on the heap, and the code is copied to the assigned pages, which are then marked as executable. Unlike the pages in $VMA_1$, these pages in $VMA_0$ are anonymous and have no backing file. We name these pages the runtime code section $CS_r$. At runtime, some of these pages may need to be modified and revised. As seen in Figure 7.2, at execution, each revision or each new dynamically compiled section can be considered a single segment $CS_{r_n}$, where n is the number of times a new section has been added or a previous section has been revised and rewritten at runtime. This behavior is common to all dynamically executed code, both in userspace and in the kernel. As an example, for a userspace JIT-compiled network packet filter based on eBPF, $CS_r$ may represent a single page worth of dynamically compiled filter code that may be modified repeatedly at runtime, based on policy requirements. We elaborate on this in Section 7.5. We can now define the process control flow function

$F(P)$ as,

$$F(P) = F(CS_p) \cup \sum_{i=1}^{n} F(CS_{r_i})$$

where $F$ denotes the instruction flow of a given code section and $\sum$ signifies the union of the individual flows of $CS_{r_i}$. However, a software-only approach to generating the flow $F(P)$ would also involve extra code sections before each $CS_r$ and add additional instructions to the critical execution path. As discussed, in the case of JIT-compiled code, this is currently achieved only by using JIT-compiler-specific functions or language-dependent APIs [65, 68].
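The two kinds of executable pages that contribute to $F(P)$, file-backed $CS_p$ and anonymous $CS_r$, can be told apart from userspace on Linux by reading the process memory map. The following is a minimal sketch, assuming the `/proc/<pid>/maps` line layout (address range, permissions, offset, device, inode, pathname). The function name `classify_executable_vmas` and the sample addresses are hypothetical and chosen only for illustration.

```python
def classify_executable_vmas(maps_text):
    """Split executable VMAs into file-backed and anonymous ones.

    `maps_text` is the content of a /proc/<pid>/maps file (assumed layout:
    address perms offset dev inode pathname).
    """
    file_backed, anonymous = [], []
    for line in maps_text.splitlines():
        fields = line.split(maxsplit=5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        addr_range, perms, _offset, _dev, inode = fields[:5]
        path = fields[5].strip() if len(fields) > 5 else ""
        if "x" not in perms:
            continue  # only executable mappings can hold code sections
        if path.startswith("["):
            continue  # kernel-provided pseudo mappings such as [vdso]
        start, end = (int(a, 16) for a in addr_range.split("-"))
        if inode != "0" and path:
            file_backed.append((start, end, path))  # candidate for CS_p
        else:
            anonymous.append((start, end))          # candidate for CS_r
    return file_backed, anonymous


# Hypothetical sample input; real addresses and paths will differ.
SAMPLE_MAPS = """\
55d4c8a00000-55d4c8a21000 r-xp 00002000 08:01 1311234 /usr/bin/python3
55d4c8c00000-55d4c8c10000 rw-p 00000000 00:00 0 [heap]
7f3a1c000000-7f3a1c021000 rwxp 00000000 00:00 0
7ffd3a5e2000-7ffd3a5e4000 r-xp 00000000 00:00 0 [vdso]
"""

file_backed, anonymous = classify_executable_vmas(SAMPLE_MAPS)
```

On a Linux system, the same function could be applied to the text read from `/proc/self/maps`. A decoder that only has the on-disk binaries can account for the first list but not the second.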

Figure 7.2 Corresponding hardware pages (labels: process code $CS_p$ and runtime code pages $CS_{r_1}$, $CS_{r_2}$, ..., $CS_{r_n}$ along the code execution flow; hardware trace $T_p$, $T_{r_1}$, ..., $T_{r_n}$)

thus generating true execution profiles at very low overhead. We have discussed this in detail in our previous work [60]. Therefore, for each branch encountered in $CS_p$ and $CS_r$, the processor generates encoded trace packets representing the decision on a branch (taken or not taken), along with the instruction pointer (IP) if required. We represent these trace packets symbolically for $CS_p$ and $CS_r$ as $T_p$ and $T_r$. For branch traces, decoding this encoded trace requires the static binaries of the running process to be available on disk, as the pages belonging to this VMA are file-backed. Therefore, when traced with hardware, the process code section control flow $F(CS_p)$ can now be derived as,

$$F(CS_p) = \Pi(CS_p, T_p)$$

where $\Pi$ is a map-and-merge function that takes the statically available process code segment ($CS_p$) and the corresponding hardware trace packets ($T_p$) as input and generates the flow as output. However, for a dynamically generated section $CS_{r_n}$, it is not possible to faithfully obtain $F(CS_{r_n})$, as the packets $T_{r_n}$ do not map to any available code segment, since they belong to a VMA that is anonymous memory. For example, JIT compilers cause in-memory execution of short sections of dynamically generated code, which hardware trace decoders fail to account for because they expect static binaries while decoding. As discussed in the previous section, one solution to the non-availability of $CS_r$ sections is to use JIT- or language-specific APIs that periodically dump runtime-compiled code when it is generated and executed. However, this may require recompilation of the JIT-supported runtime, which may not be possible when diagnosing production systems that do not allow code modifications. Moreover, this also adds undesired API code to the critical flow path, which we eventually observe in $F(P)$. We observed this problem in many locations in the Linux kernel, where modifying code for optimization is a fairly common occurrence in the trace, network packet filter, and security subsystems. The problem is more acute in userspace, where multiple languages may be using JIT compilation and APIs to dump and analyze JIT code may not always be available. This motivated us to approach the limitation of flow reconstruction in state-of-the-art hardware trace systems from a different perspective. We therefore propose a kernel-assisted technique that monitors and keeps track of executable code memory to record $CS_{r_n}$ sections transparently, in order to generate accurate program flow. Therefore, to get the flow of a given dynamic code section $CS_{r_n}$, we can define $F(CS_{r_n})$ as,

$$F(CS_{r_n}) = \Gamma(CS_{r_n}, \mathrm{vma}(CS_{r_n}), \mathrm{ts}(CS_{r_n}), T_{r_n})$$

The function $\Gamma$ takes as input the code section ($CS_{r_n}$), the address of the VMA to which this $CS_{r_n}$ section belongs ($\mathrm{vma}(CS_{r_n})$), the timestamp of the revision of this section ($\mathrm{ts}(CS_{r_n})$), and the associated hardware trace for the section ($T_{r_n}$). With our FlowJIT technique, we store the timestamp, address, and content of each new dynamic code section revision, which then allows reconstruction of the program flow from the hardware trace, something not otherwise possible.
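To make the roles of $\Pi$ and $\Gamma$ concrete, the following toy model sketches one possible reading of them. It is not the FlowJIT implementation: the instruction encoding, the `RevisionStore` class, and the function names `pi` and `gamma` are all assumptions made for illustration. Here `pi` walks a known code section and consumes one taken/not-taken bit per conditional branch, while `gamma` first selects the code revision that was live in the given VMA at the time of the trace and then applies `pi` to it.

```python
from bisect import bisect_right
from dataclasses import dataclass, field

# Toy instruction encoding (assumed for illustration only):
#   ("op", name)          straight-line instruction
#   ("br", name, target)  conditional branch, one trace bit per execution
#   ("ret",)              end of the code section


def pi(code, trace_bits):
    """Map-and-merge: replay `code`, steering branches with `trace_bits`."""
    flow, pc, bits = [], 0, iter(trace_bits)
    while code[pc][0] != "ret":
        insn = code[pc]
        flow.append(insn[1])
        if insn[0] == "br" and next(bits):
            pc = insn[2]  # branch taken: continue at the target offset
        else:
            pc += 1       # not a branch, or branch not taken: fall through
    return flow


@dataclass
class RevisionStore:
    """Per-VMA history of (timestamp, code) snapshots, sorted by timestamp."""
    revisions: dict = field(default_factory=dict)

    def record(self, vma_start, ts, code):
        history = self.revisions.setdefault(vma_start, [])
        history.append((ts, code))
        history.sort(key=lambda rev: rev[0])

    def lookup(self, vma_start, ts):
        """Return the latest snapshot recorded at or before `ts`, else None."""
        history = self.revisions.get(vma_start, [])
        idx = bisect_right([t for t, _ in history], ts)
        return history[idx - 1][1] if idx else None


def gamma(store, vma_start, trace_ts, trace_bits):
    """Decode a trace against the code revision live at `trace_ts`."""
    code = store.lookup(vma_start, trace_ts)
    if code is None:
        raise LookupError("no recorded code revision covers this trace")
    return pi(code, trace_bits)


# Two revisions of the same hypothetical filter page in one anonymous VMA.
VMA = 0x7F3A1C000000
rev1 = [("op", "load"), ("br", "cmp", 3), ("op", "drop"), ("ret",)]
rev2 = [("op", "load"), ("br", "cmp", 3), ("op", "accept"), ("ret",)]

store = RevisionStore()
store.record(VMA, ts=100, code=rev1)
store.record(VMA, ts=200, code=rev2)

flow_old = gamma(store, VMA, trace_ts=150, trace_bits=[0])
flow_new = gamma(store, VMA, trace_ts=250, trace_bits=[0])
```

The same trace bits decode to different flows depending on which revision was live, which is why the sketch keys each snapshot by VMA address and timestamp. Without the stored revisions, `lookup` has nothing to return and the trace cannot be decoded, mirroring the situation of a decoder that only has static binaries.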
