This is the part of the toolset responsible for sending batches of requests. It has to satisfy both functional and performance requirements. Functionally, it must be able to send any number of requests, with parallel and sequential duplication and with the requested delay. The performance requirement is that it sends parallel requests as fast as possible.
The main functionality of the component is sending parallel requests as fast as possible but, to the best of our knowledge, few examples are available on how to approach this in Python. There are plenty of documented ways to host high-performance servers, but not high-performance clients. This situation was to be expected, as we imagine that only some specialised Python-based tools, such as password brute-forcers, have similar performance requirements. Therefore, this was an involved trial-and-error process. We can discern four stages in which the performance of the batch sender was improved.
1. Multi-threading - Initially, it seemed best to use the popular Python HTTP library called 'Requests' and spawn multiple worker threads that all send requests in parallel. However, it turned out that using multiple threads would not be an improvement after all, because the Python documentation shows that the Global Interpreter Lock (GIL) does not allow multiple threads to access Python objects at once (Wouters, 2017). The GIL is a mutex (MUTual EXclusive access) that prevents multiple threads from executing Python bytecode at once and is required because the CPython memory management is not thread-safe. The documentation mentions that long-running I/O operations could happen outside the GIL, but the functions in the Requests library as a whole will still run sequentially. Therefore, this option was abandoned.
2. Multi-processing - To avoid the GIL threading issues, an alternative was sought in multiple Python worker processes. These processes do not share ordinary variables and can therefore execute in parallel. A bi-directional, process-safe queue was used to pass each request to a process and to return the result. This worked, but unfortunately it also greatly hampered debugging. Apparently, the Python debugger was not designed to work with multiple processes: when the debugger encountered one exiting process, it assumed the whole program had shut down and exited as well. The developer can avoid this by not debugging this part of the program.
At this point, using a network protocol analyser called Wireshark (Wireshark, 2019), the raw performance of the tool was compared to that of Sakurity Racer when sending a single request in parallel. The raw performance was measured as the average time difference between two consecutive requests. For this test, the account registration request to the OWASP WebGoat from the first example in section 1.1.2 was used. The results are shown in figure 5.4 and figure 5.5. As clearly shown, the time differences between requests made by Sakurity Racer are about 100 times smaller than those between requests made by the CompuRacer. Because of this much slower performance, the CompuRacer was also not able to trigger the account-creation race condition. Therefore, this option was abandoned as well.
Figure 5.4: The figure shows multiple parallel requests captured with Wireshark which were made by the Sakurity Racer tool. The time-column shows the difference in seconds at microsecond resolution between two subsequent HTTP requests.
Figure 5.5: The figure shows multiple parallel requests captured with Wireshark which were made by the CompuRacer toolset when using multiple processes and the 'Requests' library for sending. The time-column shows the difference in seconds at microsecond resolution between two subsequent HTTP requests.
3. Asynchronous - The next option that was considered is sending the requests using an asynchronous library called 'aiohttp' [ref] (instead of the very bulky 'Requests' library). Since version 3.4, Python supports both synchronous and asynchronous (event-based) programming. This method does not execute code purely sequentially or in parallel but uses a hybrid of the two options.
An event-loop is used to sequentially queue and process actions, but all actions that cannot be completed immediately, like network requests, are started as non-blocking operations and left to the operating system to carry out in the background. The event-loop then continues to process the next action while the delegated action is in progress. When the delegated action is completed, the callback with the result is added back to the event-loop to be processed sequentially. We used a faster replacement for the built-in 'asyncio' event-loop called 'uvloop' [ref], which can achieve a 2 to 3 times speedup.
Figure 5.6: The figure shows the time difference in seconds at microsecond resolution between two HTTP requests for the CompuRacer toolset when using the asynchronous 'aiohttp' and 'uvloop' libraries for sending.
Using this new method, the same test as described above was executed again to evaluate the improved performance. The results are shown in figure 5.6. This is a major speedup of about 800 times compared with the multi-processing approach, which results in a performance that is about eight times better than that of Sakurity Racer.
4. Last-byte-synchronisation - The last improvement to the tool was made using a method called last-byte-synchronisation. We altered the source code of 'aiohttp' so that it can start sending the packets of all requests in parallel while holding back the last byte of each body. This last byte is then sent at a specific point in time for all parallel requests. A limitation of this method is that requests without a body cannot be synchronised. Because a request without a body is only expected to be split over multiple TCP/IP packets when its headers are very large, this seems to be a minor issue.
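The idea behind last-byte-synchronisation can be illustrated with a minimal sketch. It is not the modified 'aiohttp' code described above: it uses plain 'asyncio' streams from the standard library, and the helper names (build_post, send_last_byte_synced, demo) as well as the toy local echo server are assumptions made purely for illustration. All connections first send everything except their final byte; only when every connection has done so are the final bytes written back to back on the single event-loop.

```python
import asyncio


def build_post(host: str, path: str, body: bytes) -> bytes:
    """Build a minimal HTTP/1.1 POST request (hypothetical helper)."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n\r\n"
    )
    return head.encode("ascii") + body


async def open_and_send_head(host: str, port: int, payload: bytes):
    """Open a connection and send everything except the final byte."""
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(payload[:-1])
    await writer.drain()
    return reader, writer


async def read_response(reader, writer) -> bytes:
    """Read until the server closes the connection, then clean up."""
    data = await reader.read()
    writer.close()
    await writer.wait_closed()
    return data


async def send_last_byte_synced(host: str, port: int, payloads):
    # Phase 1: every connection sends all but its last byte.
    conns = await asyncio.gather(
        *(open_and_send_head(host, port, p) for p in payloads)
    )
    # Phase 2: release the final bytes in a tight loop without awaiting,
    # so nothing else runs on the event-loop in between.
    for (_, writer), payload in zip(conns, payloads):
        writer.write(payload[-1:])
    # Phase 3: only now collect the responses.
    return await asyncio.gather(*(read_response(r, w) for r, w in conns))


async def _echo_handler(reader, writer):
    """Toy local server (assumption): echoes the body of one request."""
    head = await reader.readuntil(b"\r\n\r\n")
    length = 0
    for line in head.split(b"\r\n"):
        if line.lower().startswith(b"content-length:"):
            length = int(line.split(b":", 1)[1].strip())
    body = await reader.readexactly(length)  # blocks until the last byte
    writer.write(b"HTTP/1.1 200 OK\r\nConnection: close\r\n\r\n" + body)
    await writer.drain()
    writer.close()


async def demo(copies: int = 5):
    """Run the sketch against the toy server on an ephemeral local port."""
    server = await asyncio.start_server(_echo_handler, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    payloads = [
        build_post("127.0.0.1", "/register", b"user=alice")
        for _ in range(copies)
    ]
    try:
        return await send_last_byte_synced("127.0.0.1", port, payloads)
    finally:
        server.close()
```

Calling asyncio.run(demo()) returns one raw response per copy. Against a real target, the server can only start processing a request once its final byte has arrived, which is why the spread of the final bytes, rather than of the full requests, determines how closely the requests line up.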
Besides these clear performance stages, some other performance measures can also be discerned. First, all request objects are created before sending starts, and the processing of responses is postponed until all responses have been received. Both measures are meant to keep any other processing from interfering with the sending itself.
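A minimal sketch of these two measures is given below, assuming a hypothetical PreparedRequest type and an injected send coroutine that returns raw response bytes; neither is taken from the actual toolset. The point is only the separation into three phases, so that the sending phase contains nothing but sending.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class PreparedRequest:
    """Hypothetical pre-built request; all fields are fixed before sending."""
    method: str
    url: str
    body: bytes


def prepare_all(specs):
    """Phase 1: create every request object before anything is sent."""
    return [PreparedRequest(m, u, b.encode("utf-8")) for m, u, b in specs]


async def send_batch(prepared, send):
    """Phase 2: the hot path only schedules and awaits the sends."""
    return await asyncio.gather(*(send(p) for p in prepared))


def process_all(raw_responses):
    """Phase 3: parse responses only after all of them have arrived."""
    return [
        {"status": int(raw.split(b" ", 2)[1]), "size": len(raw)}
        for raw in raw_responses
    ]


async def fake_send(request: PreparedRequest) -> bytes:
    """Stand-in for a real network send (assumption for illustration)."""
    await asyncio.sleep(0)
    return b"HTTP/1.1 200 OK\r\n\r\n" + request.body


async def run(specs, send=fake_send):
    prepared = prepare_all(specs)
    raw = await send_batch(prepared, send)
    return process_all(raw)
```

With a real sender in place of fake_send, the same structure keeps object construction and response parsing out of the time window in which the parallel requests leave the machine.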
Second, when the goal is, for instance, to trigger a race condition between the login and the shopping functionality, we might decide to send several copies of two different requests (like a login and a 'put product x in my shopping basket') in parallel. However, the tool originally created all asynchronous sending tasks for the first request and only then the tasks for the other. As a result, the time difference between the first copy of the first request and the first (or last) copy of the second request would often be much higher than the optimal performance of the tool promises. In our case, a significant delay was often observed at around 25 or more duplicated requests per type when using two or more different types. To address this, the order in which the asynchronous sending tasks for the different request types are created is now randomised, which seemed to solve the issue.
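The randomised task creation can be sketched as follows. The function names and the injected send coroutine are assumptions for illustration, not the toolset's actual interface. Instead of creating all tasks for one request type and then all tasks for the next, the sketch builds one list with every copy of every type, shuffles it, and creates the tasks in that order.

```python
import asyncio
import random
from collections import Counter


def shuffled_order(type_names, copies, rng=None):
    """Return every (type, copy) slot in random order."""
    rng = rng or random.Random()
    order = [name for name in type_names for _ in range(copies)]
    rng.shuffle(order)
    return order


async def send_interleaved(requests, copies, send, rng=None):
    """Create sending tasks in shuffled order so that no request type
    is systematically scheduled after all copies of another type."""
    order = shuffled_order(list(requests), copies, rng)
    tasks = [
        asyncio.create_task(send(name, requests[name])) for name in order
    ]
    results = await asyncio.gather(*tasks)
    return order, results


async def demo(copies: int = 25):
    started = []

    async def fake_send(name, payload):
        """Stand-in sender (assumption) that records the start order."""
        started.append(name)
        await asyncio.sleep(0)
        return name

    requests = {"login": b"user=alice", "add_to_basket": b"product=x"}
    order, results = await send_interleaved(requests, copies, fake_send)
    return order, results, started
```

Shuffling does not make any individual task faster; it only spreads the scheduling delay evenly over the request types, so that on average copies of different types end up close to each other in time.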