[Figure 5.5 shows the field layout of three structures, with each field annotated with its size in bytes. Hash table entry: Active, Segment hash, SSD flag, DRAM packet bitmap, DRAM queue index, SSD queue index. DRAM segment map: Req. cnt., Prev index, Next index, HT entry index, Chunk 1 index, ..., Chunk N index. SSD segment map: Prev index, Next index, HT entry index.]
Figure 5.5: H2C data structures
H2C uses a number of data structures to manage the two-layer cache (see Fig. 5.5). Each processing core has a dedicated instance of each data structure. This means that no data structures are concurrently accessed by cores, with the exception, of course, of the lockless rings used to pass information between Network I/O cores and processing cores and between processing cores and SSD I/O cores.
The main data structure is a hash table that indexes all the segments stored in the portion of the cache (DRAM+SSD) assigned to the corresponding core. It is implemented as an open-addressed hash table similar to the one used in [138]. The goal of this hash table design is to minimise access latency. Therefore, it has been designed to enable the execution of a lookup or insertion with at most a single DRAM memory access. This is achieved by dimensioning buckets to a cache line (i.e., 64 B in our architecture) and storing up to four entries in a single bucket. In this way, even if the bucket is not in the CPU cache, it can be retrieved with a single memory read, which copies the entire bucket into the CPU L1 cache. At this stage, even in case of collisions (i.e., multiple entries in the same bucket), the desired entry can be located just by iterating over the bucket, which is in L1 cache and can be accessed very quickly. More than four entries may still collide, in which case they cannot all fit in a bucket. This case can be managed using chaining, but it is expected to be rare if the hash table is well dimensioned. Each entry of the hash table comprises the following six fields, for a total of 16 B.
• An Active field, indicating whether the entry is valid or not. This field is used for fast deletion (i.e., to remove an entry from the hash table we set this byte to 0 rather than deleting the whole entry).
• A segment hash field, which is a 32-bit hash value of the segment identifier.
• An SSD flag, indicating whether the entire segment is stored in the SSD or not.
• A DRAM packet bitmap, indicating which packets of the segment are stored in the DRAM.
• Finally, a DRAM segment map pointer and an SSD segment map pointer, which point to the associated DRAM and SSD segment maps respectively.
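The entry layout and the single-bucket lookup described above can be sketched as follows. This is a minimal illustration, not the H2C implementation: the per-field widths, the modulo bucket selection and the function names (`new_table`, `insert`, `lookup`, `delete`) are assumptions, chosen only so that an entry occupies 16 B and four entries fill a 64 B bucket. Chaining for overflowing buckets is omitted.

```python
import struct

# Assumed field widths (hypothetical, chosen so that one entry totals 16 B):
# active (1 B), segment hash (4 B), SSD flag (2 B), DRAM packet bitmap (1 B),
# DRAM segment map index (4 B), SSD segment map index (4 B).
ENTRY = struct.Struct("<BIHBII")
ENTRIES_PER_BUCKET = 4
BUCKET_SIZE = ENTRY.size * ENTRIES_PER_BUCKET  # 64 B, i.e. one cache line


def new_table(num_buckets):
    """Allocate a zeroed table; a zero Active byte marks an empty entry."""
    return bytearray(num_buckets * BUCKET_SIZE)


def bucket_offset(table, seg_hash):
    """Byte offset of the bucket for this hash (modulo selection is assumed)."""
    return (seg_hash % (len(table) // BUCKET_SIZE)) * BUCKET_SIZE


def insert(table, seg_hash, ssd_flag, bitmap, dram_idx, ssd_idx):
    """Store the entry in the first inactive slot of its bucket."""
    base = bucket_offset(table, seg_hash)
    for slot in range(ENTRIES_PER_BUCKET):
        off = base + slot * ENTRY.size
        if table[off] == 0:  # the first byte of an entry is the Active field
            ENTRY.pack_into(table, off, 1, seg_hash, ssd_flag, bitmap,
                            dram_idx, ssd_idx)
            return True
    return False  # bucket full; the chaining fallback is not sketched here


def lookup(table, seg_hash):
    """Scan the (at most four) entries of a single bucket."""
    base = bucket_offset(table, seg_hash)
    for slot in range(ENTRIES_PER_BUCKET):
        off = base + slot * ENTRY.size
        active, h, ssd_flag, bitmap, dram_idx, ssd_idx = ENTRY.unpack_from(table, off)
        if active and h == seg_hash:
            return off, ssd_flag, bitmap, dram_idx, ssd_idx
    return None


def delete(table, seg_hash):
    """Fast deletion: clear only the Active byte, leave the rest in place."""
    found = lookup(table, seg_hash)
    if found is None:
        return False
    table[found[0]] = 0
    return True
```

Both `lookup` and `insert` touch only the 64 B of one bucket, which is what allows the whole operation to be served by a single cache-line fetch.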
Every hash table entry points to a DRAM segment map and/or an SSD segment map, depending on where the packets of the segment are cached.
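As an illustration of how the SSD flag and the DRAM packet bitmap of an entry can be combined to decide where a requested packet resides, consider the sketch below. The function name `locate_chunk`, the bit ordering of the bitmap and the choice to prefer DRAM over SSD when both hold the packet are assumptions made for this example.

```python
def locate_chunk(ssd_flag, dram_bitmap, chunk_idx):
    """Return where chunk `chunk_idx` of a segment is cached.

    Assumes bit i of `dram_bitmap` is set when chunk i is in DRAM, and that
    DRAM is preferred when the chunk is available in both layers.
    """
    if dram_bitmap & (1 << chunk_idx):
        return "DRAM"  # served through the DRAM segment map
    if ssd_flag:
        return "SSD"   # the entire segment is stored in SSD
    return "MISS"


# Chunk 2 is in DRAM; chunk 3 is only available from SSD.
print(locate_chunk(1, 0b00000100, 2))
print(locate_chunk(1, 0b00000100, 3))
```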
The DRAM segment map contains primarily an array of N elements (where N is the number of packets in a segment) pointing to the buffers storing the packets belonging to a given segment. The DRAM segment maps of the various segments are organised as a linked list and ordered according to the LRU policy. We call this list the DRAM replacement queue. Each DRAM segment map contains the following fields.
• A Request counter field, counting how many times, R, an item has been hit since insertion in DRAM. This is used to enable the probationary insertion mechanism described in Sec. 5.2.2.
• Previous and Next fields, storing the indices of the DRAM segment maps preceding and following this element in the LRU queue.
• A Hash table pointer pointing back to the hash table entry associated with this segment. This pointer is needed to update the hash table entry upon eviction of a segment from DRAM.
• An array of pointers to all the buffers storing the chunks belonging to a segment.
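The DRAM replacement queue can be sketched as an index-linked LRU list over preallocated arrays, mirroring the fields listed above. This is an illustrative sketch only: the class and method names, the `NIL` sentinel and the saturation of the request counter at 255 (assuming a one-byte counter) are assumptions, and the probationary insertion logic of Sec. 5.2.2 is not reproduced.

```python
NIL = 0xFFFFFFFF  # assumed sentinel meaning "no index"


class DramReplacementQueue:
    """LRU list of DRAM segment maps, linked by 32-bit-style indices."""

    def __init__(self, capacity, chunks_per_segment):
        self.prev = [NIL] * capacity
        self.next = [NIL] * capacity
        self.req_cnt = [0] * capacity
        self.ht_entry = [NIL] * capacity  # back-pointer to the hash table entry
        self.chunks = [[NIL] * chunks_per_segment for _ in range(capacity)]
        self.head = NIL  # most recently used
        self.tail = NIL  # least recently used

    def _unlink(self, i):
        p, n = self.prev[i], self.next[i]
        if p != NIL:
            self.next[p] = n
        else:
            self.head = n
        if n != NIL:
            self.prev[n] = p
        else:
            self.tail = p
        self.prev[i] = self.next[i] = NIL

    def _push_head(self, i):
        self.prev[i] = NIL
        self.next[i] = self.head
        if self.head != NIL:
            self.prev[self.head] = i
        else:
            self.tail = i
        self.head = i

    def insert(self, i, ht_entry_idx):
        """Insert segment map `i` as the most recently used element."""
        self.req_cnt[i] = 0
        self.ht_entry[i] = ht_entry_idx
        self._push_head(i)

    def hit(self, i):
        """Record a hit, move the element to the head, return the counter R."""
        self.req_cnt[i] = min(self.req_cnt[i] + 1, 255)
        self._unlink(i)
        self._push_head(i)
        return self.req_cnt[i]

    def evict(self):
        """Remove the LRU element and return (map index, hash table entry index)."""
        i = self.tail
        if i == NIL:
            return None
        self._unlink(i)
        return i, self.ht_entry[i]
```

Returning the hash table entry index from `evict` reflects the role of the back-pointer: the caller needs it to update the entry of the evicted segment.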
The SSD segment map only contains pointers to the previous and next elements as well as a pointer to the associated hash table entry. The SSD segment maps of the various segments are organised as a linked list and ordered according to the LRU policy. We call this list the SSD replacement queue. It should be noted that, while a DRAM segment map stores the address of each chunk, the SSD segment map does not store the address of the related segment. This is because, in DRAM, chunks belonging to a segment are not necessarily stored in contiguous locations. In SSD, all chunks of a segment are stored contiguously and their position in SSD reflects the position of their associated SSD segment map in the SSD replacement queue. Therefore, the SSD segment map pointer stored in the hash table entry is sufficient to locate the segment in SSD.
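One way to read the last point is that the SSD address of a segment can be computed directly from the index of its SSD segment map. The sketch below assumes exactly that: segments are laid out back to back on the SSD, so the byte offset is the map index multiplied by the segment size. The constants (4 KB chunks, N = 8) and the function names are illustrative assumptions.

```python
CHUNK_SIZE = 4096        # assumed chunk size in bytes
CHUNKS_PER_SEGMENT = 8   # assumed N
SEGMENT_SIZE = CHUNK_SIZE * CHUNKS_PER_SEGMENT


def ssd_segment_offset(ssd_map_idx):
    """Byte offset on the SSD of the segment owned by this SSD segment map."""
    return ssd_map_idx * SEGMENT_SIZE


def ssd_chunk_offset(ssd_map_idx, chunk_idx):
    """Byte offset of one chunk; chunks of a segment are stored contiguously."""
    return ssd_segment_offset(ssd_map_idx) + chunk_idx * CHUNK_SIZE
```

Under this assumption no per-chunk pointer is needed for SSD, whereas DRAM requires one pointer per chunk because its buffers are scattered.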
To manage free segments in SSD, each processing core uses a FIFO queue of free segment locations, which we named the SSD segment pool. When a processing core asks an SSD I/O core to insert a segment into SSD, the segment is written at the location indicated at the head of the queue. When a segment is removed from SSD, its location is appended to the tail of the queue. It is important to note that, to avoid concurrent access to the main hash table by SSD I/O cores and processing cores, all state about SSD is maintained by processing cores, which provide SSD I/O cores with all the memory addresses required to perform the requested I/O operations.
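The SSD segment pool behaves as a plain FIFO of free locations. A minimal sketch, assuming locations are represented as integer segment indices and using hypothetical names (`SsdSegmentPool`, `allocate`, `release`):

```python
from collections import deque


class SsdSegmentPool:
    """FIFO queue of free SSD segment locations, owned by one processing core."""

    def __init__(self, num_segments):
        self.free = deque(range(num_segments))

    def allocate(self):
        """Take the location at the head of the queue, or None if SSD is full."""
        return self.free.popleft() if self.free else None

    def release(self, location):
        """Append the location of a removed segment to the tail of the queue."""
        self.free.append(location)


pool = SsdSegmentPool(3)
first = pool.allocate()   # location 0
pool.release(first)       # location 0 goes to the tail, behind 1 and 2
```

Because only the owning processing core touches the pool, no synchronisation is shown here.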
Finally, all data packets in DRAM are stored in buffers taken from a lock-free pool of pre-allocated packet buffers, named the DRAM packet pool. Whenever a packet is retrieved from the NIC or the SSD, it is stored in DRAM using a free buffer of the packet pool. Conversely, when a packet is evicted from DRAM, the corresponding buffer is returned to the packet pool. It should be noted that the memory used by the DRAM packet pool is entirely allocated at startup time and that the allocation and release of packet buffers are managed in userspace. This is done to avoid the overhead of per-packet calls to malloc and free.
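The idea of a pre-allocated packet pool can be sketched as one large buffer allocated once, plus a free list of buffer indices. This single-threaded sketch does not reproduce the lock-free property of the actual pool, and the names (`DramPacketPool`, `get`, `put`, `view`) are hypothetical.

```python
class DramPacketPool:
    """Fixed-size packet buffers carved out of memory allocated at startup."""

    def __init__(self, num_buffers, buf_size):
        self.buf_size = buf_size
        self.memory = bytearray(num_buffers * buf_size)  # allocated once
        # Free list of buffer indices; the lowest index is handed out first.
        self.free = list(range(num_buffers - 1, -1, -1))

    def get(self):
        """Return the index of a free buffer, or None if the pool is exhausted."""
        return self.free.pop() if self.free else None

    def put(self, idx):
        """Return a buffer to the pool when its packet is evicted from DRAM."""
        self.free.append(idx)

    def view(self, idx):
        """Writable view of one buffer; no per-packet allocation takes place."""
        start = idx * self.buf_size
        return memoryview(self.memory)[start:start + self.buf_size]
```

After construction, `get` and `put` only move indices around, which is the property the text relies on to avoid per-packet allocation cost.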
The design of the control data structures of H2C reduces the number of DRAM accesses needed to perform an H2C lookup by exploiting the fact that chunks belonging to the same content are likely to be requested sequentially. Since looking up chunks belonging to the same segment requires access to the same hash table bucket and the same DRAM/SSD segment map, when chunks are looked up sequentially and close in time to each other, the related memory entries are fetched from DRAM only when the first chunk of the segment is looked up. When subsequent chunks of the segment are looked up, there is a high probability that these entries will be found in the CPU L1 cache.
In conclusion, we note that in the design of the data structures we focused primarily on reducing processing overhead, given the objective of line-speed operation, sometimes at the cost of memory efficiency. Nevertheless, the memory footprint of the data structures used to manage H2C is still modest, as we show in Tab. 5.2.
Table 5.2: Memory overhead of H2C control data structures, assuming chunk size of 4 KB and N = 8
            DRAM
SSD         16 GB     32 GB     64 GB
128 GB      146 MB    169 MB    214 MB
256 GB      270 MB    293 MB    338 MB
512 GB      518 MB    541 MB    586 MB