ters 3 through 5 were only applied to that workflow. Applying the same approach to other workflows would further test the usefulness of the methodology.
• No filesystem parameters were varied in any of the testing, since users are generally not permitted, or not able, to change these settings (as described in Chapter 2). However, these settings could potentially have a large impact on the read performance of applications.
7.2 Related work
In Chapter 3, the importance of sequential read patterns was demonstrated. There are at least three techniques which can be used to improve the poor performance of non-sequential reads beyond the investigation in Chapter 4. They consist of:
• read prediction or prefetching (within the filesystem),
• preloading data (within layers above the filesystem, e.g. middleware or extra hardware),
• and data layout.
The lower IOPS seen with random or striding reads could be mitigated by accurate read prediction or prefetching in the filesystem. In simple terms, this entails using an algorithm to predict the future reads of an application. He et al. (2013) demonstrate that a pattern prediction algorithm (implemented in a virtual filesystem) can reduce I/O latency by up to a factor of three, and is much more effective than Linux readahead. With better prediction of reads, the difference between the sequential reads and striding reads in Chapter 3 would be reduced. Jiang et al. (2013) also used a prefetching method and showed some speedup, and Byna et al. (2008) showed an improved read rate when using a layer they developed for the MPI-IO library. All these cases show promise for implementing some form of prefetching on the HPC systems used in atmospheric science (e.g. JASMIN). However, the speedups demonstrated in these studies would not be enough for the smaller striding reads to match the performance of the sequential reads, which would need a speedup of 10 to 100 times. He et al. (2013) demonstrated a speedup of 3 times, Jiang et al. (2013) showed 1.2 to 1.6 times, and Byna et al. (2008) showed a maximum of 1.25 times.
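To make the idea of read prediction concrete, the following is a minimal sketch of a constant-stride predictor. It is not the algorithm used by He et al. (2013), Jiang et al. (2013) or Byna et al. (2008); the function names and the 4096-byte stride are hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def predict_next_offset(history: List[int]) -> Optional[int]:
    """Predict the next read offset if the last three reads share a constant stride.

    Returns None when there is too little history or no constant stride.
    """
    if len(history) < 3:
        return None
    a, b, c = history[-3:]
    stride = c - b
    if stride != 0 and b - a == stride:
        return c + stride
    return None


def count_prefetch_hits(offsets: List[int]) -> int:
    """Count reads whose offset was correctly predicted from the reads before them."""
    hits = 0
    history: List[int] = []
    for offset in offsets:
        if predict_next_offset(history) == offset:
            hits += 1
        history.append(offset)
    return hits


# Hypothetical striding pattern: one read every 4096 bytes.
striding = [i * 4096 for i in range(10)]
# Hypothetical irregular pattern with no constant stride.
irregular = [0, 5, 3, 100, 7]
```

In this sketch a striding pattern becomes predictable after three reads, so a prefetcher could in principle fetch each later read ahead of time, whereas the irregular pattern yields no predictions at all.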
Chapter 7. Conclusions and Further Work 7.2. Related work
Similar in some ways to prefetching, preloading data either predicts what data is required and reads it ahead of time, or reads data using an intermediate stage. A burst buffer is generally located between the parallel filesystem and the HPC compute cluster. This intermediate storage can provide faster access and read rates than the main parallel filesystem. Another benefit of a burst buffer system could be that the data, once read onto the burst buffer nodes, could be rearranged or reformatted to enable fast reads from the main filesystem, and faster transfers to the processing nodes. Given the results in this thesis, this would clearly be beneficial. An example of a burst buffer system is BurstMem (Wang et al., 2015), which showed significantly improved performance over a system without a burst buffer (over twice the bandwidth in some cases). This improved performance, along with the potential to reorganise data in the burst buffer, could significantly reduce the runtime of the STSA application in Chapter 6. This would give similar performance to, if not better than, the x-t chunked file (fastest read) while only storing one version of the data, thus keeping storage costs low. However, BurstMem does not currently support NetCDF and HDF5 files, so it has limited usefulness in atmospheric science.
Another way of mitigating poor performance due to inefficient read patterns is to change the layout of the data on disk. Chapter 4 showed that simply using HDF5 chunking benefits a single read pattern, but not multiple read patterns. Approaches which aim to solve this problem include:
• using the SDS framework (Dong et al., 2013),
• using PLFS to interface between the physical filesystem and a logical file view from an application,
• using the ADIOS BP file format,
• and using the Ceph filesystem to improve random access speed.
The SDS framework improves the performance of multiple read patterns by analysing the read patterns of applications, automatically reformatting the data, and using metadata to automatically select the best version of the dataset for a given application. The benefit is clear: reads of the reorganised data were 50 times faster. This is a similar speedup to the chunked file results from Chapter 6 (around 30 times faster comparing unchunked and x-t chunked files), with similar overhead (each requires multiple files). However, the SDS framework does this automatically and transparently to the user.
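The idea of selecting the best stored version of a dataset for a given read can be sketched as follows. This is not the SDS implementation: it assumes that the number of chunks a read overlaps is a reasonable proxy for read cost, and the dataset shape, layout names and chunk shapes are hypothetical.

```python
from math import prod
from typing import Dict, Tuple

Shape = Tuple[int, ...]


def chunks_touched(read_start: Shape, read_shape: Shape, chunk_shape: Shape) -> int:
    """Number of chunks overlapped by a hyperslab read (assumed cost proxy)."""
    counts = []
    for start, length, chunk in zip(read_start, read_shape, chunk_shape):
        first = start // chunk
        last = (start + length - 1) // chunk
        counts.append(last - first + 1)
    return prod(counts)


def best_layout(read_start: Shape, read_shape: Shape, layouts: Dict[str, Shape]) -> str:
    """Pick the stored layout whose chunking overlaps the fewest chunks for this read."""
    return min(
        layouts,
        key=lambda name: chunks_touched(read_start, read_shape, layouts[name]),
    )


# Hypothetical (t, y, x) dataset of shape (1000, 100, 100) stored in two layouts.
layouts = {
    "map_chunks": (1, 100, 100),    # one chunk per time step
    "series_chunks": (1000, 1, 1),  # one chunk per grid point
}

# A time series at one grid point, and a single horizontal map.
series_choice = best_layout((0, 10, 20), (1000, 1, 1), layouts)
map_choice = best_layout((5, 0, 0), (1, 100, 100), layouts)
```

Under these assumptions the time series read overlaps 1000 chunks in the map layout but only one in the series layout, and the reverse holds for the map read, which is why a single chunk shape cannot serve both patterns well.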
PLFS is a virtual filesystem which allows a mapping between the physical data and a virtual file which can have a different data order, giving a similar effect to HDF5 chunking. He et al. (2013) showed significant improvement when using PLFS to provide this mapping. However, it does not seem to provide any benefits that chunking does not.
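A mapping between a logical file view and a different physical data order can be illustrated with a small extent index. This is only a sketch of the general technique and does not reproduce the PLFS index format; the extent tuple layout and the function names are assumptions made for this example.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical extent record: (logical_offset, length, physical_offset).
Extent = Tuple[int, int, int]


def build_index(extents: List[Extent]) -> List[Extent]:
    """Sort extents by logical offset so they can be searched by bisection."""
    return sorted(extents)


def logical_to_physical(index: List[Extent], logical_offset: int) -> int:
    """Translate an offset in the logical file view to an offset in physical storage."""
    starts = [extent[0] for extent in index]
    pos = bisect_right(starts, logical_offset) - 1
    if pos < 0:
        raise KeyError(logical_offset)
    start, length, physical = index[pos]
    if logical_offset >= start + length:
        raise KeyError(logical_offset)
    return physical + (logical_offset - start)


# The logical file is contiguous, but its three blocks are stored in a different order.
index = build_index([(0, 100, 200), (100, 100, 0), (200, 100, 100)])
```

Here the application sees bytes 0 to 299 in order, while logical offset 150 is served from physical offset 50, so the stored order can be chosen to suit the read pattern without changing the logical view.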
The ADIOS BP format (see Section 2.8) organises data on disk to provide fast access for multiple read patterns (Lofstead et al., 2011). This gives similar benefits to having multiple chunked files, without the storage overhead. If this file type and data distribution were applied to the STSA application, the BP file format could provide a speedup of around 20 times, which is very similar to the speedup shown when comparing the unchunked files and x-t chunked files in Table 6.1. However, it is a specific format which would lose the benefits of NetCDF, such as its well-established metadata conventions and tools.
Ceph is an object-store-based filesystem like Panasas and Lustre (see Appendix A). Unlike those two, however, it removes the file allocation tables (metadata describing where files are stored) in favour of using an algorithm to locate data, significantly reducing metadata access time. The data is also randomly distributed across the object stores (Weil et al., 2006), as opposed to using a round-robin distribution, which favours reading chunks in storage order. These two factors could significantly improve the performance of non-sequential read patterns. The distribution of data involves striping (see Section 2.7.2), which Chapter 4 showed can interact poorly with chunking. One way to avoid this would be to not chunk the NetCDF4 data; however, this would eliminate the option of the inbuilt compression in the NetCDF4 library. Despite this drawback, using Ceph could be very beneficial to atmospheric science workflows, giving a similar level of speedup to ADIOS BP files, while still using NetCDF4.
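The difference between round-robin placement and computing a location algorithmically can be sketched as below. The hash used here is a simple stand-in and is not Ceph's actual placement algorithm; the object naming scheme and the number of storage targets are hypothetical.

```python
import hashlib


def round_robin_target(stripe_index: int, num_targets: int) -> int:
    """Round-robin placement: consecutive stripes go to consecutive storage targets."""
    return stripe_index % num_targets


def hashed_target(object_name: str, num_targets: int) -> int:
    """Pseudo-random placement computed from the object name alone.

    Any client can recompute the location without consulting an allocation table.
    This is an illustrative stand-in, not Ceph's placement algorithm.
    """
    digest = hashlib.sha256(object_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_targets


# Hypothetical file split into 8 stripes across 4 storage targets.
num_targets = 4
round_robin = [round_robin_target(i, num_targets) for i in range(8)]
hashed = [hashed_target(f"data.nc.{i:08d}", num_targets) for i in range(8)]
```

In the round-robin case the placement follows storage order, while in the hashed case the placement depends only on the object name, so no lookup of stored location metadata is needed to find a given stripe.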
The compression results from Chapter 4 give similar compression ratios to Kunkel (2017), who showed 40-70% compression ratios for mixed scientific data (including NetCDF4 files) with zlib compression. Kunkel (2017) showed that the decompression speed for these files was around 20 times slower than the observed bandwidth. The results from Chapter 4 show that the overhead for compressed NetCDF4 files, where the read matches the chunk shape, was around two thirds of the uncompressed read rate. However, when the reads consist of multiple whole chunks, the overhead was negligible. This shows the importance of making good chunking decisions when compressing data. The reads from Kunkel (2017) likely included reads where the read shape and chunk shape were mismatched, increasing the mean uncom-
Chapter 7. Conclusions and Further Work 7.3. Further work