Throughout this thesis we utilise a collection of benchmark applications and a class of small applications referred to as ‘mini-apps’. Such applications are designed to exhibit the computational behaviour of key algorithms, or processes, but in a simple and portable framework. Projects such as the Mantevo project at SNL provide a suite of mini-apps targeted at different scientific domains; they are designed to quickly evaluate hardware, both novel and traditional, and software methods [49].
Benchmark applications tend to differ from mini-apps in terms of both size and complexity, as they are designed to more accurately represent the computational needs of production-grade applications. Benchmarks can play a key role in the procurement process, by evaluating a platform for both compatibility and performance. Such applications often contain reduced computational complexity and are accompanied by reduced problem sets, allowing for the fast turnaround of computational results during machine evaluation.
Both classes of application are suitable for analysis, specifically with respect to memory consumption, as they are designed to mimic the methods of larger codes and so will exhibit many of the same properties and artefacts.
A.2.1
Chimaera (AWE)
The Chimaera benchmark is a 3D particle transport code developed and maintained by AWE. It employs a wavefront design pattern, which executes a series of sweeps through the 3D data array. The purpose of the benchmark is the replication of operational behaviour of larger internal codes which occupy a considerable proportion of parallel runtime on the supercomputing facilities of AWE.
The code shares many similarities with the ubiquitous Sweep3D application developed by LANL in the United States, but is considerably larger and more complex in its operation.
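The wavefront pattern used by sweep codes of this kind can be illustrated with a minimal sketch. It is not taken from Chimaera or Sweep3D; the function name `wavefront_planes` and the choice of a sweep starting at corner (0, 0, 0) are assumptions made for illustration. A cell (i, j, k) depends on its upstream neighbours, so all cells with the same value of i + j + k can be processed together once the previous plane is complete.

```python
from collections import defaultdict


def wavefront_planes(nx, ny, nz):
    """Group the cells of an nx * ny * nz grid into wavefronts.

    Assumes a single sweep starting at corner (0, 0, 0): every cell on
    plane d = i + j + k depends only on cells from plane d - 1, so the
    cells within one plane are independent of each other.
    """
    if min(nx, ny, nz) <= 0:
        return []
    planes = defaultdict(list)
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                planes[i + j + k].append((i, j, k))
    # The largest index sum is (nx - 1) + (ny - 1) + (nz - 1).
    return [planes[d] for d in range(nx + ny + nz - 2)]


if __name__ == "__main__":
    for d, cells in enumerate(wavefront_planes(2, 2, 2)):
        print(d, cells)
```

The number of cells per plane grows and then shrinks as the sweep crosses the grid, which is why the available parallelism in a wavefront code varies over the course of a single sweep.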
A.2.2
Orthrus (AWE)
Orthrus was initially developed by Dawes at AWE plc, to assess the parallel scalability of generic 3D implicitly solved linear diffusion problems. The application serves as a driver for the third-party linear solver libraries PETSc [8], from Argonne National Laboratory, and hypre [35], from LLNL. The application constructs a 3D sparse matrix and then drives the preconditioner-solvers provided by the two aforementioned libraries.
Orthrus forms part of the machine evaluation benchmark suite used to drive procurement decisions for AWE. Timing instrumentation is provided via the Ichnaea (PMTM) library [4].
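To give a sense of the kind of 3D sparse matrix such a driver constructs, the following is a minimal sketch that assembles a 7-point stencil for a linear diffusion problem in coordinate (row, column, value) form. It does not reproduce Orthrus and does not call PETSc or hypre; the function name `build_diffusion_matrix`, the unit coefficients and the zero-valued boundary treatment are all assumptions made for illustration.

```python
def build_diffusion_matrix(nx, ny, nz):
    """Assemble a 7-point stencil on an nx * ny * nz grid.

    Returns (rows, cols, vals) triplets in coordinate format.
    Illustrative assumptions: unit grid spacing, unit diffusion
    coefficient, and neighbours outside the grid are simply dropped
    (equivalent to a zero-valued boundary).
    """
    def index(i, j, k):
        # Map a 3D cell coordinate to a global row number.
        return (i * ny + j) * nz + k

    offsets = ((-1, 0, 0), (1, 0, 0), (0, -1, 0),
               (0, 1, 0), (0, 0, -1), (0, 0, 1))
    rows, cols, vals = [], [], []
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                row = index(i, j, k)
                # Diagonal entry.
                rows.append(row)
                cols.append(row)
                vals.append(6.0)
                # Off-diagonal entries for neighbours inside the grid.
                for di, dj, dk in offsets:
                    ni, nj, nk = i + di, j + dj, k + dk
                    if 0 <= ni < nx and 0 <= nj < ny and 0 <= nk < nz:
                        rows.append(row)
                        cols.append(index(ni, nj, nk))
                        vals.append(-1.0)
    return rows, cols, vals


if __name__ == "__main__":
    rows, cols, vals = build_diffusion_matrix(4, 4, 4)
    print("unknowns:", 4 * 4 * 4, "non-zeros:", len(vals))
```

Each row holds at most seven non-zeros regardless of grid size, so the storage for the matrix grows linearly with the number of cells.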
A.2.3
POP (LANL)
The Parallel Ocean Program (POP) is a 3D ocean circulation model, solving equations for fluid motion on a sphere, using finite difference discretisation. POP forms part of the Community Climate System Model, and as such it can be coupled with other climate simulators for more comprehensive modelling.
A.2.4
SNAP (LANL)
SNAP is a 3D SN proxy application for the LANL neutron transport code PARTISN, designed to mimic memory requirements and communication patterns rather than physics. As such, SNAP allows the configuration of a number of runtime parameters such as data cells, energy groups and angles, each of which can have a dramatic effect on both runtime and memory consumption.
SNAP also supports a level of hybrid parallelism with MPI and OpenMP, making it a suitable code with which to investigate memory effects.
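The sensitivity of memory consumption to these parameters can be illustrated with a rough estimate. The sketch below is not SNAP's actual allocation scheme; it assumes a single angular-flux-like array holding one double-precision value per cell, energy group and angle, and the function name `angular_flux_bytes` is hypothetical.

```python
def angular_flux_bytes(nx, ny, nz, groups, angles, bytes_per_value=8):
    """Estimate the size of one cells * groups * angles array in bytes.

    Assumption for illustration only: one value is stored per cell,
    per energy group and per angle, in double precision by default.
    """
    cells = nx * ny * nz
    return cells * groups * angles * bytes_per_value


if __name__ == "__main__":
    base = angular_flux_bytes(64, 64, 64, groups=16, angles=48)
    more_angles = angular_flux_bytes(64, 64, 64, groups=16, angles=96)
    print(f"base: {base / 2**30:.2f} GiB")
    print(f"doubled angles: {more_angles / 2**30:.2f} GiB")
```

Because the parameters multiply, doubling any one of them doubles the size of such an array, which is consistent with the dramatic effect on memory consumption noted above.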
A.2.5
Sweep3D (LANL)
Sweep3D was the precursor to the SNAP application, and shares many of the same characteristics both in terms of application and implementation.
A.2.6
NPB (NASA)
NASA maintains a benchmark suite of applications, referred to as the NAS Parallel Benchmark (NPB) suite [5]. There are many variants of the suite, implemented in different programming languages or parallelisation paradigms. The Fortran variant of NPB makes heavy use of statically allocated arrays, as the runtime core count and problem size are specified at compile time, producing an efficient but non-portable binary.
MG
MG is a multigrid solver for the Poisson equation. It is quite a memory-intensive application despite having low heap usage, due to the use of static allocations discussed above.
A.2.7
LAMMPS (SNL)
LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) is a framework for classical molecular dynamics. It is designed for modelling systems of millions, and even billions, of particles on large-scale HPC platforms. Whilst LAMMPS contains support for a wide range of particles and interaction models, it also supports modification to allow the user to develop more custom behaviour.
Written in C++, LAMMPS has support for Fortran interfaces and is inherently designed as a parallel application using MPI, and even includes some GPU support.
A.2.8
MiniFE (SNL)
MiniFE is a proxy application, representing key functionality from the Sandia SIERRA suite of finite-element applications in a small and portable application. It is used to test programming languages and parallelisation models. It is an unstructured implicit finite-element solver using a sparse linear system, constructed from steady-state conduction equations.
We present an evaluation of MiniFE’s memory consumption characteristics in [106], where we investigate the effects of problem size and core count on temporal memory consumption.
A.2.9
phdMesh (SNL)
phdMesh is a finite element data structure library for parallel heterogeneous direct unstructured meshes, developed at SNL. It was initially included as part of the Mantevo benchmark suite and is part of the Trilinos project.
Our usage of the code is based upon the ‘gears’ problem, which undertakes contact search on the unstructured grid. The use of an unstructured mesh makes it an interesting application to analyse in terms of memory consumption, as it is likely to exhibit a different profile to traditional structured mesh applications.
The version of phdMesh utilised in our research is written in C++, with a particularly high rate of object creation and destruction. This makes it a very interesting code with which to evaluate the performance of tracing tools. In [106] we demonstrated that WMTools exhibited an application slowdown of up to 11.5× when profiling the code, but also that this behaviour is in line with other tools.
A.2.10
Lare2D (Warwick)
Lare2D is a 2D variant of the Lare3D [3] application. Both are Lagrangian-remap codes for solving the non-linear MHD equations. The original code development was motivated by the study of the solar corona and its accurate simulation.
A.3
Summary
In this chapter we have outlined the machines and applications used throughout this thesis. We use these applications to demonstrate the capabilities of our developed tools and methodologies.
The use of a selection of supercomputers enables us to test our implementation and analyse application behaviour at large scale. As many memory artefacts are only exhibited at large scale, it is important to capture them in real-world scenarios.