While NFS provides a unified storage solution for many connected nodes, the performance of the file system does not scale with the size of the network. For this reason, distributed file systems (DFS) exist to provide a unified storage space across a network of servers. As a DFS is spread across multiple servers (allowing parallelised access without interprocess interference) it generally provides a greater quality of service (QoS) than NFS. The IBIS file system [131], developed in 1985, was one of the first DFSs where the file system was spread across all nodes of the network, allowing all nodes to transparently access any file regardless of whether the file was stored locally or remotely. Modern DFSs now provide dedicated storage servers, each themselves containing high performance storage backends (using techniques such as RAID).

In most modern DFSs, there are four components. This thesis adopts the naming convention of the Lustre file system; different file systems may use different names.

Figure 3.7: An example Lustre configuration with four OSSs and a fail-over MGS and MDS.

[Diagram labels: four clients, interconnect, OSS 1 to OSS 4, three OSTs, MGS 1 and MGS 2 with an MGT, MDS 1 and MDS 2 with an MDT.]

Object Storage Targets (OST)

HDDs are usually grouped using RAID (to improve performance and provide some redundancy); these groups are then referred to as Object Storage Targets. The OSTs are used to store the stripe data blocks that make up each file.

Object Storage Servers (OSS)

One or more OSTs are connected to one or more Object Storage Servers. The OSSs are directly responsible for reading and writing file data from and to the OSTs.

Metadata Server (MDS)

Metadata (such as the directory tree, file permissions and file block locations) is either stored on a dedicated Metadata Server or is stored on the OSSs (as in GPFS). The MDS is used by the clients to get file information and file structure, such that they can access the file stripes stored on the OSTs.

Management Server (MGS)

Finally, there are usually one or two Management Servers holding the server configurations.
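The way these components cooperate on a read can be sketched with a toy in-memory model. This is only an illustration of the roles described above, not Lustre's actual protocol: the class and method names are hypothetical, the MGS is omitted, and each stripe is stored as one whole block.

```python
from dataclasses import dataclass, field


@dataclass
class OST:
    """Object Storage Target: holds stripe blocks keyed by (file name, stripe number)."""
    blocks: dict = field(default_factory=dict)


@dataclass
class OSS:
    """Object Storage Server: performs reads and writes on the OSTs attached to it."""
    osts: dict  # OST id -> OST

    def write(self, ost_id, key, data):
        self.osts[ost_id].blocks[key] = data

    def read(self, ost_id, key):
        return self.osts[ost_id].blocks[key]


@dataclass
class MDS:
    """Metadata Server: records, per file, which OST holds each stripe."""
    layouts: dict = field(default_factory=dict)  # file name -> list of OST ids


class Client:
    def __init__(self, mds, oss_for_ost):
        self.mds = mds
        self.oss_for_ost = oss_for_ost  # OST id -> OSS that serves it

    def write_file(self, name, stripes, ost_ids):
        # Record the layout on the MDS, then send each stripe to its OSS.
        self.mds.layouts[name] = list(ost_ids)
        for i, (data, ost_id) in enumerate(zip(stripes, ost_ids)):
            self.oss_for_ost[ost_id].write(ost_id, (name, i), data)

    def read_file(self, name):
        layout = self.mds.layouts[name]  # step 1: metadata lookup on the MDS
        return b"".join(                 # step 2: stripe data fetched via the OSSs
            self.oss_for_ost[ost_id].read(ost_id, (name, i))
            for i, ost_id in enumerate(layout)
        )


def demo():
    osts = {0: OST(), 1: OST(), 2: OST()}
    oss1 = OSS({0: osts[0], 1: osts[1]})  # one OSS serving two OSTs
    oss2 = OSS({2: osts[2]})
    client = Client(MDS(), {0: oss1, 1: oss1, 2: oss2})
    client.write_file("a.dat", [b"AAAA", b"BBBB", b"CCCC"], [2, 0, 1])
    return client.read_file("a.dat"), osts
```

In this model the MDS never touches file data; it only tells the client where the stripes live, which is why metadata and data traffic can be served by different machines.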

The parallel file systems used throughout this thesis differ in the number of OSTs and OSSs in use, and in how metadata is handled: the Lustre file system uses a dedicated MDS, whereas the GPFS installations distribute their metadata. Other parallel file systems, such as Ceph [137], make use of multiple MDSs in order to improve the performance of metadata updates, which are often identified as a bottleneck in large, heavily loaded Lustre installations [1].

3.3.1 The Lustre File System

The Lustre file system is used by many of the world’s fastest and largest computers, with up to 66 of the top 100 HPC systems using Lustre in 2010. The basic architecture of a Lustre file system is shown in Figure 3.7. Although Lustre (up to version 2.4) uses only a single MDS, a fail-over MDS and MGS can be present. Additionally, as shown in Figure 3.7, multiple OSSs can be connected to common OSTs and this will again provide some fail-over capability.

Much like previous DFSs, Lustre makes use of file striping to allow load to be distributed across a number of service nodes. The size and width of each stripe (where the width is the number of servers over which to stripe) can be configured on a per-file or per-directory basis. The Lustre installation used in this thesis stripes across 2 OSTs with each stripe being 1 MB in size in its default configuration. The lfs command can be used to view and modify these settings.

When writing to a Lustre system, the server used for the first stripe is randomised in order to provide some load balancing between different clients. From this point onwards, the data is striped across a number of servers based on the configured stripe width.
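The striping arithmetic can be illustrated with a short sketch. It assumes simple round-robin placement over consecutive OSTs starting from the randomly chosen first one; the function names are hypothetical, and the way a real Lustre installation selects the remaining OSTs may differ.

```python
import random


def choose_first_ost(num_osts, rng=random):
    """Pick the OST that receives the first stripe of a new file at random."""
    return rng.randrange(num_osts)


def stripe_location(offset, stripe_size, stripe_width, first_ost, num_osts):
    """Map a byte offset in a file to (OST index, offset within that OST's object).

    Assumes the file uses `stripe_width` consecutive OSTs starting at `first_ost`.
    """
    stripe_no = offset // stripe_size
    ost = (first_ost + stripe_no % stripe_width) % num_osts
    # Each OST stores every stripe_width-th stripe back to back.
    object_offset = (stripe_no // stripe_width) * stripe_size + offset % stripe_size
    return ost, object_offset


MB = 1 << 20
# Default configuration described above: 2 OSTs per file, 1 MB stripes.
first = choose_first_ost(num_osts=4)
locations = [stripe_location(i * MB, MB, 2, first, 4) for i in range(4)]
```

With a stripe width of 2, the first and third megabytes of a file land on one OST and the second and fourth on its neighbour, so a single large write is served by two servers at once.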

Figure 3.8: An example of a GPFS setup with four OSSs connected via a high performance switch to three targets and separate management and metadata targets.

[Diagram labels: four clients, interconnect, OSS 1 to OSS 4, Fibre Channel switch, three OSTs, an MGT and an MDT.]

To maintain consistency and allow correct concurrent access to the DFS, Lustre makes use of a distributed lock manager. Each OSS maintains its own file locks and so if two processes attempt to access the same chunk of a file, the OSS will only grant a lock to one of the clients (unless both accesses are read requests).
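The lock-granting rule described above (overlapping accesses are compatible only if both are reads) can be sketched as follows. This is a deliberately simplified, single-server model with hypothetical names; Lustre's distributed lock manager is considerably more elaborate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtentLock:
    client: str
    start: int   # first byte covered (inclusive)
    end: int     # end of the range (exclusive)
    write: bool


class ExtentLockTable:
    """Toy per-server lock table for byte ranges of a single file."""

    def __init__(self):
        self.granted = []

    def request(self, client, start, end, write):
        """Grant the lock unless it conflicts with one held by another client."""
        for held in self.granted:
            overlaps = start < held.end and held.start < end
            # Two overlapping locks conflict if at least one of them is a write.
            if overlaps and held.client != client and (write or held.write):
                return False
        self.granted.append(ExtentLock(client, start, end, write))
        return True

    def release(self, client):
        """Drop every lock held by the given client."""
        self.granted = [lock for lock in self.granted if lock.client != client]
```

For example, two clients can both obtain read locks on overlapping ranges, while a third client asking for a write lock on the same range is refused until the readers release theirs.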

3.3.2 IBM’s General Parallel File System

The General Parallel File System (GPFS) from IBM operates similarly to Lustre; large files are distributed across multiple storage targets using stripes. However, GPFS differs from Lustre in that all OSSs are connected to all OSTs and MDTs, usually through a fibre channel switch. This provides additional resilience in that many more OSSs can fail before the file system must go offline.

Figure 3.8 demonstrates an example GPFS configuration. Although it is possible to store metadata on the same disks as file data, many installations (including the configuration in use at the University of Warwick at the time of writing) make use of dedicated, higher performance metadata targets.

Figure 3.9: An application’s view of a file and the underlying PLFS container structure.

[Diagram labels: Application File View, a single file made up of blocks 0 to 5; PLFS File View, a container holding hostdir.1 (blocks 0 and 1, index.1), hostdir.2 (blocks 2 and 3, index.2) and hostdir.3 (blocks 4 and 5, index.3).]

On GPFS, metadata is maintained by all servers, potentially providing better performance for metadata intensive workloads. As shown by Hedges et al., the file creation rate on GPFS is much higher than on a Lustre system, provided that the files are being created in distinct directories; the use of fine grained directory locking in GPFS makes file creation slower in the same directory [63]. GPFS makes use of a much smaller stripe size than Lustre (typically 16 KB or 64 KB) and sets the stripe width adaptively. For large parallel writes, data can be striped across all available GPFS servers, potentially providing a much greater maximum bandwidth [63].

3.3.3 The Parallel Log-structured File System

On top of parallel file systems like Lustre or GPFS, virtual file systems may provide an additional performance boost by transforming parallel file operations to be more appropriate for the underlying file system. One such example of this is the parallel log-structured file system (PLFS) [11] developed at the Los Alamos National Laboratory (LANL).

| | Minerva | Sierra | Cab |
|---|---|---|---|
| Processor | Intel Xeon 5650 | Intel Xeon 5660 | Intel Xeon E5-2670 |
| CPU Speed | 2.66 GHz | 2.8 GHz | 2.6 GHz |
| Cores per Node | 12 | 12 | 16 |
| Memory per Node | 24 GB | 24 GB | 32 GB |
| Nodes | 492 | 1,856 | 1,200 |
| Interconnect | QLogic TrueScale 4×QDR InfiniBand | QLogic TrueScale 4×QDR InfiniBand | QLogic TrueScale 4×QDR InfiniBand |
| File System | See Table 3.2 | See Table 3.3 | See Table 3.3 |

Table 3.1: Hardware specification of the Minerva, Sierra and Cab supercomputers.

Minerva File System

| Property | Value |
|---|---|
| File System | GPFS |
| I/O servers | 2 |
| Theoretical Bandwidth (a) | ≈ 4 GB/s |

| | Storage | Metadata |
|---|---|---|
| Number of Disks | 96 | 24 |
| Disk Size | 2 TB | 300 GB |
| Spindle Speed | 7,200 RPM | 15,000 RPM |
| Bus Connection | Nearline SAS | SAS |
| RAID Configuration | Level 6 (8 + 2) | Level 1+0 |

Table 3.2: Configuration for the GPFS installation connected to Minerva.
(a) Theoretical Bandwidth refers to the maximum rate at which data can be transferred to the file servers and is therefore bounded only by the network interconnect.

PLFS is a virtual file system that makes use of file partitioning and a log-structure (as described in Section 2.2.2) to improve the performance of parallel file operations. Each file within the PLFS mount point appears to an application as though it is a single file; PLFS, however, creates a container structure, with a data file and an index for each process or compute node. This provides each process with its own unique file stream, potentially increasing the available bandwidth. Figure 3.9 demonstrates how a six-rank execution (two processes per node) would view a single file and how it would be stored within the PLFS backend directory.
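The container idea can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not PLFS's on-disk format or API: the class and method names are hypothetical, and a global sequence counter stands in for whatever ordering information a real index would carry.

```python
import itertools


class Container:
    """In-memory stand-in for a PLFS-style container (hypothetical layout)."""

    def __init__(self):
        self.data = {}    # writer id -> bytearray acting as an append-only data log
        self.index = {}   # writer id -> list of (seq, logical_off, length, log_off)
        self._seq = itertools.count()

    def write(self, writer, logical_off, payload):
        log = self.data.setdefault(writer, bytearray())
        entries = self.index.setdefault(writer, [])
        # The index remembers where the bytes belong in the logical file...
        entries.append((next(self._seq), logical_off, len(payload), len(log)))
        # ...while the data itself is always appended to the writer's own log.
        log.extend(payload)

    def read_all(self):
        """Rebuild the application's view of the file from every writer's index."""
        entries = sorted(
            (seq, off, length, log_off, writer)
            for writer, items in self.index.items()
            for seq, off, length, log_off in items
        )
        size = max((off + length for _, off, length, _, _ in entries), default=0)
        out = bytearray(size)  # unwritten regions read back as zero bytes
        for _, off, length, log_off, writer in entries:  # later writes win
            out[off:off + length] = self.data[writer][log_off:log_off + length]
        return bytes(out)
```

Because each writer only ever appends to its own log, interleaved writes from different processes never contend for the same underlying file; the cost is that reads must consult every index to reassemble the logical file.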

In order to use PLFS on a supercomputer, either: the FUSE file system driver must be installed; a custom MPI library must be built; or applications must be rewritten to use the PLFS API directly. In Chapter 5 an alternative solution is provided, in addition to an in-depth investigation into why PLFS achieves the performance gains reported by its developers [11].
