• No se han encontrado resultados

A highly portable heterogeneous implementation of

N/A
N/A
Protected

Academic year: 2023

Share "A highly portable heterogeneous implementation of"

Copied!
48
0
0

Texto completo

(1)

A highly portable heterogeneous implementation of

symmetry-preserving methods for magnetohydrodynamics

J.A. Hopman, F.X. Trias and J.Rigola

Heat and Mass Transfer Technological Center (CTTC)

Technical University of Catalonia (UPC), Terrassa, Spain

(2)

Outline

Symmetry preserving methods in MHD HPC 2 framework

Exploiting symmetries in geometry

2

(3)

MHD introduction

(4)

MHD introduction

Liquid metals in magnetic field

3

(5)

MHD introduction

(6)

MHD introduction

3

(7)

MHD introduction

(8)

MHD introduction

3

(9)

MHD introduction

(10)

MHD introduction

3

(11)

Following method of Ni et al. 1, 2

Collocated + staggered current densities, j Conserves current density

Scheme basics

Update U using projection method Solve 2

nd

Poisson equation for φ:

Update j

Interpolate to cell center to calculate F

Lor

Lorentz force implementation

(12)

Following method of Trias et al. 1

Conserve physical properties by mimicking continuous operators

• Use projected distances in gradients

• Use midpoint interpolation

• Use face-volume weighted interpolation

Consequences for MHD

Avoid iterative correction schemes

Conserve total momentum from Lorentz force:

Preserving symmetries

5

1Trias, F. X., Lehmkuhl, O., Oliva, A., Pérez-Segarra, C. D., & Verstappen, R. W. C. P. (2014). Symmetry-preserving discretization of Navier–Stokes equations on collocated unstructured grids. Journal of Computational Physics, 258, 246-267.

(13)

Case 1: 2D Taylor-Green vortex in transverse magnetic field

HydroDynamic TGV

(14)

Case 1: 2D Taylor-Green vortex in transverse magnetic field

6

HydroDynamic TGV Transverse B-field

(15)

Case 1: 2D Taylor-Green vortex in transverse magnetic field

HydroDynamic TGV Transverse B-field Imposed φ-field

(16)

Case 1: 2D Taylor-Green vortex in transverse magnetic field

6

= +

HydroDynamic TGV Transverse B-field Imposed φ-field

Magneto-

(17)

Case 2: MHD duct flow

(18)

2D Taylor-Green vortex

Should not generate Lorentz Force Should not dissipate energy

Results

Symmetry Preserving method outperforms Ni method

Especially on distorted grids

Accuracy results

8

(19)

Stability for highly distorted meshes

Taylor green vortex M-profile in duct

(20)

Highly-portable code for HPC

Stencil based → Algebra based

Only a few algebraic kernels are needed

HPC 2

10

(21)

From continuous NS equations:

To discrete algebraic equations:

Using three kernels only:

𝒚 ← 𝐴𝒙, 𝒛 ← 𝑎𝒙 + 𝑏𝒚, 𝑟 ← 𝒙 ∙ 𝒚

Algebraic kernels

SpMV axpy dot

(22)

Memory boundedness

12

Arithmetic intensity (FLOP/byte)

(23)

Memory boundedness

dot,

axpy

(24)

Memory boundedness

12

Arithmetic intensity (FLOP/byte)

dot, axpy

SpMV

(25)

HPC 2 : hierarchy

multiple nodes interconnected via

high-bandwidth network

(26)

X.Álvarez, A.Gorobets, and F.X.Trias. "A hierarchical parallel implementation for heterogeneous computing. Application to algebra-based CFD simulations on hybrid supercomputers". Computer & Fluids, 214:104768, 2021.

HPC 2 : hierarchy

14

multiple multi-core CPU per node OpenMP is used at this level

multiple nodes interconnected via high-bandwidth network

MPI is used at this level

(27)

multiple accelerators per node

HPC 2 : hierarchy

multiple multi-core CPU per node OpenMP is used at this level

multiple nodes interconnected

via high-bandwidth network

(28)

X.Álvarez, A.Gorobets, and F.X.Trias. "A hierarchical parallel implementation for heterogeneous computing. Application to algebra-based CFD simulations on hybrid supercomputers". Computer & Fluids, 214:104768, 2021.

HPC 2 : hierarchy

16

(29)

HPC 2 : hierarchy

1. The MPI process

(30)

X.Álvarez, A.Gorobets, and F.X.Trias. "A hierarchical parallel implementation for heterogeneous computing. Application to algebra-based CFD simulations on hybrid supercomputers". Computer & Fluids, 214:104768, 2021.

HPC 2 : hierarchy

16

1. The MPI process

2. The host and co-processors

(31)

HPC 2 : hierarchy

1. The MPI process

2. The host and co-processors 3. Multiple NUMA nodes in a

manycore CPU

(32)

X.Álvarez, A.Gorobets, and F.X.Trias. "A hierarchical parallel implementation for heterogeneous computing. Application to algebra-based CFD simulations on hybrid supercomputers". Computer & Fluids, 214:104768, 2021.

HPC 2 : tested architectures

17

MareNostrum 4 Lomonosov-2 TSUBAME3.0

rank #42

3456 nodes with:

2× Intel Xeon 8160 1× Intel Omni-Path

rank #31

540 nodes with:

2× Intel Xeon E5-2680 v4 4× NVIDIA Tesla P100 4× Intel Omni-Path rank #156

1696 nodes with:

2× Intel Xeon E5-2697 v3

1× NVIDIA Tesla K40M

1× InfiniBand FDR

(33)

Exploiting symmetries

1 Symmetry 2 Symmetries

(34)

Exploiting symmetries

18

1 Symmetry 2 Symmetries

Symmetry-aware ordering

(35)

Exploiting symmetries

1 Symmetry 2 Symmetries

(36)

Exploiting symmetries

18

1 Symmetry 2 Symmetries

Symmetry-aware ordering

(37)

Exploiting symmetries

1 Symmetry 2 Symmetries

(38)

Exploiting symmetries

18

1 Symmetry 2 Symmetries

Symmetry-aware ordering

(39)

Block-diagonalising L

𝐿 = ∈ 𝑅

𝑁×𝑁

𝑆 = 1

2 ∈ 𝑅

𝑁×𝑁

෠𝐿 = 𝑆𝐿𝑆

−1

= L

inn

L

out

L

out

L

inn

I

N/2

I

N/2

I

N/2

-I

N/2

L

inn

+ L

out

0

0 L

inn

- L

out

(40)

Block-diagonalising L

𝐿 = ∈ 𝑅

𝑁×𝑁

𝑆 = 1

2 ∈ 𝑅

𝑁×𝑁

෠𝐿 = 𝑆𝐿𝑆

−1

=

19

L

inn

L

out

L

out

L

inn

I

N/2

I

N/2

I

N/2

-I

N/2

L

inn

+ L

out

0

0 L

inn

- L

out

(41)

Block-diagonalising L

𝐿 = ∈ 𝑅

𝑁×𝑁

𝑆 = 1

2 ∈ 𝑅

𝑁×𝑁

෠𝐿 = 𝑆𝐿𝑆

−1

= L

inn

L

out

L

out

L

inn

I

N/2

I

N/2

I

N/2

-I

N/2

L

inn

+ L

out

0

0 L

inn

- L

out

(42)

Block-diagonalising L

𝐿 = ∈ 𝑅

𝑁×𝑁

𝑆 = 1

2 ∈ 𝑅

𝑁×𝑁

෠𝐿 = 𝑆𝐿𝑆

−1

=

19

L

inn

L

out

L

out

L

inn

I

N/2

I

N/2

I

N/2

-I

N/2

L

inn

+ L

out

0

0 L

inn

- L

out

(43)

Block-diagonalising L

𝐿 = ∈ 𝑅

𝑁×𝑁

𝑆 = 1

2 ∈ 𝑅

𝑁×𝑁

෠𝐿 = 𝑆𝐿𝑆

−1

= L

inn

L

out

L

out

L

inn

I

N/2

I

N/2

I

N/2

-I

N/2

L

inn

+ L

out

0

0 L

inn

- L

out

(44)

Rewriting the SpMV product:

into an SpMM product

Sparse matrix-matrix product

20

A reduction in time complexity

A reduction in memory footprint

An increase in arithmetic intensity

(45)

Increasing arithmetic intensity

dot,

(46)

Increasing arithmetic intensity

21

Arithmetic intensity (FLOP/byte)

dot, axpy

SpMV

SpMM

(47)

Summarising

Symmetry preserving methods in MHD HPC 2 framework

Exploiting symmetries in geometry

(48)

Thank you for attending!

23

Referencias

Documento similar

Depending on the shape of the objects this may lead to significant variations in the determined hydrodynamic diameter. In this case Depolarized Dynamic Light Scattering would be

Phys.. Back, “Fourier transform imaging of spin vortex eigenmodes,” Phys. Zaspel, “Magnon modes for thin circular vortex-state magnetic dots,” Appl. Kim, “Dynamic origin

Implementation of the ELISA magnetic immunoassay for atrazine detection by using a microfluidic chip with BDD Pt-NPs electrode as detection platform.. Implementation of the

• Proves an alternative and better boundary condition for force-free field extrapolations into the corona (Metcalf et al. 1995; Socas-Navarro, 2005).. • Gives us an extra layer

In addition, GBM are also highly heterogeneous tumors (Soeda et al. 2015), which highlights the importance of targeting telomeres independently of telomere length,

Spatial mapping of the zero-bias conductance in magnetic field reveals the vortex lattice, which allows us to unequivocally link the observed conductance gap to superconductivity

The simplest realization of a topological superconducting state requires [4–13] (i) a small number of conduction channels, (ii) a band structure modified by spin-orbit coupling,

Calabrese, Evolution of entanglement entropy following a quantum quench: Analytic results for the XY chain in a transverse magnetic field , Physical Review A 78 (2008) 010306.