(1)

www.bsc.es

Simulation environment for Life Sciences

BSC – 14th, 15th March 2022

(2)

Objective

Overview of molecular simulation technologies used in Life Sciences and their specific adaptation to the HPC environment.

(3)

Acknowledgements

(4)

OUTLINE

• Macromolecular dynamics. A key issue to understand molecular recognition

– Why dynamic properties?

– The concept of “ensemble”

• Molecular simulations

– Levels of representation. Atomistic vs. Coarse-grained

– Molecular dynamics algorithm(s)

– System preparation (+ Hands on)

– Trajectory Analysis (+ Hands on)

• Simulation & HPC

– Algorithmic improvements

– Ensembles and replicated simulations

– Simulation Databases

– Data management

(5)

Objectives and outline

14 March

09.00 - 10.30 Welcome & Introduction (JLG)
10.30 - 11.00 Break
11.00 - 11.45 Atomistic MD (JLG)
11.45 - 12.30 Improvements & HPC (JLG)
14.00 - 15.15 Simulation Setup (JLG)
15.15 - 15.30 Software installation (AH)
15.30 - 16.00 Break
16.00 - 18.00 Setup Hands On (AH)

15 March

09.00 - 10.30 Simulation DBs and Data Mgt. (PA)
10.30 - 11.00 Break
11.00 - 11.45 Application Examples (MW)
11.45 - 12.30 CG simulations (PD)
12.30 - 14.00 Break
14.00 - 16.00 Traj. visualization and analysis (AH)
16.00 - 16.30 Break
16.30 - 18.00 Report writing

(6)

Course Materials

• https://inbi-login.bsc.es/www/patc/

• Software to be installed locally:

– Linux O.S., VMD, python, conda

– VM available

• Accounts on Minotauro (mt1.bsc.es) to execute simulations

– 64 Bull Blade B505 (62 + 2 login)

• 2 Intel E5649 (6-core/2.53GHz/12MB cache), 24GB RAM, 2 NVIDIA M2090, 1x SSD 250GB

– nct0XXX (XXX = 178 .. 195), pass: 4FSCn5.XXX

– Access via ssh (sftp, scp)

(7)

MACROMOLECULAR DYNAMICS. A KEY ISSUE FOR MOLECULAR RECOGNITION

(8)

DNA sequence Protein sequence

(9)

ATP (Mg) - ACV

(10)

ATP (Mg) - ACV

Structural rearrangement is necessary for enzyme (protein) function

(11)

Macromolecules are dynamic entities

• Molecular recognition requires structural adjustment

(12)

Binding modes for Rofecoxib and Celecoxib to cyclooxygenase-2

Soliva R. et al. J Med Chem. 2003, 46 (8), pp 1372–1382

(13)

Dynamic properties. Why?

• Docking experiments are very sensitive to receptor structure

Acetylcholinesterase

RMSd docking solutions

(14)

Figure: plot with axes Comp 1 and Comp 2; labels VDW, + and -.

(15)

Figure: panels A to H, each with axes from -1 to 1 (Component 2 is the only named axis); structures labelled 1dbs, 1byi, 1phb, 1bty, 9xia, 1xih.

(16)

Carlson H.A. & McCammon J.A. (1999) Mol. Pharmacol. 57, 213-218

(17)

The concept of ensemble

• “Ensemble”: set of structures that represents ALL possible microscopic states of the system (or a significant sample of them)

• Thermodynamic properties can be deduced from the average of “ensemble” properties.
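As a minimal sketch of the second point, assuming the ensemble is stored as a plain list of per-structure values (the function names and the default temperature are illustrative, not from the course material): if the structures were sampled at equilibrium, the ensemble property is a plain mean; if every microscopic state is listed once with its energy, each state is weighted by its Boltzmann factor instead.

```python
import math

K_B = 0.0019872041  # Boltzmann (gas) constant in kcal/(mol*K)


def ensemble_average(values):
    """Plain mean over structures already sampled at equilibrium."""
    return sum(values) / len(values)


def boltzmann_average(values, energies, temperature=300.0):
    """Average over enumerated states, each weighted by exp(-E/kT).

    Energies are in kcal/mol. The lowest energy is subtracted first
    so that the exponentials stay in a safe numerical range.
    """
    beta = 1.0 / (K_B * temperature)
    e_min = min(energies)
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    partition = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / partition
```

With equal energies the two functions agree; as one state becomes lower in energy, the weighted average moves towards that state's value.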

(18)

Experimental ensembles?

(19)

Available structural information is “static”

– X-Ray: Macromolecules must have unique conformations to crystallize. Mobile regions do not have enough electron density to be detected.

• “Experimental” flexibility is given by B-Factors

– NMR: Is “in solution”; however, conformation cannot vary too much, otherwise not enough restraints can be derived from experiment. Mobile regions are less defined.

ATOM   2537  CA  GLY B  27      54.322  -6.951   4.465  1.00 48.11           C
ATOM   2538  C   GLY B  27      53.220  -7.408   5.430  1.00 23.01           C
ATOM   2539  O   GLY B  27      52.901  -6.745   6.433  1.00 17.59           O
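The B-factor is the second-to-last numeric field of each ATOM record (48.11, 23.01 and 17.59 in the records above). A minimal sketch of reading it, assuming the standard fixed-column PDB layout; the helper names are illustrative, and the conversion uses the usual isotropic relation B = 8π²⟨u²⟩.

```python
import math

# The three ATOM records shown above, in fixed-column PDB layout.
RECORDS = [
    "ATOM   2537  CA  GLY B  27      54.322  -6.951   4.465  1.00 48.11           C",
    "ATOM   2538  C   GLY B  27      53.220  -7.408   5.430  1.00 23.01           C",
    "ATOM   2539  O   GLY B  27      52.901  -6.745   6.433  1.00 17.59           O",
]


def parse_atom_record(line):
    """Parse one fixed-column PDB ATOM/HETATM record into a dict."""
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
    }


def rms_displacement(b_factor):
    """Root-mean-square displacement (in Å) implied by B = 8*pi^2*<u^2>."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


atoms = [parse_atom_record(record) for record in RECORDS]
for atom in atoms:
    print(atom["name"], atom["b_factor"], round(rms_displacement(atom["b_factor"]), 2))
```

In this excerpt the CA atom has the largest B-factor, so it is the most mobile of the three atoms listed.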

(20)

Experimental ensembles

X-Ray: 1CM8 and other Prot. Kinases

NMR: 1A03. Ca2+ Binding protein

(21)

Theoretical ensembles

• Atomistic Molecular Dynamics is the most used theoretical technique to account for dynamic properties of macromolecules

– Also Monte-Carlo, Normal mode analysis, Discrete dynamics, …

• Analysis of a single system along time is equivalent to the analysis of many copies of the same system (ergodic principle)

• Simple theoretical background.

• Fast to calculate even for big systems
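The ergodic statement above can be written as an equation. This is the standard textbook form, not taken from the slides; A is any observable, T the simulated time, and t_k the M saved trajectory frames (all symbols are introduced here for illustration).

```latex
\langle A \rangle_{\text{ensemble}}
  = \lim_{T \to \infty} \frac{1}{T} \int_{0}^{T} A\bigl(r(t), v(t)\bigr)\, dt
  \approx \frac{1}{M} \sum_{k=1}^{M} A(t_k)
```

In practice the limit is never reached, so the approximation only holds if the trajectory is long enough to visit the relevant states.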

(22)

Theoretical ensembles (MD and others)

Integration step: 1 fs (10^-15 s) → 1 ms = 10^12 (one European billion) integration steps. Equivalent to following the evolution from Neanderthal to H. sapiens with one photo every second.

$$f_i = -\frac{\partial E_i}{\partial r_i}$$

$$a_i = \frac{f_i}{m_i}$$

$$v_i = \int a_i \, dt$$

$$r_i = \int v_i \, dt$$
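The four relations (force from the energy gradient, acceleration from force and mass, velocity and position by integration) are solved numerically in small time steps. A minimal sketch follows, assuming a velocity Verlet integrator and a one-dimensional harmonic potential as a stand-in for a real force field. Both choices are illustrative: velocity Verlet is one common scheme, not necessarily the one used in the course software. `steps_needed` reproduces the step-count arithmetic for a 1 fs step.

```python
def harmonic_force(r, k=1.0):
    """Force f = -dE/dr for the toy potential E = 0.5 * k * r**2."""
    return -k * r


def velocity_verlet(r, v, mass, force, dt, n_steps):
    """Integrate one particle in 1D; returns (positions, final velocity)."""
    a = force(r) / mass                      # a = f / m
    trajectory = [r]
    for _ in range(n_steps):
        r = r + v * dt + 0.5 * a * dt * dt   # r = integral of v dt
        a_new = force(r) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # v = integral of a dt
        a = a_new
        trajectory.append(r)
    return trajectory, v


def steps_needed(total_time_s, step_s=1e-15):
    """Number of integration steps needed to cover total_time_s."""
    return round(total_time_s / step_s)


positions, final_v = velocity_verlet(1.0, 0.0, 1.0, harmonic_force, 0.01, 628)
print(positions[-1], final_v)
print(steps_needed(1e-3))  # steps for 1 ms at 1 fs per step
```

The toy oscillator returns close to its starting point after one period, which is a quick check that the integrator conserves energy reasonably well at this step size.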

(23)

Current limits

• Standard simulations are in the ns-μs timescale with 100,000 to 500,000 atoms

• World Records:

– HIV Capsid 64 M atom 1.2 μs

– BPTI 1 ms free simulation

(24)

Applications

• Structure optimization

– Refinement of XRay and NMR data.

• Conformational transitions and folding

• Flexibility studies

• Dynamics and function

– Allostery, induced fit

• Generation of theoretical ensembles

– Statistical thermodynamics

(25)

Predicted vs X-Ray Structures

(26)

Conformational changes

Figure: RMSd (Å) vs. time (ps, 0 to 1000) for the protein and for loop 214-232.
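RMSd, the quantity on the vertical axis of these plots, measures how far a structure has moved from a reference. A minimal sketch, assuming coordinates are given as lists of (x, y, z) tuples in Å and that the two structures are already superimposed; no fitting step is included and the function name is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate sets."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures must have the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]
print(rmsd(reference, reference), rmsd(reference, shifted))
```

Applying the function to every trajectory frame against the starting structure gives a curve of RMSd against time; computing it on a subset of atoms (for example one loop) separates local from global motion.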

(27)

TK - ATP (Mg) - ACV

Figure: RMSd (Å) vs. time (ps, 0 to 2000) for ACV and Mg; two labels read Wat.

(28)

Thymidine kinase catalysis

Figure: two plots of distance (Å) vs. time (ps, 0 to 2000), labelled HBPG and Aciclovir.

Some phenomena can only be understood from dynamics!

(29)

Statistics on MD ensembles

Figure: distance (Å) vs. time (ps, 0 to 2000), with regions labelled State A and State C.

$$\Delta G_{A \to B} = -RT \ln \frac{N_B}{N_A}$$

Statistics replaces rational thinking!!
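The population relation on this slide, ΔG(A→B) = −RT ln(N_B/N_A), can be applied directly to a distance time series. A minimal sketch, assuming the two states are separated by a single distance cutoff; the cutoff, the default temperature and the example series are invented for illustration.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def count_states(distances, cutoff):
    """Count frames below the cutoff (state A) and at or above it (state B)."""
    n_a = sum(1 for d in distances if d < cutoff)
    n_b = len(distances) - n_a
    return n_a, n_b


def delta_g(n_a, n_b, temperature=300.0):
    """Free energy difference A -> B in kcal/mol from state populations."""
    return -R_KCAL * temperature * math.log(n_b / n_a)


# Invented series: 900 frames near 2 Å (state A), 100 frames near 6 Å (state B).
series = [2.0] * 900 + [6.0] * 100
n_a, n_b = count_states(series, cutoff=4.0)
print(n_a, n_b, round(delta_g(n_a, n_b), 2))
```

Both states must be visited in the trajectory: if either count is zero the logarithm is undefined, which is the numerical face of insufficient sampling.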

(30)

How does the enzyme control ligand access to the heme site?

Long branch (20Å)

B: Ile19, Ala24, Ile25, Val28, Val29, Phe32

E: Phe62, Ala63, Leu66

E-helix B-helix

H-helix G-helix

Short branch (10Å)

0.1 μs MD

AMBER 99 force field

Octahedral box (8500 TIP3P waters)

(31)

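Figure: CMIP energy isocontour, closed state, with the axis of the tunnel marked.

(32)

Figure: CMIP energy isocontour, open state, with the axis of the tunnel marked; Phe62 is labelled.

(33)

Figure: closed and open states, with Phe62 labelled.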

(34)

Free energy profile for NO diffusion along the main channel (steered MD simulations)

Figure: free energy (kcal/mol) vs. distance Fe-N(NO) (Å) for the closed state and the open state; the heme is labelled.

(35)

Molecular dynamics and HPC

• Present biology requires “all” to go High-Throughput

– Genome/Proteome-wide studies

– Plethora of genomic information available

– Towards a “Dynamic” PDB

• Biological scales are already there…

(36)

Molecular dynamics and HPC. The bad news…

• MD algorithms are extremely simple. There are several ways to parallelise them, but trivial ones do not scale beyond 8-16 threads

• Present supercomputers have several thousands of cores available

– The first computer without GPUs on the Top500 list holds 7,630,848 cores (Fugaku, Japan); the second, 10,649,600 cores (Sunway TaihuLight, China).

– With GPUs, the top system holds 2,414,592 cores (Summit, USA) (www.top500.org).

– Applications for CPU time must justify fair use of a large number of cores.

– No present strategy for MD optimization can scale in practical use to thousands of cores!!

• Rough rule: no fewer than 100 atoms per core.
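The rough rule turns into a quick upper bound on how many cores a single simulation can use. A minimal sketch with an illustrative function name; the system sizes are the "standard" ones quoted earlier in these slides (100,000 to 500,000 atoms) and the machine size is the Fugaku figure above.

```python
def max_useful_cores(n_atoms, min_atoms_per_core=100):
    """Upper bound on cores for one simulation under the atoms-per-core rule."""
    return n_atoms // min_atoms_per_core


FUGAKU_CORES = 7_630_848  # core count quoted above

for n_atoms in (100_000, 500_000):
    cores = max_useful_cores(n_atoms)
    share = 100.0 * cores / FUGAKU_CORES
    print(f"{n_atoms} atoms: at most {cores} cores ({share:.3f}% of the machine)")
```

Even the larger standard system can keep only a few thousand cores busy, a tiny fraction of a machine of this size.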

(37)

Strategies

Increase granularity

• Coarse-grained simulations

Improve parallelization strategies

• Domain decomposition

• Accelerators (GPUs)

Replicated simulations

• Reduce process intercommunication.

(38)

Massively parallel execution with biobb and PyCOMPSs

Pyruvate Kinase

~400,000 atoms MD

200 annotated mutations from Uniprot

~40,000 cores in BSC Marenostrum
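Replicated simulations scale because each replica is independent of the others. The sketch below shows only that fan-out pattern with the Python standard library; it is not the biobb or PyCOMPSs API. `run_replica` is a placeholder that does no real work, and the mutation labels are hypothetical. The closing arithmetic assumes the ~40,000 cores are split evenly over 200 replicas, which the slide does not state.

```python
from concurrent.futures import ThreadPoolExecutor


def run_replica(mutation):
    """Placeholder for one independent simulation (setup, run, analysis)."""
    return {"mutation": mutation, "status": "done"}


def run_all(mutations, max_workers=8):
    """Run every replica independently; replicas never talk to each other."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_replica, mutations))


def cores_per_replica(total_cores, n_replicas):
    """Even split of the allocation over the replicas (an assumption)."""
    return total_cores // n_replicas


mutations = [f"mut_{i:03d}" for i in range(200)]  # hypothetical labels
results = run_all(mutations)
share = cores_per_replica(40_000, len(mutations))
print(len(results), share, 400_000 // share)  # replicas, cores each, atoms per core
```

Under that even-split assumption each replica stays well above the 100 atoms per core rule, which is how the whole allocation can be used efficiently.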
