
1.4.1 Why high-field NMR in combination with computational tools is ideal for protein-carbohydrate investigation

The previous sections set out the theoretical framework of this thesis: on the one hand, glycobiology and intermolecular interactions; on the other, Nuclear Magnetic Resonance, with particular attention to the NOE and its extensive application in structural studies.

Before stating the aims of the current work, it is worth emphasising the vast potential of NMR for studying biologically relevant protein-ligand interactions. More than any other class of ligands, carbohydrates are complex, flexible and stereochemically rich entities, which makes them interesting subjects for NMR observation. Nevertheless, the chemical similarity of carbohydrate monomers, and of the protons within each ring, leads to chemical shift overlap: the proton chemical shift dispersion of carbohydrates is generally low, with chemical shifts concentrated between 3.2 ppm and 4.1 ppm for the ring protons H2 to H6, and between 4.4 ppm and 5.2 ppm for the anomeric ones [64]. Thus, to study carbohydrates at atomic detail, high-field spectrometers are necessary. The 800 MHz Bruker spectrometer available at the School of Pharmacy at UEA is well suited to investigating protein-carbohydrate interactions.
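The benefit of a higher field can be illustrated with a short calculation. The sketch below converts the 3.2 to 4.1 ppm ring-proton window quoted above into a frequency width in Hz, using the fact that 1 ppm corresponds to as many Hz as the proton frequency of the spectrometer in MHz. The function name and the 400 MHz comparison field are illustrative assumptions, not part of the original work.

```python
def ppm_to_hz(delta_ppm: float, spectrometer_mhz: float) -> float:
    """Width in Hz of a chemical-shift window given in ppm.

    1 ppm equals `spectrometer_mhz` Hz at that 1H Larmor frequency.
    """
    return delta_ppm * spectrometer_mhz


# Ring-proton window (H2 to H6) taken from the text above.
ring_window_ppm = 4.1 - 3.2

# 800 MHz is the spectrometer named in the text;
# 400 MHz is an assumed lower field, used only for comparison.
for field_mhz in (400.0, 800.0):
    width_hz = ppm_to_hz(ring_window_ppm, field_mhz)
    print(f"{field_mhz:.0f} MHz: ring-proton window spans {width_hz:.0f} Hz")
```

Scalar couplings are independent of the field when expressed in Hz, so doubling the field doubles the frequency range over which the same multiplets are spread, which reduces overlap.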

A proficient level of expertise in computational analysis is also required to model the 3D structure of the complexes in solution on the basis of the experimental structural data. Molecular docking is a good starting point for providing 3D models of protein-ligand interactions, and the Schrödinger Maestro suite has been reported to be the most efficient tool for modelling flexible carbohydrates in relatively shallow binding pockets [65] (still, Molecular Dynamics remains the method of choice to account for the full flexibility of the system). As we will see in the following chapter, validation of STD NMR data against the simulated 3D models is necessary, and the best available tool for this is CORCEMA-ST (COmplete Relaxation and Conformational Exchange Matrix for Saturation Transfer), a MATLAB code released in 2002 by the group of Rama Krishna [66].

In our research group, the coexistence of these three fields of expertise, together with the availability of the 800 MHz Bruker spectrometer, the Maestro suite and the CORCEMA-ST script (under licence), provided the ideal environment for developing novel STD NMR approaches for the structural investigation of a number of biologically relevant protein-carbohydrate interactions at atomic detail.

In this environment, and over the three years of this doctoral project, we were frequently in touch with research groups inside and outside the UK and took part, to a smaller or larger extent, in several collaborations. Only the two largest (and most successful) works of biological relevance are included in this thesis: namely, a drug discovery project on the structural study of cholera toxin inhibitors, and the fundamental structural investigation of an intramolecular trans-sialidase from the gut microbiota. Still, the steady flow of other side projects was an exceptional source of training and inspiration. It allowed us to study many different biological systems from every natural kingdom, which constantly encouraged us to experiment further.

1.4.2 Our initial aims for novel STD NMR methodology development

Investigating such a wide range of protein-ligand interactions was a strong stimulus to experiment with STD NMR, as we were pushed to tailor the technique to the features of every different system.

The initial aim of the present thesis was to expand the potential of Saturation Transfer Difference NMR, exploiting some novel conceptual ideas.

Specifically, we wanted to answer the following questions:

1. It was known that different irradiation frequencies slightly affect STD NMR results. Can these differences be exploited? Can the inhomogeneity of spin diffusion help to track the different pathways for direct and indirect saturation transfer, and so give information on the architecture of the binding pocket?

2. Does the presence of protons from water affect the saturation transfer from the polar residues containing exchangeable protons?

3. Can we extract further information from STD NMR experiments with direct irradiation on the proton frequencies of the ligand? Can we observe intramolecular NOEs across the bound ligand in STD NMR experiments? Can we observe inter-ligand NOEs in STD NMR experiments, in a system containing two ligands bound to adjacent subsites?

4. If the answers to those questions are positive, can we provide a standardised protocol that allows the scientific community to implement our findings in their research?

The answers to these questions are reported in Chapter 3.

1.4.3 Cholera toxin inhibition: investigation of a novel class of GM1 antagonists

The first of the biological investigations undertaken involved the structural elucidation of the binding of a promising lead and its fragments, provided by the research group of Inmaculada Robina (University of Seville), to the GM1 binding subsite of the cholera toxin subunit B (CTB).

A long history of drug discovery studies has aimed at designing GM1 antagonists to serve as CTB inhibitors and prevent the onset of cholera infection. Against this background, the distinctive feature of the CTB binders we investigated was their limited carbohydrate character, which makes them more “drug-like” than many of the ligands proposed before.

Namely, the main lead was based on a scaffold containing a thio-galactose and a polyhydroxyalkylfuroate-aromatic moiety, designed to bind in the GM1 binding pocket, at the two well-characterised galactose and sialic acid subsites, respectively. In a previous work, on the basis of the STD NMR binding epitopes of the three ligands, a qualitative bidentate binding mode was postulated [67].

Hence, the main question we wanted to answer was:

1. Can we determine quantitatively the binding mode of the main lead and its fragments? Does the binding actually involve the two known binding subsites?

Our finding that the polyhydroxy moiety did not occupy the sialic acid binding subsite, but a novel groove adjacent to it, opened new questions:

2. Do the sialic acid subsite and the novel subsite exclude each other, or do they coexist? Can they both be occupied at the same time?

3. What is the source of the specificity of the novel class of ligands (the polyhydroxyalkylfuroate binding to the novel subsite, or the galactose binding to the galactose subsite)?

4. What is the impact of structural variations of the scaffold on binding affinity? How can the penalties associated with the flexibility of the polyhydroxy chain be reduced?

The answers to these questions are reported in Chapter 4.

1.4.4 Investigating the specificity of an IT-sialidase from gut microbiota: structural study on the binding mode of sialoglycans

The research group of Nathalie Juge at the Quadram Institute of Biotechnology is strongly focused on the investigation of the gut symbiont Ruminococcus gnavus (R. gnavus), a mucin degrader able to bind, hydrolyse and metabolise the sialic acid capping the glycans exposed on the mucus. The recently discovered link between R. gnavus and inflammatory bowel disease has brought renewed attention to this organism [68].

In particular, we have studied a sialidase from R. gnavus, RgNanH. The enzyme consists of i) a carbohydrate binding module recognising sialic acid (CBM40), and ii) an enzymatically active domain (GH33), which converts the α2/3-linked sialic acid capping the mucins into a tricyclic derivative of sialic acid itself: 2,7-anhydro-Neu5Ac. Because of this feature of its enzymatic domain, RgNanH is defined as an intramolecular trans-sialidase (IT-sialidase), a class of enzymes that currently comprises three members.

We undertook two sub-projects, focusing on the enzymatic domain and on the carbohydrate binding module, with the aim of investigating the domains in terms of affinity, specificity and mechanism of recognition towards a library of α2/3 and α2/6 sialoglycans.

For GH33, our target questions were:

1. Which are the main elements of the ligands for molecular recognition?

2. Does GH33 select α2/3 sialoglycans over α2/6 sialoglycans?

The finding that GH33 binds to both 3’-sialyllactose (3’SL) and 6’-sialyllactose (6’SL), although only 3’SL is a hydrolysable substrate, opened more questions:

3. Do 3’SL and 6’SL bind to the same subsite? What are their relative affinities?

4. What is the 3D structure of the complexes?

5. Which are the key elements of the interactions?

6. How can the specificity of the reaction for α2/3 sugars be explained?

7. How can the prevalence of the intramolecular trans-reaction over the intermolecular trans-reaction be explained?

For CBM40, our target questions were:

1. Which are the main elements of the ligands for molecular recognition?

2. Which is the minimum sugar entity recognised by CBM40?

3. What is the selectivity of CBM40 towards sialoglycan linkages?

4. How do sugar decorations (e.g., the presence of an N-glycolyl group in place of the N-acetyl group on the sialic acid, or the N-acetylation of the galactose and glucose moieties) affect the binding?


Chapter 2

Techniques and tools