Given the recognised importance of carbohydrates in diverse biological processes, research on protein-carbohydrate interactions and on the molecular mechanisms responsible for binding recognition has increased. The number of protein-carbohydrate structures characterized and deposited in the PDB has also grown in recent years (~450 structures released since 2015, out of a total of ~2000 protein structure entries assigned as carbohydrate-binding), and much has been learned about the common molecular features that govern the formation of protein-carbohydrate complexes.
The formation of these complexes is driven by favourable changes in enthalpy (ΔH) and entropy (ΔS), and the free energy of binding (ΔG) becomes more favourable from monosaccharides to longer chain-length oligosaccharides113. Entropic penalties may also occur due to the restricted conformational freedom of the protein and ligand62. Additionally, multivalent protein-carbohydrate interactions lead to an increase in avidity113.
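The thermodynamic terms above are linked by the standard Gibbs relation. The block below is a reminder of that textbook relation, not a result taken from the cited studies; the symbols T (absolute temperature), R (gas constant), K_d (dissociation constant) and c° (standard concentration, 1 M) are introduced here for illustration and do not appear in the text.

```latex
\[
\Delta G = \Delta H - T\,\Delta S ,
\qquad
\Delta G^{\circ} = RT \ln\!\left(\frac{K_\mathrm{d}}{c^{\circ}}\right)
\]
```

Binding is favourable when ΔG < 0, so a more negative ΔH or a more positive ΔS strengthens binding, whereas an entropic penalty (ΔS < 0) makes the −TΔS term positive and offsets part of the enthalpic gain.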
It is well known that weak molecular forces are largely responsible for protein-carbohydrate recognition113,121–123. The most important interactions are van der Waals forces, electrostatic interactions and hydrogen bonds, with CH-π hydrogen bonding being a key determinant of the binding events113,121–123.
Although sometimes overlooked when studying protein-ligand interactions, water-mediated hydrogen bonding has also been shown to influence protein-carbohydrate affinity124. Given their ability to both donate and accept two hydrogen bonds, water molecules can mediate protein-ligand interactions. Ligands compete with water molecules for the protein binding site, where waters may be replaced, retained or displaced, which can favour or disfavour ligand binding125. Water molecules affect the ΔG of binding both enthalpically, through the displacement of poorly ordered waters or the formation of newly ordered water networks, and entropically, through the hydrophobic effect124,125. Water networks can make considerable contributions to protein-ligand affinity: perturbing these networks, even without disrupting the ligand or the protein, can substantially weaken enthalpically optimal interactions and introduce solvent mobility, thereby affecting ligand binding124.
CH-π interactions are a type of hydrogen bond in which aliphatic or aromatic CH groups act as the hydrogen donors and the π-systems of arenes act as the acceptors113,121. These interactions have also been referred to as ‘edge-to-face’, ‘T-shape’, ‘π-π’ or ‘arene-arene’ interactions121.
Carbohydrate-aromatic CH-π interactions are described as stacking interactions due to the parallel orientation of the interacting carbohydrate and the aromatic rings123 (Figure 1.11).
Figure 1.11. Examples of carbohydrate-aromatic CH-π interactions. (A) Family 6 CBM from Cellvibrio mixtus in complex with cellobiose (Glcβ4Glc) in its type C cleft (PDB ID: 1UYX)38, exhibiting CH-π stacking of Tyr33 and Trp92 against both faces of the first glucopyranose ring; and (B) Bacillus halodurans family 26 CBM bound to maltose (Glcα4Glc) (PDB ID: 2C3H)126, exhibiting CH-π interactions between the top faces of the glucose rings and Tyr25 and Trp37, with Tyr23 contributing to hydrogen bonding. Representations (not to scale) of the individual 3D structures were produced with the program Chimera40 using the PDB atomic coordinates.
Although usually weaker than other interactions, CH-π effects define the enthalpy of protein-carbohydrate binding113. The side chains of the aromatic amino acids Trp, Tyr, Phe and His are typical π-systems that can act as acceptor groups. The carboxyl or carboxamide side chains of Asp, Glu, Asn and Gln may also contribute. Additionally, main-chain peptide groups, which constitute π-systems with some degree of delocalization, may act as π-acceptors. The frequent occurrence of aliphatic and aromatic CH-π interactions, not only in proteins but also in nucleic acids, membrane lipids and polysaccharides, suggests an important functional role121.
Analysis of protein-carbohydrate structures deposited in the PDB has revealed that many carbohydrate-binding proteins contain aromatic amino acid residues in their binding sites and that these residues interact with their carbohydrate ligands in a stacking geometry through CH-π interactions113,122,123. Recent studies have revealed that aliphatic hydrophobic residues are disfavoured in carbohydrate-binding sites compared with aromatic side chains, with the strongest preference for Trp residues, followed by Tyr122,123. Aromatic CH-π interactions are found in carbohydrate-binding proteins such as CBMs, lectins and CAZymes, and are involved in a wide range of processes, from carbohydrate binding to catalytic processing and transport123.
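A geometric screen of the kind that could be used to flag candidate stacking CH-π contacts in a deposited structure can be sketched as follows. This is only an illustrative sketch, not the procedure used in the cited analyses: the function names are hypothetical, and the cut-offs (carbon to ring-centroid distance of at most 4.5 Å, and an angle of at most 40° between the ring normal and the centroid-to-carbon vector) are assumed values chosen for the example.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ring_centroid(ring):
    """Mean position of the aromatic ring atoms (x, y, z tuples, in angstroms)."""
    n = len(ring)
    return tuple(sum(atom[i] for atom in ring) / n for i in range(3))

def ring_normal(ring):
    """Unit normal of a near-planar ring; atoms must be listed in bonded order."""
    c = ring_centroid(ring)
    total = (0.0, 0.0, 0.0)
    for i in range(len(ring)):
        a = _sub(ring[i], c)
        b = _sub(ring[(i + 1) % len(ring)], c)
        cr = _cross(a, b)
        total = (total[0] + cr[0], total[1] + cr[1], total[2] + cr[2])
    length = math.sqrt(_dot(total, total))
    return (total[0] / length, total[1] / length, total[2] / length)

def ch_pi_geometry(carbon, ring):
    """Return (distance, angle): carbon to ring-centroid distance in angstroms
    and the angle in degrees between the ring normal and the centroid-to-carbon
    vector (0 degrees means the carbon sits directly above or below the ring)."""
    v = _sub(carbon, ring_centroid(ring))
    dist = math.sqrt(_dot(v, v))
    if dist == 0.0:
        return 0.0, 0.0
    cos_angle = min(1.0, abs(_dot(v, ring_normal(ring))) / dist)
    return dist, math.degrees(math.acos(cos_angle))

def is_ch_pi_candidate(carbon, ring, max_dist=4.5, max_angle=40.0):
    """Flag a sugar carbon as a candidate CH-pi donor for this ring.
    The default cut-offs are assumptions for illustration only."""
    dist, angle = ch_pi_geometry(carbon, ring)
    return dist <= max_dist and angle <= max_angle
```

For instance, with an idealised planar six-membered ring in the xy-plane, a carbon placed 3.7 Å above the ring centre passes the screen, whereas a carbon lying almost in the ring plane, or one 6 Å above it, does not. A real analysis would also need the hydrogen positions and the correct ring atoms for each Trp, Tyr, Phe or His side chain.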
β-D-glucopyranose is structurally predisposed for carbohydrate-aromatic interactions due to the
an aromatic system, such as that of Trp, Tyr or Phe, in a parallel stacking geometry123. This stacking interaction can occur at the top face, the bottom face or both faces of the carbohydrate ring (Figure 1.11A). In contrast, α-D-glucopyranose interacts only through the top face of the carbohydrate ring (Figure 1.11B), because the anomeric hydroxyl group blocks the bottom face123.
These carbohydrate-aromatic CH-π interactions have been defined as dispersion interactions, tuned by electrostatics and partially stabilized by a hydrophobic effect in solvated systems122,123. As electrostatic interactions are highly influenced by directionality and by the charge distribution on the donor and acceptor molecules, they can further strengthen and orient the carbohydrate-aromatic complexes123. Additionally, because the electrostatic surfaces and the electropositive character of the C-H bonds engaged in CH-π interactions differ between carbohydrate isomers, the aromatic side chains of the protein engage with different regions of the carbohydrate. This effect provides a mechanism for discriminating between carbohydrate monomers, influencing which ones bind to the protein and how they are positioned within carbohydrate-binding sites122.