Given the extensive occurrence of basic residues in the G3 CRD and SCR models, electrostatic potentials were calculated to determine patterns of negatively-charged groups in glycosaminoglycans that might match the arrangement of these basic residues. While basic residues have been implicated in the formation of link protein-hyaluronate interactions (Hardingham et al, 1976; Lyon, 1986), protein-carbohydrate interactions can also be mediated by hydrogen bonds and hydrophobic contacts, and electrostatic potentials provide no information on these. Hyaluronate contains one carboxyl group per GlcA-GlcNAc disaccharide. Four sets of coordinates from fibre diffraction studies of hyaluronate in triclinic, tetragonal or orthorhombic unit cells (Brookhaven codes: 1hya, 2hya, 3hya, 4hya) were analysed to determine the average distance between adjacent pairs of carboxyl groups in linearly extended conformations of hyaluronate. Irrespective of the rotational twist between pairs of saccharide residues, the separation of neighbouring carboxyl groups was determined as 0.91 ± 0.10 nm from 10 pairs. Since these carboxyl groups are rotated by 120° or 90° along the long axis of the oligosaccharide chain, this means that carboxyl groups will appear on the same side of the oligosaccharide chain for possible interactions with basic protein residues at an
Figure 5.7: [Below]
Molecular graphics modelling of the SCR in G3. α-Helices and β-strands are identified as in Figure 5.5. The positions of the two disulphide bridges are denoted by thick black lines. The positions of the conserved basic residues Arg/Lys21, Arg/Lys23, Arg31, Arg/His41 and Arg47 are marked by spheres.
appropriate multiple of 0.91 nm. Potential protein-hyaluronate interaction sites can therefore be predicted if basic residues are found that are spaced apart by about 0.91 nm, 1.82 nm or 2.73 nm, depending on the relative orientation of protein and hyaluronate. This was visualised in electrostatic maps for hyaluronate, which showed its carboxyl groups as negative lobes spaced 0.91 nm apart when the contour scale (Section 5.2) was changed from ±2 kT/e to ±5 kT/e. By comparison, similar calculations for chondroitin 4-sulphate disaccharides, GlcA-GalNAc sulphate (1c4s, 2c4s), yielded inter-carboxyl group distances of 1.1 nm and inter-sulphate group distances of 1.3 nm, and the heparin dodecasaccharide structure, IdoA sulphate-GlcN disulphate (1hpn), gave an inter-carboxyl group distance of 0.99 nm and inter-sulphate group distances of 0.6-1.4 nm.
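The spacing analysis described above can be illustrated with a minimal Python sketch. It assumes that the carboxyl carbon positions have already been extracted as (x, y, z) coordinates in nm; the function names and the example coordinates are hypothetical and are not taken from the 1hya-4hya entries.

```python
import math
from statistics import mean, stdev


def adjacent_distances(coords_nm):
    """Distances (nm) between consecutive points, e.g. successive carboxyl carbons."""
    return [math.dist(a, b) for a, b in zip(coords_nm, coords_nm[1:])]


def spacing_summary(coords_nm):
    """Return (mean, sample standard deviation, number of pairs) of adjacent distances."""
    d = adjacent_distances(coords_nm)
    sd = stdev(d) if len(d) > 1 else 0.0
    return mean(d), sd, len(d)


def candidate_separations(spacing_nm, max_multiple=3):
    """Integer multiples of the repeat spacing at which basic residues could pair up."""
    return [round(k * spacing_nm, 2) for k in range(1, max_multiple + 1)]


# Hypothetical positions along a straight chain axis (NOT taken from 1hya-4hya).
example_coords = [(0.0, 0.0, 0.91 * i) for i in range(4)]
avg, sd, n_pairs = spacing_summary(example_coords)
print(f"{avg:.2f} nm from {n_pairs} pairs (sd {sd:.2f})")
print(candidate_separations(0.91))
```

With the 0.91 nm spacing reported in the text, `candidate_separations` reproduces the 0.91 nm, 1.82 nm and 2.73 nm values quoted above.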
Electrostatic surface potentials were calculated from the crystal structures of mannose-binding protein and E-selectin. Both proteins demonstrated relatively neutral surfaces with isolated small areas of positive and negative charge. Mannose-binding protein showed a large intense region of negative charge at the top of the CRD fold which corresponds to the two Ca²⁺ binding sites 1 and 2 (Figure 5.5c). E-selectin showed a similar but smaller region of negative charge that corresponds to the single Ca²⁺ binding site 2. As expected, the G3 CRD model exhibited a negative surface charge at the Ca²⁺ binding sites 1 and 2 that was similar in size to that seen for mannose-binding protein. However, in contrast to the two crystal structures, the G3 CRD model exhibited a greater asymmetry of surface potential. A large area of positive charge, not present in the two crystal structures, occurred close to the conserved residue Arg24. Two other areas of positive charge occurred close to the conserved residues Arg29 and Arg76 (Figure 5.5a). The distance between the ζ-carbon atoms of Arg24 and Arg29 was estimated as 2.1 nm, and that between the ζ-carbon atoms of Arg29 and Arg76 was 2.0 nm. The corresponding separations of the α-carbon atoms were 1.4 nm and 2.2 nm respectively. Even though the ζ-carbon separations are not well defined because of sidechain flexibility, they are comparable with twice the 0.91 nm separation of carboxyl groups in hyaluronate and other glycosaminoglycans, or with a 1.3 nm separation of sulphate groups in chondroitin sulphate.
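Interatomic separations of the kind quoted above can be measured directly from a coordinate file. The following is a minimal sketch, assuming coordinates in Brookhaven (PDB) fixed-column ATOM records with values in Å; the helper names are hypothetical, and the two atoms are synthetic positions chosen to give a known answer, not coordinates from the G3 model.

```python
import math


def parse_atoms(pdb_text):
    """Map (chain, residue number, atom name) -> (x, y, z) in nm from ATOM records."""
    atoms = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()      # atom name, columns 13-16
        chain = line[21]                # chain identifier, column 22
        resseq = int(line[22:26])       # residue number, columns 23-26
        # x, y, z occupy columns 31-38, 39-46 and 47-54; convert from Angstrom to nm
        xyz = tuple(float(line[i:i + 8]) / 10.0 for i in (30, 38, 46))
        atoms[(chain, resseq, name)] = xyz
    return atoms


def separation_nm(atoms, key_a, key_b):
    """Distance in nm between two atoms identified by (chain, residue number, atom name)."""
    return math.dist(atoms[key_a], atoms[key_b])


def atom_line(serial, name, res, chain, resseq, x, y, z):
    """Build a synthetic ATOM record with the standard column layout (illustration only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Synthetic example: two Arg CZ atoms placed 20 Angstrom apart (made-up coordinates).
demo = "\n".join([
    atom_line(1, "CZ", "ARG", "A", 24, 0.0, 0.0, 0.0),
    atom_line(2, "CZ", "ARG", "A", 29, 12.0, 16.0, 0.0),
])
demo_atoms = parse_atoms(demo)
print(round(separation_nm(demo_atoms, ("A", 24, "CZ"), ("A", 29, "CZ")), 2))
```

Replacing `"CZ"` with `"CA"` in the lookup keys gives the corresponding α-carbon separation for the same residue pair.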
Calculation of the surface potentials from the NMR structures of two SCR domains in complement factor H showed that, although the two exhibited small differences in electrostatic potential, positively- and negatively-charged areas were evenly distributed on their surfaces. In contrast, the predicted G3 SCR model exhibited an asymmetric distribution of surface potential, in which one side of the β-sheet sandwich exhibited three large areas of positive charge, while the other side of the structure was more negatively charged. These three areas coincided with the conserved pair of residues at Lys21-Arg23, those at Arg31-Arg47, and the single residue at Arg41 (Figure 5.7). The separations of the ζ-carbon atoms were 2.3 ± 0.3 nm between the Lys21-Arg23 and Arg31-Arg47 pairs, and 1.6 ± 0.1 nm between the Arg31-Arg47 pair and Arg41. The α-carbon separations were 1.6 ± 0.2 nm and 1.5 ± 0.2 nm respectively. Even allowing for sidechain flexibility, these distances are again comparable with the separation of carboxyl or sulphate groups in glycosaminoglycans.
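The comparison between sidechain separations and glycosaminoglycan repeat spacings can be made explicit with a short sketch. The separations and spacings below are the values quoted in the text; the 0.3 nm tolerance and the function names are assumptions introduced here for illustration, and this is not the procedure used to reach the conclusions above.

```python
def nearest_multiple(separation_nm, spacing_nm):
    """Return (k, residual): the closest integer multiple k >= 1 and the mismatch in nm."""
    k = max(1, round(separation_nm / spacing_nm))
    return k, separation_nm - k * spacing_nm


# Repeat spacings quoted in the text (nm).
SPACINGS = {
    "hyaluronate carboxyl": 0.91,
    "chondroitin 4-sulphate carboxyl": 1.1,
    "chondroitin 4-sulphate sulphate": 1.3,
    "heparin carboxyl": 0.99,
}

# Sidechain separations quoted in the text (nm).
SEPARATIONS = {
    "CRD Arg24-Arg29": 2.1,
    "CRD Arg29-Arg76": 2.0,
    "SCR pair-pair": 2.3,
    "SCR pair-Arg41": 1.6,
}

TOLERANCE_NM = 0.3  # assumed cut-off, not a value from the text


def matches(separations, spacings, tol=TOLERANCE_NM):
    """List (residues, group, multiple, residual) where the mismatch is within tol."""
    hits = []
    for residues, sep in separations.items():
        for group, spacing in spacings.items():
            k, resid = nearest_multiple(sep, spacing)
            if abs(resid) <= tol:
                hits.append((residues, group, k, round(resid, 2)))
    return hits


for hit in matches(SEPARATIONS, SPACINGS):
    print(hit)
```

Because the sidechain positions are flexible, the tolerance has a strong effect on which pairings are reported, so the output is best read as a list of candidates and not as evidence of binding.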
A model for G3 was formed from the linear extension of the C-terminal B7 β-strand of the CRD model into the extended N-terminal β-strand predicted just before the first Cys residue of the SCR model. A similar extended polypeptide link joins the CRD and epidermal growth factor domains in the crystal structure of E-selectin (Graves et al, 1994). The length of the linker polypeptide is sufficient to suggest that the steric connection between the CRD and SCR domains is relatively unconstrained. While no information is available on interdomain orientations, the simplest connection between the two domains (by analogy with E-selectin) resulted in the creation of a cleft between the two domains that is lined by basic residues (Figures 5.8 and 5.9). Even if the SCR domain were to be rotated about its longest axis by 180° relative to the CRD domain, the presence of Lys133 on the opposite face of the CRD domain would lead to a positively-charged cleft between the two domains. The six positively-charged surface areas in the two domains do not form a continuous band of charges separated by about 2 nm. Nonetheless the formation of such a cleft lined by positive charges appears to be a distinctive property of the G3 region in aggrecan and other related proteoglycans. This is suggestive of a functional role for G3 that involves its interactions with a strongly anionic glycosaminoglycan present in cartilage, rather than with carbohydrate through
Figure 5.8: [Below]
Molecular model of a possible interdomain structure for the CRD and SCR domains in G3 of human aggrecan. The two domains are linked by an extended polypeptide strand. The sidechain structures of conserved basic residues are shown in full (A: CRD domain; B: SCR domain), and secondary structures are labelled as in Figures 5.5 and 5.7.
[Figure labels: CRD; Arg24A, Arg40A, Arg76A; Lys21B, Arg31B, Arg41B, Arg47B]
Figure 5.9: [Below]
Electrostatic potential maps for the model of G3 in human aggrecan. The model of Figure 5.8 is shown in three views rotated in successive 90° steps about the vertical axis compared to the starting position shown in Figure 5.8. Red: negative charge; blue: positive charge.
the binding sites in the CRD domain. From an evolutionary standpoint, the binding of G3 to hyaluronate, chondroitin sulphate or keratan sulphate is unlikely, because G1 already binds to hyaluronate and because large amounts of chondroitin sulphate and keratan sulphate are already present in aggrecan itself. It is more likely that G3 has evolved to bind to an oligosaccharide, perhaps to a sulphated glycosaminoglycan such as heparan sulphate, or to a sialylated oligosaccharide, or to some other anionic ligand.
The size of the G3 model is consistent with electron microscopy data on G3. The dimensions of the model were 7.0 nm × 5.7 nm × 3.9 nm. These are comparable with the diameter of 8.3 ± 1.3 nm reported for chicken G3 after rotary shadowing (Dennis et al, 1990), and the range of diameters of 8 to 11 nm (± 2 nm) likewise reported for several types of bovine G3 (Mörgelin et al, 1989). Since the decoration effect in rotary shadowing leads to a systematic overestimation of 2-3 nm in diameter, the G3 model accounts for the observed size of G3 in electron micrographs. If an EGF domain is present in G3, the longest dimension would be increased by at most 3 nm (factor IXa; Brookhaven code 1pfx). This may account for the higher diameters reported by Mörgelin et al (1989) for G3 from sclera and tendon proteoglycans. The G3 CRD model has dimensions of 4.9 nm × 4.7 nm × 4.1 nm. These would be consistent with the smaller diameter of 3.5 ± 1 nm for the G3 domain reported by Paulsson et al (1987).
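The size comparison for rotary-shadowed chicken G3 can be checked with simple arithmetic, sketched below. Treating the quoted ± value as a plain interval and subtracting the 2-3 nm decoration overestimate from it is an assumption made here for illustration; the function names are hypothetical.

```python
def corrected_range(diameter_nm, error_nm, over_min_nm=2.0, over_max_nm=3.0):
    """Range of underlying diameters after removing the shadowing overestimation."""
    low = diameter_nm - error_nm - over_max_nm
    high = diameter_nm + error_nm - over_min_nm
    return low, high


def within(value_nm, bounds):
    """True if value_nm lies inside the (low, high) interval."""
    low, high = bounds
    return low <= value_nm <= high


model_dims = (7.0, 5.7, 3.9)         # G3 model dimensions from the text (nm)
chicken = corrected_range(8.3, 1.3)  # rotary-shadowed chicken G3 (Dennis et al, 1990)
print(chicken, within(max(model_dims), chicken))
```

Under these assumptions the corrected range for chicken G3 is about 4.0 to 7.6 nm, which contains the 7.0 nm longest dimension of the model.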
5.4. DISCUSSION