Data presented in this study are based on the most comprehensive statistical
analysis of higher resolution Ca2+-binding structures available to date. While certain data
presented here with respect to EF-Hand proteins are generally consistent with previously reported studies, a clear distinction can be made between EF-Hand and non-EF-Hand proteins, based on the physical properties assessed. It is apparent from the data that non-EF-Hand CaBPs coordinate with fewer ligands, on average, than the EF-Hand proteins, and with a higher proportion of bound water molecules. Less formal charge is evident in the non-EF-Hand binding sites, which is expected given the lower proportion of charged sidechain ligands. It remains to be seen whether these properties can be
correlated with binding affinities. The EF-Hand sites additionally exhibit a bimodal distribution of sidechain Ca-O-C angles. This may reflect the abundance of Asp as a chelating ligand residue, a feature that may in turn be conserved along evolutionary
lines. In both classes, the majority of Ca2+ ions are surrounded by a holospheric binding
geometry. In the case of EF-Hand proteins, this frequently involves a pentagonal-bipyramid geometry, whereas the non-EF-Hand binding sites exhibit less regular structure. The Ca-O bond lengths for both classes were generally equivalent, but discrete differences were apparent in the bond angles, and in both cases the range of bond angles was narrower than previously assumed (Table 2.1). Additionally, the dihedral angles for non-EF-Hand and EF-Hand binding sites were generally equivalent, with low standard deviations, indicating that these values (168.1 ± 9.7° and 170.6 ± 7.1°) may be utilized as input parameters for computational design.
The significant differences between ligand types (carbonyl, sidechain, bidentate) demonstrate the necessity of classifying these angles separately. Moreover, the small standard deviation in each case provides a narrower range of ideal angles for each ligand type, thus improving the input parameters used to design proteins with specific
Ca2+-binding characteristics.
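As an illustration of how such mean and standard deviation values might serve as design inputs, the following minimal sketch treats each reported dihedral angle as an acceptance window of mean ± k·SD. The class name `AngleWindow`, the dictionary `DIHEDRAL`, and the default k = 2 are hypothetical choices made for this example. Assigning 168.1 ± 9.7° to non-EF-Hand sites and 170.6 ± 7.1° to EF-Hand sites assumes the values are listed in the same order as the classes in the sentence above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AngleWindow:
    """Mean and standard deviation of an angle, in degrees."""
    mean: float
    sd: float

    def contains(self, angle_deg: float, k: float = 2.0) -> bool:
        """Return True if angle_deg lies within mean ± k * sd."""
        return abs(angle_deg - self.mean) <= k * self.sd


# Dihedral values quoted in the text. The class-to-value mapping is an
# assumption based on the order in which the classes are named.
DIHEDRAL = {
    "non_ef_hand": AngleWindow(mean=168.1, sd=9.7),
    "ef_hand": AngleWindow(mean=170.6, sd=7.1),
}

# Screen a candidate dihedral angle from a designed site (hypothetical value).
candidate = 175.0
accepted = DIHEDRAL["ef_hand"].contains(candidate)
```

A narrower standard deviation yields a tighter window, which is why separate statistics per class (and per ligand type) give more restrictive design filters than a single pooled range.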
The physical parameters and key characteristics associated with Ca2+-binding in
different classes of CaBPs identified from our analysis have two-fold significance. First, structural parameters derived from a more current, comprehensive data set provide a
more accurate representation of Ca2+-binding, particularly between different classes of
CaBPs. Second, these data will provide input parameters to both improve the accuracy of prediction algorithms and facilitate the design of engineered CaBPs with high
selectivity and affinity for Ca2+. The data compiled in this analysis have been directly
applied to define weighted coefficients used in a graph theory-based prediction algorithm
documented site, with 94% sensitivity and 93% selectivity [222]. The algorithm also correctly identifies only those ligands comprising the binding site in 45 out of 48 test sites. These results are in part attributable to refinement of the algorithm based on the availability of more precise structural parameters obtained from the statistical analysis reported in this manuscript.
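The performance measures quoted above can be made concrete with a short sketch. The text does not define "selectivity"; the code assumes it means the fraction of predicted sites that are true sites, TP / (TP + FP), which is one common usage in binding-site prediction, although it could also denote specificity. The function names are hypothetical, and the toy counts are invented purely to exercise the formulas and are not data from this study. Only the 45-of-48 figure comes from the text.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true sites that were predicted: TP / (TP + FN)."""
    return tp / (tp + fn)


def selectivity(tp: int, fp: int) -> float:
    """Fraction of predicted sites that are true sites: TP / (TP + FP).

    Assumption: this definition is not stated in the source.
    """
    return tp / (tp + fp)


# Toy counts, NOT taken from the study, used only to show the arithmetic.
toy_sensitivity = sensitivity(tp=9, fn=1)   # 9 of 10 true sites found
toy_selectivity = selectivity(tp=9, fp=3)   # 9 of 12 predictions correct

# Figure reported in the text: exact ligand sets recovered in 45 of 48 sites.
exact_ligand_fraction = 45 / 48
```

Under these definitions, the reported 45 of 48 test sites corresponds to an exact ligand-set recovery rate of just under 94%.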
Due to the ubiquitous presence of CaBPs in biological processes, and the roles
of Ca2+ imbalance in different diseases, the ability to predict and identify Ca2+-binding
sites using computational methods can accelerate our understanding of these processes
and problems, and subsequently improve our ability to alter Ca2+-dependent functions for
4 Statistical analyses of Pb2+-binding in proteins
4.1 Pb2+-binding protein statistics
Table A.6 (Appendix) lists the binding sites retained for analysis, their PDB identifiers, and resolution of the crystal structure. Table A.1 (Appendix) summarizes the PDB data by binding site for retained sites, including coordination number (CN) values both with (PLW) and without (PL) water molecules, formal charge (FC), and binding mode (D – displacement, O – opportunistic, or U – unknown) by site. As seen in Table
S6, approximately 1/3 of the Pb2+-binding sites were identified as sites of ionic
displacement, indicating that these sites are also known to bind physiologically relevant ions, as listed in the Binding column. Statistical analysis of these two separate binding modes was not performed in this study due to limited data for each of the different metals listed.
Binding sites from the high-resolution dataset (DS HR) are identified by an asterisk preceding the PDB_ID. A charge of (-1) was assigned to acidic side-chain ligands Glu
and Asp, and the Cys thiol [218]. Table 4.1 presents a summary of all statistical data
from the analysis. A comparison of the values reported for DS HR and DS Final shows little difference in ligand distance values, coordination number, and charge, indicating that resolution did not significantly alter the results. Consequently, unless otherwise specified, only results from the DS Final dataset are discussed. These
statistical results for Pb2+ are then compared with data recently compiled for Ca2+ to
Table 4.1 Pb2+-binding statistics
Statistic                              DS HR        DS Final
Total PDB proteins in study            7            21
Total Pb binding sites evaluated       27           48
Total target ligand atoms              105          177
Total Oaa ligands                      86           118
Total OHOH ligands                     16           36
Total N ligands                        3            10
Total S ligands                        0            13
Total sites with N ligands             3            9
Mean CN, PLW                           3.9 ± 2.3    3.7 ± 2.0
  % CN 2-5                             77.8         77.1
  % CN 6-9                             22.2         16.7
Mean CN, PL                            3.3 ± 2.0    2.9 ± 1.7
  % CN 2-5                             70.4         72.9
  % CN 6-9                             14.8         8.3
Mean charge by site                    -1.8         -1.7
Total identified bidentate pairs       24           36
Total sites with bidentate ligands     21           30
% Sites with bidentate ligands         77.8         62.5
Mean distance, Pb-Oaa (Å)              2.7 ± 0.4    2.7 ± 0.4
Mean distance, Pb-OHOH (Å)             2.8 ± 0.3    2.8 ± 0.4
Mean distance, Pb-N (Å)                2.7 ± 0.3    2.6 ± 0.4
Mean distance, Pb-S (Å)                ---          3.2 ± 0.3
DS HR: High-resolution dataset (R ≤ 1.76 Å), 3.5 Å ligand-atom distance cutoff.
DS Final: Summary dataset, no restriction on resolution, 3.5 Å ligand-atom distance cutoff, refined for bidentate ligands.
CN: Coordination Number. PLW: Ligands from protein and water. PL: Ligands from protein only.
Oaa: Amino acid oxygen ligand. OHOH: Water oxygen ligand.
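The per-site quantities in Table 4.1 can be illustrated with a minimal sketch that applies the 3.5 Å distance cutoff and the charge rule stated in the text (-1 for Asp and Glu side-chain ligands and for the Cys thiol). This is not the procedure used in the study. The `Atom` class, the `site_stats` function, the PDB-style atom names, and the example coordinates are all hypothetical. The sketch also counts every O, N, or S atom within the cutoff and does not attempt the bidentate refinement applied to DS Final.

```python
from dataclasses import dataclass
from math import dist

CUTOFF = 3.5  # Å, ligand-atom distance cutoff from the Table 4.1 footnote

# Side-chain atoms carrying the -1 charge (Asp, Glu, Cys thiol), named
# here with standard PDB atom names as an assumption of this sketch.
CHARGED_ATOMS = {
    ("ASP", "OD1"), ("ASP", "OD2"),
    ("GLU", "OE1"), ("GLU", "OE2"),
    ("CYS", "SG"),
}


@dataclass(frozen=True)
class Atom:
    res_name: str   # e.g. "ASP", or "HOH" for water
    res_id: int     # residue number (chain ID omitted for brevity)
    name: str       # atom name, e.g. "OD1"
    element: str    # "O", "N", "S", ...
    xyz: tuple      # Cartesian coordinates in Å


def site_stats(metal_xyz, atoms, cutoff=CUTOFF):
    """Return CN with water (PLW), CN without water (PL), and formal charge."""
    ligands = [a for a in atoms
               if a.element in {"O", "N", "S"} and dist(metal_xyz, a.xyz) <= cutoff]
    protein = [a for a in ligands if a.res_name != "HOH"]
    # Charge is counted once per residue, so a bidentate Asp contributes -1.
    charged = {a.res_id for a in protein if (a.res_name, a.name) in CHARGED_ATOMS}
    return {"CN_PLW": len(ligands), "CN_PL": len(protein), "FC": -len(charged)}


# Hypothetical site: bidentate Asp, one backbone carbonyl, one water,
# and a Glu oxygen lying outside the cutoff.
example_atoms = [
    Atom("ASP", 10, "OD1", "O", (2.5, 0.0, 0.0)),
    Atom("ASP", 10, "OD2", "O", (0.0, 2.7, 0.0)),
    Atom("GLY", 12, "O", "O", (-3.0, 0.0, 0.0)),
    Atom("HOH", 201, "O", "O", (0.0, 0.0, 2.8)),
    Atom("GLU", 20, "OE1", "O", (5.0, 0.0, 0.0)),
]
stats = site_stats((0.0, 0.0, 0.0), example_atoms)
```

For this made-up site the function reports four ligands with water, three without, and a formal charge of -1, since both Asp oxygens belong to the same residue and the Glu oxygen lies beyond 3.5 Å.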