In mass spectrometry (MS)-based proteomics, peptide AA sequences are determined by performing MS2 scans, which isolate and subsequently fragment precursor ions. Frequently, only part of a precursor’s isotopic distribution is captured due to isolation windows that are too narrow or are offset relative to the
precursor. Experiments using data-dependent acquisition typically use isolation windows that are 1.4-4 m/z
wide (Michalski et al., 2011; Scheltema et al., 2014). With a 1.4 m/z wide isolation window, only one to three isotopic peaks of a charge +2 peptide can fit within its boundaries. For example, if the window is centered
more than 0.2 m/z below the monoisotopic peak, then only the monoisotopic peak would be isolated. This can occur
for co-eluting peptides that were not the intended target of an MS2 scan because their m/z position relative to the isolation window is random. Since co-fragmentation is encountered in as many as 50% of MS2 spectra of complex samples, isolation of unexpected isotopes from co-eluting peptides is common (Houel et al., 2010). The isolation of only some isotopes leads to fragments with complex isotope distributions; these distributions depend on the subset of isolated precursor isotopes and the elemental compositions of both the precursor and the fragment of interest. While a general method to calculate the theoretical isotope distribution of a fragment has been developed, this method requires exact knowledge of those inputs (Rockwood et al., 2003). Typically,
peptide AA sequences and elemental compositions are unknown a priori. Therefore, computational tasks that
occur prior to sequence determination, including MS2 de-isotoping, monoisotopic mass calculation, charge assignment of fragment peaks, and chimeric spectra deconvolution, do not take full advantage of fragment isotopic distributions. In order to improve these pre-processing endeavors and to increase protein and peptide identifications, an efficient method is needed to approximate theoretical fragment isotope distributions based on observed peaks and isolation window parameters.
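The window arithmetic described above can be checked with a short sketch. The function name `isolated_isotopes`, the example m/z values, and the use of the 13C-12C mass difference as the isotopic peak spacing are assumptions made for illustration; an idealized rectangular isolation window is also assumed.

```python
# Approximate spacing between adjacent isotopic peaks of a singly charged ion.
# Assumption of this sketch: use the 13C-12C mass difference.
NEUTRON_SPACING = 1.003355

def isolated_isotopes(mono_mz, charge, center_mz, width, max_isotope=10):
    """Return isotope indices (0 = monoisotopic) whose m/z lies in the window."""
    lo = center_mz - width / 2.0
    hi = center_mz + width / 2.0
    isolated = []
    for k in range(max_isotope + 1):
        # Peaks of a charge-z ion are spaced roughly 1/z apart in m/z.
        peak_mz = mono_mz + k * NEUTRON_SPACING / charge
        if lo <= peak_mz <= hi:
            isolated.append(k)
    return isolated

mono = 500.0  # hypothetical monoisotopic m/z of a charge +2 peptide
# Window centered near the second isotopic peak: three peaks fit.
print(isolated_isotopes(mono, 2, mono + 0.5, 1.4))
# Window centered 0.25 m/z below the monoisotopic peak: one peak fits.
print(isolated_isotopes(mono, 2, mono - 0.25, 1.4))
```

Real isolation windows have sloped edges, so peaks near the boundaries are attenuated rather than cleanly included or excluded.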
The isotope distribution of a molecule arises from the varying number of neutrons in its individual elements. In mass spectrometry, there are two types of isotope distributions to consider: precursor (or natural) isotope distributions and fragment isotope distributions. A molecule’s precursor isotope distribution is its
distribution of isotope abundances prior to fragmentation. After fragmentation, however, the isotope distribution of a particular fragment molecule is called a fragment isotope distribution. When computing theoretical isotope distributions for either type, there are two further scenarios: either the elemental composition of the molecule is known, or it is not. When the elemental composition is known, its theoretical precursor isotope distribution can be computed using methods such as polynomial expansion, multinomial expansion or the fast Fourier transform (FFT) (Brownawell and Filippo, 1982; Yergey, 1983; Rockwood et al., 1995).
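As a minimal sketch of the polynomial expansion idea for a known elemental composition, each atom's isotope abundances can be treated as polynomial coefficients and multiplied together by repeated convolution. The abundance values below are approximate natural abundances, the composition is purely illustrative, and the distribution is aggregated by the number of extra neutrons (nominal mass) rather than by exact mass.

```python
# Approximate natural isotope abundances, indexed by extra neutrons
# relative to the lightest isotope (values assumed for illustration).
ABUNDANCES = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
    "S": [0.9499, 0.0075, 0.0425, 0.0, 0.0001],
}

def convolve(a, b, max_len):
    """Multiply two polynomials, keeping only the first max_len terms."""
    out = [0.0] * min(len(a) + len(b) - 1, max_len)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < len(out):
                out[i + j] += x * y
    return out

def isotope_distribution(composition, max_isotopes=6):
    """Probability of 0, 1, 2, ... extra neutrons for a composition dict."""
    dist = [1.0]
    for element, count in composition.items():
        for _ in range(count):
            # Truncation is safe: higher terms never feed back into lower ones.
            dist = convolve(dist, ABUNDANCES[element], max_isotopes)
    return dist

# Hypothetical peptide-like composition, not taken from the text.
example = {"C": 50, "H": 80, "N": 14, "O": 15, "S": 1}
for k, p in enumerate(isotope_distribution(example)):
    print(k, round(p, 4))
```

Convolving one atom at a time is simple but slow for large molecules; the FFT-based methods cited above address that cost.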
However, if a molecule’s elemental composition is not known, but the molecule is composed of similar structural units such as amino acids or nucleotides, then its theoretical precursor isotope distribution can be approximated in one of two ways. The most common method is to first approximate the elemental composition using the Averagine model, which represents the elemental composition of an average amino acid weighted by frequency in the human proteome, and then to compute the corresponding theoretical precursor isotope distribution (Senko et al., 1995). A fractional Averagine model was later developed that allowed continuous values for element counts and therefore avoided discontinuities due to element rounding, but was also more computationally intensive (Renard et al., 2008). The second approximation method utilizes the relationship between mass and isotope ratios. In the case of peptides, approximate precursor isotope distributions are reconstructed by evaluating polynomial functions that are fit to the isotope ratios and masses of peptides
generated in silico (Valkenborg et al., 2008; Ghavidel et al., 2014). However, because sulfur has a distinctive isotope
distribution, the number of sulfur atoms within a peptide causes these patterns to diverge, particularly for shorter peptides. If the number of sulfurs can be determined, then a more accurate prediction of isotope ratios can be achieved by utilizing models that are fit specifically to peptides with the same sulfur count.
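The Averagine approximation and the rounding discontinuity it introduces can be sketched as follows. The per-residue element counts and average residue mass are the commonly quoted Averagine values; the function names are hypothetical, and the hydrogen adjustment that some implementations apply to match the target mass is omitted for brevity.

```python
# Commonly quoted Averagine unit: average element counts per residue
# and its average mass in Da.
AVERAGINE = {"C": 4.9384, "H": 7.7583, "N": 1.3577, "O": 1.4773, "S": 0.0417}
AVERAGINE_MASS = 111.1254

def fractional_averagine(mass):
    """Continuous element counts for a molecule of the given average mass."""
    units = mass / AVERAGINE_MASS
    return {element: n * units for element, n in AVERAGINE.items()}

def averagine_composition(mass):
    """Integer element counts obtained by rounding the fractional counts."""
    return {el: int(round(n)) for el, n in fractional_averagine(mass).items()}

# Rounding makes the sulfur count jump from 0 to 1 between these two masses,
# while the fractional count changes smoothly.
for mass in (1300.0, 1400.0):
    print(mass,
          averagine_composition(mass)["S"],
          round(fractional_averagine(mass)["S"], 3))
```

The jump in the rounded sulfur count is one example of the discontinuities that the fractional model avoids, and it matters most for small peptides, where a single sulfur atom changes the isotope pattern noticeably.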
The second type of isotope distribution, the fragment isotope distribution, arises from more than just the elemental composition of the molecule. During the isolation and fragmentation of an individual precursor isotopic peak, each precursor in the population has the same number of neutrons, but the locations of the extra neutrons vary. Consequently, the isotope distribution of a fragment depends on the stochastic arrangement of neutrons within the precursor. The isotope distribution of a specific fragment is governed by the probabilities of extra neutrons residing in the given fragment versus its complementary fragment. Isolating multiple precursor isotopic peaks adds further complexity, as the resultant fragment isotope distributions are linear combinations of the fragment isotope distributions stemming from individual precursor peaks. Conveniently, isolation of the complete isotopic distribution creates fragments whose distributions are equivalent to the fragment’s natural isotope distribution. For the case where the elemental compositions of the precursor and
fragment are known, and only a single precursor isotopic peak was fragmented, software has been developed to calculate the theoretical fragment isotope distribution (Ramaley and Herrera, 2008). Unfortunately, utilization of this method has been minimal (Rockwood and Palmblad, 2013). Extending the framework to handle the fragmentation of multiple precursor isotopic peaks, as well as providing a method to approximate fragment isotope distributions, will increase its utility and range of applications. Such opportunities exist in the pre-processing of MS2 spectra from molecules of unknown elemental composition, whose methods often rely upon approximate precursor isotope distributions (Carvalho et al., 2009; Xiao et al., 2015; Chen et al., 2006; Horn et al., 2000; Zabrouskov et al., 2005; Liu, 2011; Kou et al., 2014; Yuan et al., 2011; Mechtler, 2016).
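The dependence on the fragment and its complementary fragment can be illustrated with a small sketch. It assumes that the natural distributions of the fragment and the complementary fragment are already available as lists indexed by the number of extra neutrons, that neutrons are counted nominally, and that isolation is ideal. The function name and the toy probabilities are hypothetical and are not taken from the cited software.

```python
def fragment_distribution(frag, comp, isolated):
    """Fragment isotope distribution given the isolated precursor isotopes.

    frag[k] and comp[j] are the natural probabilities that the fragment and
    the complementary fragment carry k and j extra neutrons, respectively.
    isolated holds the precursor isotope indices inside the isolation window.
    """
    weights = [0.0] * len(frag)
    for n in isolated:
        for k in range(len(frag)):
            j = n - k  # neutrons left for the complementary fragment
            if 0 <= j < len(comp):
                weights[k] += frag[k] * comp[j]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no isolated precursor isotope has non-zero probability")
    return [w / total for w in weights]

# Toy natural distributions (assumed values, for illustration only).
frag = [0.6, 0.3, 0.1]
comp = [0.5, 0.3, 0.2]

print(fragment_distribution(frag, comp, [0]))       # monoisotopic peak only
print(fragment_distribution(frag, comp, [1]))       # second isotopic peak only
print(fragment_distribution(frag, comp, range(5)))  # complete isolation
```

With complete isolation the result reduces to the fragment's natural distribution, consistent with the observation above, and isolating several peaks yields a combination of the single-peak results weighted by how abundant each precursor peak is.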
Here, I developed methods that approximate fragment isotope distributions when elemental compositions are not known. I re-derived the existing general framework for fragment isotope distributions of individual precursor isotopic peaks and then extended it for subsets of isotopes. Next, I incorporated the Averagine model within this framework in order to support biomolecules of unknown elemental compositions. Given that sulfurs have a large effect on the isotope distributions of small peptides, which are abundant in MS2 spectra, I developed a sulfur-specific Averagine method and evaluated it on both precursors and fragments. Furthermore, I observed that individual precursor isotope probabilities followed a smooth non-linear pattern, so I summarized them with splines and used those splines in place of the Averagine model. I evaluated
the accuracy and speed of these methods on in silico digested peptides, mass spectrometry experiments utilizing the
angiotensin I peptide, and complex peptide mixtures from HeLa cell lysate.