
Since Percolator is instantiated from MSPro, any number of peptide hits per spectrum can be submitted for re-scoring and hence re-ranking. As mentioned previously, all peptide hits (top 10) for both Mascot and Digger are automatically re-scored using the fast cross-correlation procedure prior to submission to Percolator. Whether only the top-ranked hit or multiple hits are submitted to Percolator depends to some extent on the data analysis task and/or the search algorithm used. For example, if the aim is to determine how many spectra in a dataset have significant-scoring multi-peptide identifications (because of complex mixtures or deliberate multiplexing), then multiple peptide hits per spectrum would be submitted to Percolator. For the Mascot algorithm this is implemented using a user-defined delta score (i.e. a delta score of 1 would essentially allow all isobaric peptides to be submitted, assuming Q=K or I=L). A possible issue with this approach is that the total number of decoy hits might swamp the total number of target hits, since the Mascot scores for decoy hits are very similar to one another. A different approach is taken for Digger: a “rank”, which indicates how many target peptide matches are considered outliers (with a maximum of 5), is output to the search result file for every tandem mass spectrum analysed. Using this approach, an identical number of target and decoy peptide hits per spectrum is submitted to the Percolator post-processing tool (Lukas Kall, personal communication). In the following section we explore how the re-scoring performs on large-scale peptidomics and proteomics data sets, relative to the default search algorithm's raw score, in the presence and absence of extended feature sets.
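The two hit-selection strategies described above can be sketched as follows. This is a minimal illustration and not the MSPro implementation: the `Hit` class and the function names are hypothetical, and it assumes that higher scores are better, that the delta score is measured against the top-ranked hit of the same spectrum, and that target and decoy hits are held in separate lists.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    peptide: str
    score: float      # raw search-algorithm score (assumed: higher is better)
    is_decoy: bool


def select_by_delta_score(hits: List[Hit], max_delta: float) -> List[Hit]:
    """Mascot-style selection: keep hits scoring within max_delta of the top hit."""
    if not hits:
        return []
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    top_score = ranked[0].score
    return [h for h in ranked if top_score - h.score <= max_delta]


def select_balanced(targets: List[Hit], decoys: List[Hit],
                    rank: int, max_rank: int = 5) -> List[Hit]:
    """Digger-style selection: keep the same number of target and decoy hits.

    `rank` is the per-spectrum number of outlier target matches; it is capped
    at max_rank and at the number of hits actually available on each side.
    """
    n = min(rank, max_rank, len(targets), len(decoys))
    top_targets = sorted(targets, key=lambda h: h.score, reverse=True)[:n]
    top_decoys = sorted(decoys, key=lambda h: h.score, reverse=True)[:n]
    return top_targets + top_decoys
```

With a delta score of 1, two sequences differing only by I/L receive near-identical scores and are both retained, whereas the balanced selection guarantees that decoy hits cannot outnumber target hits for any spectrum.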

5.3.1.1 Large-scale peptidomics dataset

The large-scale peptidomics dataset used to illustrate the sensitivity and specificity of the Digger search algorithm in Chapter 4 was re-analysed using MSPro, in order to determine whether re-scoring provides an increase in significant-scoring peptide identifications. The base feature set for both Mascot and Digger comprises the raw search algorithm score plus the standard features used to represent the peptide-spectrum matches (see Chapter 3, Table 3.1). The extended feature set includes the cross-correlation features (i.e. XCorr, deltaCn and XCRank) and/or the retention time (Rt) of the PSM. A comparison of the base and extended feature sets for both algorithms is shown in Fig. 5.2. For Mascot, inclusion of the XCorr features results in a dramatic improvement over the base and Rt-extended feature sets. For Digger, the improvements are more modest, but the inclusion of the extended feature set still provides a noticeable benefit. For these low mass-accuracy fragment ion peptidomics data sets, when only the top-ranked peptide hit and its extended feature set are submitted to Percolator for re-scoring, the Mascot and Digger search algorithms perform similarly.
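As an illustration of how the base and extended feature sets differ, the sketch below derives the three cross-correlation features for one candidate and assembles the three feature-set variants compared in Fig. 5.2. The function names and the `raw_score` key are hypothetical. The deltaCn definition used here (relative XCorr gap to the best competing candidate) is the conventional SEQUEST-style one and is an assumption; the exact definition used by MSPro is not stated in this section.

```python
from typing import Dict, Optional, Sequence


def xcorr_features(xcorrs: Sequence[float], index: int) -> Dict[str, float]:
    """Return XCorr, deltaCn and XCRank for candidate `index` of one spectrum."""
    xcorr = xcorrs[index]
    others = [x for i, x in enumerate(xcorrs) if i != index]
    next_best = max(others) if others else 0.0
    # Assumed SEQUEST-style definition: relative gap to the best competitor.
    delta_cn = (xcorr - next_best) / xcorr if xcorr > 0 else 0.0
    # 1-based rank of this candidate when all candidates are ordered by XCorr.
    xc_rank = 1 + sum(1 for x in others if x > xcorr)
    return {"XCorr": xcorr, "deltaCn": delta_cn, "XCRank": float(xc_rank)}


def build_features(base: Dict[str, float],
                   rt: Optional[float] = None,
                   xc: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Assemble a base, Rt-extended, or Rt-plus-XCorr feature set for one PSM."""
    features = dict(base)          # base set: raw score plus standard features
    if rt is not None:
        features["Rt"] = rt        # retention-time extension
    if xc is not None:
        features.update(xc)        # cross-correlation extension
    return features
```

Calling `build_features(base)`, `build_features(base, rt=...)` and `build_features(base, rt=..., xc=...)` yields the default, Rt and Rt-plus-XCorr variants respectively.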


Figure 5.2: Analysis of the peptidomics data sets using Mascot and Digger and re-scoring of top-ranked peptide hits with base and extended feature sets by Percolator.

The number of identified PSMs is shown as a function of Percolator’s q-value (global FDR) for both search algorithms. Base feature sets, comprising each search algorithm's primary score (blue and black lines), are compared with extended feature sets (retention time only: red and cyan lines; Rt plus XCorr metrics: green and mauve lines). Plotted series: Digger_default, Digger_Rt, Digger_Rt_XCorr, Mascot_default, Mascot_Rt and Mascot_Rt_XCorr.
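For readers unfamiliar with how a curve of identified PSMs against q-value is obtained, the following is a minimal sketch of a simple target-decoy q-value estimate. It is not Percolator's own estimation procedure, which is more elaborate; the function names are hypothetical, higher scores are assumed to be better, and tied scores receive no special handling.

```python
from typing import List, Tuple


def q_values(psms: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """Return (score, is_decoy, q_value) tuples sorted by descending score."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR when accepting everything down to this score.
        fdrs.append(decoys / max(targets, 1))
    # q-value: the smallest FDR at this score or at any lower-scoring threshold.
    result = []
    running_min = float("inf")
    for (score, is_decoy), fdr in zip(reversed(ranked), reversed(fdrs)):
        running_min = min(running_min, fdr)
        result.append((score, is_decoy, running_min))
    result.reverse()
    return result


def count_accepted(psms: List[Tuple[float, bool]], q_threshold: float) -> int:
    """Number of target PSMs with q-value at or below q_threshold."""
    return sum(1 for _, is_decoy, q in q_values(psms)
               if not is_decoy and q <= q_threshold)
```

Evaluating `count_accepted` over a grid of thresholds (for example 0.000 to 0.020) gives one curve of the kind plotted in the figure.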

5.3.1.2 Large-scale proteomics dataset

The large-scale proteomics data set (high mass-accuracy fragment ions) used to illustrate the sensitivity and specificity of the Digger search algorithm in Chapter 4 was re-analysed using MSPro, in order to determine whether extended feature sets (XCorr metrics and Rt) and Percolator re-scoring of top-ranked peptide hits provide an increase in significant-scoring peptide identifications. Firstly, for these data sets, where the fragment ions were analysed in the Orbitrap with high resolution and high mass accuracy, the Digger search algorithm clearly outperforms Mascot (Fig. 5.3 (A) and (B)), irrespective of feature set. For both search algorithms there is a noticeable increase in identifications when additional features are included, with the extended feature set (XCorr plus Rt) performing best. These results are perhaps not that surprising when one considers recent anecdotal evidence in the literature regarding re-scoring using extended feature sets [133]. Based on these results, it is clear that the Mascot algorithm derives the most benefit from Percolator re-scoring based on extended feature sets, and that the XCorr metrics serve as useful additional features, resulting in an increase in significant-scoring peptide identifications. The Digger algorithm, on the other hand, does not benefit as much (Fig. 5.3 (B)), because its scoring function already takes advantage of the discrimination afforded by the high mass-accuracy fragment ions and the PPM fragment ion tolerance.
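The discrimination afforded by a PPM fragment ion tolerance can be illustrated with a short calculation. This is a generic sketch and not Digger's scoring function; the function names and the tolerance values used in the note below (10 ppm and 0.5 Da) are illustrative assumptions and are not taken from the analyses above.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_ppm(observed_mz: float, theoretical_mz: float, tol_ppm: float) -> bool:
    """True if the fragment matches within a relative (ppm) tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def within_da(observed_mz: float, theoretical_mz: float, tol_da: float) -> bool:
    """True if the fragment matches within an absolute (Da) tolerance."""
    return abs(observed_mz - theoretical_mz) <= tol_da
```

At m/z 500, a 10 ppm window is only about 0.005 wide on either side, whereas a 0.5 Da window corresponds to 1000 ppm. A peak observed at 500.3 would therefore be accepted as a match under the absolute tolerance but rejected under the ppm tolerance, which reduces the number of random fragment matches contributing to decoy scores.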


Figure 5.3: Analysis of the proteomics data sets using Mascot and Digger and re-scoring of top-ranked peptide hits with base and extended feature sets by Percolator.

The number of identified PSMs is shown as a function of Percolator’s q-value (global FDR). (A) Both search algorithms: base feature sets, comprising each search algorithm's primary score (blue and black lines), compared with extended feature sets (retention time only: red and cyan lines; Rt plus XCorr metrics: green and mauve lines). (B) Digger results only.