2.1.4 Drug addiction: What is drug addiction?
2.1.4.7 Routes of drug administration.
(range of Affymetrix signals: 356-47515, median: 2046). Figure 4.13 demonstrates the number of CLL samples that expressed the cognate transcript encoding these proteins. In addition, 10 examples of these proteins are shown in Table 4.6. This analysis increased the confidence of the protein identification based on a single peptide in a single experiment.
4.2.6 Analysis of the most and the least frequently identified proteins
Different factors, such as poor solubility or low abundance, are associated with the difficulty of identifying some proteins by mass spectrometry (Issaq, 2001; Brewis and Brennan, 2010). To explore why some of the 900 proteins were identified in only one MALDI mass spectrometry analysis with a single peptide ID (108 proteins), independent published transcriptomic data of six CLL samples (Huttmann et al., 2006) were used. The Affymetrix signals of the mRNA encoding the 900 proteins were used as a proxy for the abundance of their protein products. The list of the 900 proteins was sorted in descending order according to the number of times the protein was detected in MALDI mass spectrometry analyses, followed by peptide count and best ion score (BIS) C.I.%. The Affymetrix signals (mean in six CLL samples) of the mRNA encoding the top proteins (n = 110: found in 10-20 MALDI MS runs with multiple peptides) were compared to those of the
[Figure 4.13: pie chart. Segment labels: 62%, 24%, 4%, 3%, 7%. Legend: 6 CLL samples, 5 CLL samples, 4 CLL samples, 2 CLL samples, 1 CLL sample.]
Figure 4.13: The number of CLL samples that express transcripts encoding the proteins that were identified with a single peptide in a single experiment. Transcriptomic data derived from 6 CLL samples were used to add more confidence to the protein identification based on one peptide in a single experiment (108 proteins). Of the 93 of these proteins that had a match with Affymetrix IDs, 99% had a cognate transcript expressed in CLL samples. Transcriptomics data were taken from a previously published study (Huttmann et al., 2006).
Protein name | Accession Number | TIS | BIS | BIS C.I. % | Average of Affymetrix signal | Sample count
General vesicular transport factor p115 | USO1_HUMAN | 102 | 103 | 100 | 4091 | 6
Eukaryotic translation initiation factor 3 subunit I | EIF3I_HUMAN | 94 | 94 | 100 | 3284 | 6
Chromatin target of PRMT1 protein | CHTOP_HUMAN | 89 | 89 | 100 | 3677 | 6
Mps one binder kinase activator-like 1B | MOL1B_HUMAN | 83 | 83 | 100 | 11781 | 6
26S protease regulatory subunit 4 | PRS4_HUMAN | 81 | 81 | 100 | 3434 | 6
E3 SUMO-protein ligase RanBP2 | RBP2_HUMAN | 47 | 48 | 99.92 | 3644 | 6
Kinectin | KTN1_HUMAN | 48 | 48 | 99.92 | 6707 | 6
Microsomal glutathione S-transferase 3 | MGST3_HUMAN | 48 | 48 | 99.92 | 2141 | 6
NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 6 | NDUA6_HUMAN | 48 | 48 | 99.91 | 3118 | 6
RNA-binding protein 4 | RBM4_HUMAN | 47 | 47 | 99.91 | 2676 | 6
Table 4.6: Transcriptomic data of CLL cells support protein identifications based on one peptide in a single MALDI mass spectrometry analysis.
Independently published transcriptomic data derived from six CLL samples (Huttmann et al., 2006) were used to check whether CLL samples expressed the transcripts encoding the proteins that were identified with a single peptide in a single experiment. The analysis showed that the vast majority of these proteins had a transcript expressed in CLL samples. This table shows 10 examples of these proteins with the highest and lowest BIS C.I.%. TIS: total ion score; BIS: best ion score; BIS C.I.%: best ion score confidence interval percentage.
bottom proteins (n = 108: detected in a single MALDI MS run with a single peptide). The analysis demonstrated that the mRNA encoding the most frequently detected proteins was 3.3 times more abundant than that encoding the least frequently detected proteins (p = 5.4 × 10⁻⁷ using unpaired Student's t-test; Figure 4.14). This analysis suggested that low abundance might have limited the identification of some proteins to only one MALDI mass spectrometry analysis based on a single peptide.
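The 3.3-fold difference can be checked against the group means and standard deviations given in the caption of Figure 4.14. The sketch below is illustrative only: it works from those summary statistics rather than the per-protein signals, and the function and variable names are assumptions introduced here. It computes the fold change and the pooled-variance (Student's) t statistic; it does not attempt to reproduce the reported p-value, which was presumably calculated on the individual signals.

```python
import math

# Summary statistics as reported in the caption of Figure 4.14:
# mean Affymetrix signal, standard deviation, number of proteins.
top = {"mean": 13062.0, "sd": 15670.0, "n": 110}     # found in 10-20 MS runs
bottom = {"mean": 3991.0, "sd": 6246.0, "n": 108}    # found in a single MS run

def fold_change(a, b):
    """Ratio of the two group means."""
    return a["mean"] / b["mean"]

def pooled_t(a, b):
    """Unpaired Student's t statistic (equal variances assumed) from summary stats."""
    df = a["n"] + b["n"] - 2
    pooled_var = ((a["n"] - 1) * a["sd"] ** 2 + (b["n"] - 1) * b["sd"] ** 2) / df
    se = math.sqrt(pooled_var * (1 / a["n"] + 1 / b["n"]))
    return (a["mean"] - b["mean"]) / se, df

fc = fold_change(top, bottom)
t_stat, df = pooled_t(top, bottom)
print(f"fold change = {fc:.1f}, t = {t_stat:.2f}, df = {df}")
```

The fold change rounds to 3.3, in agreement with the text, and the degrees of freedom equal 216 for the 218 proteins compared.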
4.2.7 NP40 and SDS fractions: similar numbers of MS/MS spectra with different numbers of protein identifications
More proteins were consistently identified in the NP40 fraction than in the SDS fraction. To understand why this might occur, Protein Pilot software was used to analyse the number of spectra utilised and the number of distinct peptides identified in four NP40 fractions and four SDS fractions. Interestingly, similar numbers of spectra were identified in the NP40 fractions (54%) and in the SDS fractions (46%). However, the number of distinct peptides sequenced from the NP40 fractions was almost 1.7-fold greater than that from the SDS fractions (Table 4.7). This implies that the difference in protein IDs reflects the internal complexity and relative protein abundance of the fractions rather than any technical difference caused by the differential detergent extraction.
[Figure 4.14: bar chart. Axis: Gene expression (Affymetrix signal), 0 to 35000. Groups: proteins detected in 10-20 mass spectrometry analyses with multiple peptides; proteins detected in 1 mass spectrometry analysis with single peptides. Annotations: p = 5.4 × 10⁻⁷; n = 218 proteins.]
Figure 4.14: Abundance of the most frequently and least frequently identified proteins. Affymetrix signals of the mRNA encoding the most and least frequently detected proteins were used to potentially indicate the abundance of these two groups of proteins. The analysis was conducted on 218 proteins; 110 were found in 10-20 mass spectrometry analyses with multiple peptides, while 108 were detected in a single mass spectrometry analysis with a single peptide. This figure shows that the mRNA encoding the most frequently detected proteins was 3.3 times more abundant than that encoding the least frequently detected proteins (mRNA average ± SD: 13062 ± 15670 versus 3991 ± 6246). The transcriptomics data were obtained from a previously published CLL study (Huttmann et al., 2006).
Measure | Total numbers: NP40 Fractions | Total numbers: SDS Fractions | % Total: NP40 Fractions | % Total: SDS Fractions
Total spectra | 25910 | 21236 | 55 | 45
Non-empty spectra | 22777 | 18782 | 55 | 45
Spectra identified >95% confidence | 11063 | 9387 | 54 | 46
Distinct peptides >95% confidence | 3282 | 1959 | 63 | 37
Table 4.7: Summary of the total spectra and distinct peptides that were identified in the NP40 fractions and SDS fractions.
The proteomic analyses of either 4 pooled SDS fractions or 4 NP40 fractions were separately analysed using Protein Pilot software coupled with the Paragon search algorithm to obtain the statistical summary of the spectra and distinct peptides. This table shows similar numbers of utilised spectra in the NP40 fractions and the SDS fractions, but a greater number of distinct peptides were identified in the NP40 fractions compared to the SDS fractions.
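The percentage columns of Table 4.7 and the roughly 1.7-fold difference in distinct peptides follow directly from the raw counts. A minimal sketch of that arithmetic is given below; the counts are copied from the table, while the helper name `shares` is an assumption made for illustration.

```python
# Counts copied from Table 4.7 as (NP40 fractions, SDS fractions).
counts = {
    "Total spectra": (25910, 21236),
    "Non-empty spectra": (22777, 18782),
    "Spectra identified >95% confidence": (11063, 9387),
    "Distinct peptides >95% confidence": (3282, 1959),
}

def shares(np40, sds):
    """Percentage of the combined total contributed by each fraction type."""
    total = np40 + sds
    return round(100 * np40 / total), round(100 * sds / total)

percentages = {row: shares(*pair) for row, pair in counts.items()}

# Ratio of distinct peptides sequenced from NP40 versus SDS fractions.
np40_pep, sds_pep = counts["Distinct peptides >95% confidence"]
peptide_ratio = np40_pep / sds_pep

for row, (p_np40, p_sds) in percentages.items():
    print(f"{row}: NP40 {p_np40}%, SDS {p_sds}%")
print(f"Distinct peptides, NP40/SDS ratio = {peptide_ratio:.2f}")
```

The spectra are split close to evenly between the two fraction types, whereas the distinct peptides are not, which is the contrast the section draws on.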
4.2.8 Localisation of proteins identified in the NP40 fractions and SDS fractions
The rationale for developing the cellular fractionation method was that the NP40 fraction would be enriched with cytoplasmic proteins, while the SDS fraction would be enriched with nuclear proteins. This hypothesis was tested by using Gene Ontology data, via the Quick GO-EBI tool (http://www.ebi.ac.uk/QuickGO/), to analyse the localisation of proteins that were uniquely found in each fraction (NP40 fractions = 541 proteins; SDS fractions = 138 proteins). Only proteins with one Gene Ontology location were analysed, as proteins with multiple locations would not be informative about the precision of the cellular fractionation procedure. Figure 4.15 shows that the NP40 fraction was composed predominantly of proteins denoted cytoplasmic (48%) and also included proteins denoted membrane (23%), nucleus (15%) and mitochondria (13%). In contrast, 82% of proteins from the SDS fraction were denoted as having a nuclear location.
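The filtering step described above (keep only proteins with a single Gene Ontology location, then report the percentage per compartment) could be sketched as follows. This is not the procedure used in the study, which relied on the Quick GO-EBI tool; the function name, the input layout and the example annotations are all hypothetical and serve only to illustrate the logic.

```python
from collections import Counter

def localisation_percentages(annotations):
    """Percentage of proteins per compartment, counting only proteins
    annotated with exactly one cellular-component location."""
    single = [locs[0] for locs in annotations.values() if len(locs) == 1]
    tally = Counter(single)
    return {loc: 100 * n / len(single) for loc, n in tally.items()}

# Hypothetical annotations for illustration only (not data from this study).
example = {
    "PROT_A": ["cytoplasm"],
    "PROT_B": ["nucleus"],
    "PROT_C": ["cytoplasm", "nucleus"],  # excluded: more than one location
    "PROT_D": ["cytoplasm"],
    "PROT_E": ["membrane"],
}

result = localisation_percentages(example)
print(result)
```

Excluding multiply annotated proteins shrinks the denominator, so the percentages describe only the proteins whose location is unambiguous.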
4.2.9 Relationship between Affymetrix signal and the feasibility of identifying a protein by mass spectrometry
In an attempt to establish a relationship between the transcript level of a gene and the possibility of identifying its protein product by mass spectrometry, gene expression profiles of six CLL samples derived from good and poor prognosis patients (Huttmann et al., 2006) and the CLL proteomics data (900 proteins) generated in this study were used. The analysis was conducted only for genes with an absolute call of “present” that were identified in at
[Figure 4.15: bar chart. Axes: Percent of Proteins (0 to 100) and Cell Component (Cytoplasm, Membrane, Nucleus, Mitochondrion). Legend: NP40, SDS.]
Figure 4.15: Cell component analysis of identified proteins. Localisation of proteins that were uniquely identified in the NP40 fractions and in the SDS fractions was determined from Gene Ontology data via the Quick GO-EBI tool. The analysis confirmed that the NP40 fraction was enriched with cytoplasmic proteins, while the SDS fraction was enriched with nuclear proteins.
least three different CLL samples. After converting Affymetrix IDs into UniProt IDs, transcripts were grouped according to their Affymetrix signal into six groups (100000-33000, 33000-11000, 11000-3666, 3666-1200, 1200-400 and 400-0). The CLL proteomics data were used to determine the percentage of identifications in each group (Figure 4.16). The analysis demonstrated that the higher the Affymetrix signal for a transcript, the greater the chance of identifying its cognate protein in the proteome list generated in this study. Table 4.8 shows the Affymetrix signal intensity of some genes known to be important in the pathology of CLL, and the probability of identifying their cognate protein products by MALDI mass spectrometry.
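The grouping described above can be illustrated with a short sketch that assigns each transcript to one of the six signal groups and reports the percentage of transcripts in each group whose protein was identified. The bin boundaries come from the text; everything else is an assumption made for illustration, including the function names, the rule that a value lying exactly on a boundary falls into the higher group, and the example signals and identifications, which are hypothetical.

```python
import bisect

# Bin edges taken from the six signal groups named in the text.
EDGES = [0, 400, 1200, 3666, 11000, 33000, 100000]
LABELS = ["400-0", "1200-400", "3666-1200",
          "11000-3666", "33000-11000", "100000-33000"]

def signal_group(signal):
    """Label of the group containing `signal`; a value exactly on a
    boundary is placed in the higher group (an assumption)."""
    i = bisect.bisect_right(EDGES, signal) - 1
    return LABELS[min(max(i, 0), len(LABELS) - 1)]

def percent_identified(signals, identified):
    """Percentage of transcripts per group whose protein was identified.
    `signals` maps UniProt ID -> Affymetrix signal; `identified` is a set
    of UniProt IDs present in the proteomics list. Empty groups are omitted."""
    totals = {label: 0 for label in LABELS}
    hits = {label: 0 for label in LABELS}
    for uniprot_id, signal in signals.items():
        group = signal_group(signal)
        totals[group] += 1
        if uniprot_id in identified:
            hits[group] += 1
    return {g: 100 * hits[g] / totals[g] for g in LABELS if totals[g]}

# Hypothetical values for illustration only (not data from this study).
signals = {"P1": 50000, "P2": 45000, "P3": 5000, "P4": 6000, "P5": 300, "P6": 350}
identified = {"P1", "P2", "P3"}
by_group = percent_identified(signals, identified)
print(by_group)
```

Reporting a percentage per group, rather than a raw count, keeps groups with very different numbers of transcripts comparable.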
4.2.10 Qualitative proteomics and transcriptomic data highlighted