In this section a brief analysis on the expression values obtained independently from exon arrays and RNA-Seq is presented. Both cancer datasets assayed with RNA-Seq were compared to an exon array from the same tumor sample. In total one GBM and eight DLBCL exon arrays were analyzed using MEAP.
In Figure 14 the frequencies in expression values from one exon array and one RNA- Seq samples are shown. Both 14a and 14b correspond to the lymphoma dataset. It can be observed that RNA-Seq expression values have a wider range than exon arrays which results in more reliable estimates and fold change for highly expressed genes [RBY+12].
(a) Exon array. (b) RNA-Seq.
Figure 14: Exon array and RNA-Seq example of expression values frequencies. RNA-Seq frequency values appear to be normally distributed, while it was not the case with neither the GBM (not shown) nor the lymphoma exon array sample.
The Spearman correlation for the expression values in log2 in both technologies was 0.71 for GBM and 0.85 for lymphoma. These correlation values are con- sistent with previous comparisons of exon array and RNA-Seq in the literature [MMM+08, GGM+10, NPP+12]. Additionally, as reported in [NPP+12] the differ- ences in expression are not overly biased towards certain genes. The differences in expression levels from both technologies seem to equally affect low or highly ex- pressed genes. Table 2 shows the correlation values for the GBM set before and after quality control at gene and isoform level. The RNA-Seq samples have been grouped in high, medium or low depending on their correlation level with the exon array sample.
Samples Gene NoQC Isoform NoQC Gene QC Isoform QC
high 1 – 24 0.71 0.23 0.71 0.22
29 – 36
medium 25 – 28 0.45 0.21 0.36 0.24
low 37 – 38 0.15 0.06 0.23 0.19
Table 2: Summarized table of the correlation between RNA-Seq and exon array for genes and isoforms, both for the raw files (NoQC) and after being preprocessed for quality control. Samples have been grouped in high, medium and low correlation. Correlation values between QC and NoQC sets are not significantly different. Gene correlation values are consistent with the quality of the data files, low quality files present a lower correlation than high quality ones. It can be observed that correlation values are concordant as well with the alignment times from Figure 12. Samples in the low group took the longest to align, followed by the samples in the medium correlation group.
Figure 15 shows RNA-Seq expression values at gene level in comparison with exon array for one sample of both GBM and DLBCL datasets. RNA-Seq expression values reported by Cufflinks are consistently higher than exon array values. Saturation, cross-hybridization or missing probes in exon array could explain these differences, as well as overestimation of expression values for small transcripts by Cufflinks as mentioned earlier.
(a) GBM. (b) DLBCL.
Figure 15: Comparison between exon array and RNA-Seq expression values. The x = y line is included as reference.
The total number of genes and isoforms that were found to be expressed both in GBM and lymphoma are shown in Figure 16. In all cases, the majority of genes/isoforms were detected by both technologies, but it can observed that exon array does not report expression values for a significant number of genes/isoforms. Differences at splice variant level are to be expected due to the algorithms for re- constructing the mRNA transcripts.
(a) Genes in GBM. (b) Isoforms in GBM.
(c) Genes in DLBCL. (d) Isoforms in DLCBL.
Figure 16: Comparison of expression of genes/isoforms in RNA-Seq and exon array. The Venn diagram show how many genes in (a) and (c) and how many isoforms in (b) and (c) were found to have a positive expression value in each assay. In all cases, both technologies detected most of the genes/isoforms. A significantly higher number of isoforms were detected by RNA-Seq.
4.3
Differential expression
The last step in the analysis of the lymphoma dataset was to find differentially expressed genes between patients in remission and patients that have relapsed after chemotherapy. For this analysis Cuffdiff was used taking advantage of the fragment bias correction algorithm. In Table 3, the genes that were found to be significantly differentially expressed are shown. This table is a summarized version of Cuffdiff’s gene_exp.diff output file.
The genes in Table 3 have been classified according to functional annotations found in GENATLAS [Fre86], a gene database that compiles information from the Human Genome Project (HGP) [ISC01, VAEA01] and published literature. From the 17 genes in the table, six were associated to tumor progression (*), four with immune response or inflammation (-), and two with cell cycle (+), namely cell division and apoptosis.
Gene C1 C2 Value1 Value2 fold change p value q value FMO2 rel rem 0.330696 2.1235 2.68287 1.20956e-05 0.0118096 +CENPM rel rem 61.0118 26.119 -1.22399 4.22691e-09 1.92592e-05 SLC8A3 rel rem 2.59085 0.082365 -4.97525 6.14016e-07 0.000932553 *IL7 rel rem 3.66544 8.25381 1.17107 2.38937e-08 8.16506e-05 *WNT2 rel rem 2.84697 0.29141 -3.28831 4.33128e-05 0.0328912 *CA9 rel rem 4.74904 0.387615 -3.61494 3.75464e-07 0.000733174 SPRYD5 rel rem 0.0341405 4.41964 7.0163 4.01191e-05 0.0322581 CDHR3 rel rem 0.728758 13.8944 4.25292 2.192e-07 0.000499373 SUMF2 rel rem 114.863 68.3189 -0.749551 7.70301e-07 0.00105292 *FGFBP1 rel rem 43.9677 0.0462359 -9.89321 4.06736e-08 0.000111194 ABCA6 rel rem 32.2312 3.05142 -3.40091 7.41978e-10 5.07105e-06 *-IL23R rel rem 3.19042 0.226403 -3.81679 5.23704e-07 0.000894813
*GUCY1A3 rel rem 2.24613 4.12845 0.878158 0 0
+CIDEA rel rem 0.121502 6.86287 5.81976 9.06038e-06 0.0101298 -IGHV1-2 rel rem 3.39023 645.112 7.57202 3.60816e-05 0.0308249 -IGKV3-11 rel rem 2.76715 872.313 8.3003 9.63405e-06 0.0101298 -IGKV1-9 rel rem 0.696892 145.405 7.70492 2.09229e-05 0.0190663
Table 3: Differentially expressed genes in relapse and remission patients in lymphoma. Functional annotation of genes relevant to cancer is shown in the table with + for genes associated with cell cycle such as cell division or apoptosis, * for genes associated with tumor progression, and - indicates genes related to immune response or inflammation.
Interleukin 23 receptor (IL23R) is an interesting candidate for further analysis since it has been associated with both cancer progression and immune response. The protein encoded by IL23R is necessary IL23 signaling. In turn, IL23 is a cytokine part of the immune response to infection and it is also known to increase angio- genesis. Higher expression of IL23 or IL23R, in comparison to normal tissue, has been reported in several cancers and has been associated with tumor progression and metastasis [LZW+06, KXK+09, LZW+11]. Figure 17 shows the differential
expression at exon level of IL23R obtained with DEXSeq.
Figure 17: Interleukin 23 receptor. The red and blue lines do not correspond directly to exons, but to counting bins, red for relapsed patients and blue for remission. Underneath the counting bins the flattened gene model of IL23R is included.
5
Discussion
A framework capable of efficiently organizing large datasets, handling parallelization, and with the flexibility to keep each processing step as a separate module is of great aide in any deep sequencing experiment. The Anduril framework provided these characteristics to the RNA-Seq data analysis workflow described in this work. Parallelization is essential when working with large datasets, but it is not the only way to speed up the processing time. In RNA-Seq experiments, the alignment of the reads to the transcriptome is a crucial step in the analysis and one of the most computationally expensive. The quality control module refines the alignments while reducing the overall processing time in two ways. First, whole datasets were discarded early on in the process if their overall quality was too low, precluding the need of further processing them. Second, Bowtie can be significantly slowed down by poor quality base-calls in the reads [LTPS09], but this is avoided by the trimming and filtering steps of the QC module. In the GBM case study the alignment time was efficiently reduced by 30 hours. The processing time of the remaining tasks was also reduced since only a subset of the original dataset was further analyzed. The comparison with exon arrays was based on the expression values for genes and isoforms delivered by the pipeline. It was shown that our workflow performs as expected when RNA-Seq has been compared to exon arrays, and in fact the correlation obtained in the DLBCL dataset was higher than the one obtained using ALEXA-seq in [GGM+10]. Since RNA-Seq is not limited, like exon arrays, by the knowledge of the genome during probe design, it was observed that more genes and isoforms were detected by RNA-Seq in both datasets.
The automated RNA-Seq data analysis workflow presented in this work proved to be competent in identifying differentially expressed genes between two conditions in cancer samples. The enrichment towards tumor progression shows that the efficient analysis of RNA-Seq enables the identification of interesting candidate genes for fur- ther study and validation. IL23R identified both by Cuffdiff and DEXSeq has been also associated with cancer progression and poor outcome to therapy in pancreatic cancer [VNG+12] and according to [LZW+11], IL23 is a good candidate for cancer immunotherapy.
It is known that chromosomal rearrangements are common in cancer and the work- flow described in this work is not capable of identifying such events. Future work for the framework will be to add module for finding fusion genes and later on trying
a combined approach of genome-guided with de novo assembly that will shed more light on the cancer genome. Furthermore, the base pair resolution of RNA-Seq is not being exploited at its fullest in our computational framework. A module for calling variants would be a great addition to the pipeline, and in particular one that can make use of the wealth of information provided by the ENCODE project [Dun12]. Future advancements in library preparation protocols for RNA-Seq, as well as the increase in sequencing lengths will necessarily impact the methods developed for deep sequencing transcriptomics analysis. Aligning reads that span several exons may become more difficult to tackle. On the other hand, transcript assembly may become more reliable. Additionally, further decrease in sequencing costs may guar- antee a higher number of biological replicates that would give more certainty to differential expression analysis. Systematic and scalable computational frameworks for RNA-Seq analysis will be a necessity for the efficient handling and processing of even larger datasets.
References
Aff05 Affymetrix, GeneChip Human Tiling Arrays, 2005. URL http://media.affymetrix.com/support/technical/datasheets/ human_tiling_datasheet.pdf.
Aff05a Affymetrix, GeneChip Exon Array, 2005. URL http: //www.affymetrix.com/support/technical/technotes/exon_ array_design_technote.pdf.
AFP+04 Alba, R., Fei, Z., Payton, P., Liu, Y., Moore, S. L., Debbie, P., Cohn,
J., D’Ascenzo, M., Gordon, J. S., Rose, J. K. C., Martin, G., Tanksley, S. D., Bouzayen, M., Jahn, M. M. and Giovannoni, J., ESTs, cDNA microarrays, and gene expression profiling: tools for dissecting plant physiology and development. The Plant Journal : for cell and molecular biology, 39,5(2004), pages 697–714. URL http://www.ncbi.nlm.nih. gov/pubmed/15315633.
AH10 Anders, S. and Huber, W., Differential expression analysis for sequence count data. Genome Biology, 11,10(2010), page R106. URL http://www.pubmedcentral.nih.gov/articlerender. fcgi?artid=3218662\&tool=pmcentrez\&rendertype=abstract. AJL+10 Au, K. F., Jiang, H., Lin, L., Xing, Y. and Wong, W. H.,
Detection of splice junctions from paired-end RNA-seq data by SpliceMap. Nucleic Acids Research, 38,14(2010), pages 4570– 8. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi? artid=2919714\&tool=pmcentrez\&rendertype=abstract.
ABTA12 American brain tumor association. URL http://www.abta.org/news/ brain-tumor-fact-sheets/. Visited on 24-10-2012.
ARH12 Anders, S., Reyes, A. and Huber, W., Detecting differential usage of exons from RNA-seq data. Genome Research, 22,10(2012), pages 2008– 17. URL http://www.pubmedcentral.nih.gov/articlerender. fcgi?artid=3460195\&tool=pmcentrez\&rendertype=abstract. Bab12 Babraham Bioinformatics, FastQC: A quality control tool for high
throughput sequence data. URL http://www.bioinformatics. babraham.ac.uk/projects/fastqc/.
BG12 Bolger, A. and Giorgi, F., Trimmomatic: A flexible read trimming tool for Illumina NGS data. URL http://www.usadellab.org/cms/index. php?page=trimmomatic.
BJN+09 Birol, I., Jackman, S. D., Nielsen, C. B., Qian, J. Q., Varhol, R., Stazyk,
G., Morin, R. D., Zhao, Y., Hirst, M., Schein, J. E., Horsman, D. E., Connors, J. M., Gascoyne, R. D., Marra, M. a. and Jones, S. J. M., De novo transcriptome assembly with ABySS. Bioinformatics (Oxford, England), 25,21(2009), pages 2872–7. URL http://www.ncbi.nlm. nih.gov/pubmed/19528083.
Bri04 Brinkman, B. M. N., Splice variants as cancer biomarkers. Clinical Biochemistry, 37,7(2004), pages 584–94. URL http://www.ncbi.nlm. nih.gov/pubmed/15234240.
BW94 Burrows, M. and Wheeler, D., A block sorting lossless data compres- sion algorithm. Technical Report 124, Digital Equipment Corpora- tion, 1994. URL http://www.hpl.hp.com/techreports/Compaq-DEC/ SRC-RR-124.html.
CLH+11 Chen, P., Lepikhova, T., Hu, Y., Monni, O. and Hau- taniemi, S., Comprehensive exon array data processing method for quantitative analysis of alternative spliced vari- ants. Nucleic Acids Research, 39,18(2011), page e123. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= 3185423\&tool=pmcentrez\&rendertype=abstract.
DAD+08 Denoeud, F., Aury, J.-M., Da Silva, C., Noel, B., Rogier, O.,
Delledonne, M., Morgante, M., Valle, G., Wincker, P., Scarpelli, C., Jaillon, O. and Artiguenave, F., Annotating genomes with massive-scale RNA sequencing. Genome Biology, 9,12(2008), page R175. URL http://www.pubmedcentral.nih.gov/articlerender. fcgi?artid=2646279\&tool=pmcentrez\&rendertype=abstract. DOSR08 De Bona, F., Ossowski, S., Schneeberger, K. and Rätsch, G., Optimal
spliced alignments of short sequence reads. Bioinformatics (Oxford, England), 24,16(2008), pages i174–80. URL http://www.ncbi.nlm. nih.gov/pubmed/18689821.
DRA+12 Dillies, M.-a., Rau, a., Aubert, J., Hennequet-Antier, C., Jeanmou- gin, M., Servant, N., Keime, C., Marot, G., Castel, D., Estelle, J., Guernec, G., Jagla, B., Jouneau, L., Laloe, D., Le Gall, C., Scha- effer, B., Le Crom, S., Guedj, M. and Jaffrezic, F., A comprehen- sive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Briefings in Bioinformatics. URL http://bib.oxfordjournals.org/cgi/doi/10.1093/bib/bbs046. Dun12 Dunham, I. e. a., An integrated encyclopedia of DNA elements in the
human genome. Nature, 489,7414(2012), pages 57–74. URL http: //www.nature.com/doifinder/10.1038/nature11247.
EHWG98 Ewing, B., Hillier, L., Wendl, M. C. and Green, P., Base-Calling of Automated Sequencer Traces Using Phred. I. Accuracy Assessment. Genome Research, 8, pages 175–185.
Fre86 Frezal, J., Genatlas, Universite Paris Descartes. URL http://www. infobiogen.fr.
GFP+11 Grant, G. R., Farkas, M. H., Pizarro, A. D., Lahens, N. F.,
Schug, J., Brunk, B. P., Stoeckert, C. J., Hogenesch, J. B. and Pierce, E. a., Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM). Bioin- formatics (Oxford, England), 27,18(2011), pages 2518–28. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= 3167048\&tool=pmcentrez\&rendertype=abstract.
GGGT11 Garber, M., Grabherr, M. G., Guttman, M. and Trapnell, C., Com- putational methods for transcriptome annotation and quantification using RNA-seq. Nature Methods, 8,6(2011), pages 469–77. URL http://www.ncbi.nlm.nih.gov/pubmed/21623353.
GGL+10 Guttman, M., Garber, M., Levin, J. Z., Donaghey, J., Robinson, J., Adiconis, X., Fan, L., Koziol, M. J., Gnirke, A., Nusbaum, C., Rinn, J. L., Lander, E. S. and Regev, A., Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi- exonic structure of lincRNAs. Nature Biotechnology, 28,5(2010), pages 503–10. URL http://www.pubmedcentral.nih.gov/articlerender. fcgi?artid=2868100\&tool=pmcentrez\&rendertype=abstract.
GGM+10 Griffith, M., Griffith, O. L., Mwenifumbo, J., Goya, R., Morrissy, a. S., Morin, R. D., Corbett, R., Tang, M. J., Hou, Y.-c., Pugh, T. J., Robert- son, G., Chittaranjan, S., Ally, A., Asano, J. K., Chan, S. Y., Li, H. I., Mcdonald, H., Teague, K., Zhao, Y., Zeng, T., Delaney, A., Hirst, M., Morin, G. B., Jones, S. J. M., Tai, I. T. and Marra, M. A., Alternative expression analysis by RNA sequencing. Nature Meth- ods, 7,10(2010), pages 843–847. URL http://www.ncbi.nlm.nih.gov/ pubmed/20835245.
GHY+11 Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson,
D. a., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N., Gnirke, A., Rhind, N., di Palma, F., Birren, B. W., Nusbaum, C., Lindblad-Toh, K., Friedman, N. and Regev, A., Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature Biotechnology, 29,7(2011), pages 644–52. URL http://www.ncbi.nlm.nih.gov/pubmed/21572440. HGNL12 Hu, J., Ge, H., Newman, M. and Liu, K., OSA: a fast and accu-
rate alignment tool for RNA-Seq. Bioinformatics (Oxford, England), 28,14(2012), pages 1933–4. URL http://www.ncbi.nlm.nih.gov/ pubmed/22592379.
HK10 Hardcastle, T. J. and Kelly, K. a., baySeq: empirical Bayesian methods for identifying differential expression in sequence count data. BMC Bioinformatics, 11, page 422. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= 2928208\&tool=pmcentrez\&rendertype=abstract.
HW11 Hanahan, D. and Weinberg, R. a., Hallmarks of cancer: the next gen- eration. Cell, 144,5(2011), pages 646–74. URL http://www.ncbi.nlm. nih.gov/pubmed/21376230.
HWF00 Hanahan, D., Weinberg, R. A. and Francisco, S., The Hallmarks of Cancer Review University of California at San Francisco. Cell, 100, pages 57–70.
ISC01 International Human Genome Sequencing Consortium, Initial sequenc- ing and analysis of the human genome. Nature, 409,6822(2001), pages 860–921. URL http://www.nature.com/nature/journal/ v409/n6822/abs/409860a0.html.
JW08 Jiang, H. and Wong, W. H., SeqMap: mapping massive amount of oligonucleotides to the genome. Bioinformat- ics (Oxford, England), 24,20(2008), pages 2395–6. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= 2562015\&tool=pmcentrez\&rendertype=abstract.
KASR13 Kumar, V., Abbas, A., Aster, J. and Robbins, S., "Chapter 11 Hematopoietic and Lymphoid Systems." Robbins Basic Pathology. El- sevier Saunders, 2013.
Ken02 Kent, W. J., BLAT—The BLAST-Like Alignment Tool. Genome Re- search, 12,4(2002), pages 656–664. URL http://www.genome.org/ cgi/doi/10.1101/gr.229202.
KHNH11 Karinen, S., Heikkinen, T., Nevanlinna, H. and Hautaniemi, S., Data integration workflow for search of disease driving genes and genetic variants. PLOS ONE, 6,4(2011), page e18636. URL http://www.pubmedcentral.nih.gov/articlerender. fcgi?artid=3075259\&tool=pmcentrez\&rendertype=abstract. KKH+07 Krex, D., Klink, B., Hartmann, C., von Deimling, A., Pietsch, T.,
Simon, M., Sabel, M., Steinbach, J. P., Heese, O., Reifenberger, G., Weller, M. and Schackert, G., Long-term survival with glioblastoma multiforme. Brain : a journal of neurology, 130,Pt 10(2007), pages 2596–606. URL http://www.ncbi.nlm.nih.gov/pubmed/17785346. KWAB10 Katz, Y., Wang, E. T., Airoldi, E. M. and Burge, C. B.,
Analysis and design of RNA sequencing experiments for identify- ing isoform regulation. Nature Methods, 7,12(2010), pages 1009– 15. URL http://www.pubmedcentral.nih.gov/articlerender. fcgi?artid=3037023\&tool=pmcentrez\&rendertype=abstract. KXK+09 Kortylewski, M., Xin, H., Kujawski, M., Lee, H., Liu, Y.,
Harris, T., Drake, C., Pardoll, D. and Yu, H., Regulation of the IL-23 and IL-12 balance by Stat3 signaling in the tu- mor microenvironment. Cancer Cell, 15,2(2009), pages 114– 23. URL http://www.pubmedcentral.nih.gov/articlerender. fcgi?artid=2673504\&tool=pmcentrez\&rendertype=abstract.
KXOW07 Kapur, K., Xing, Y., Ouyang, Z. and Wong, W. H., Exon ar- rays provide accurate assessments of gene expression. Genome Biology, 8,5(2007), page R82. URL http://www.pubmedcentral. nih.gov/articlerender.fcgi?artid=1929160\&tool=pmcentrez\ &rendertype=abstract.
LBN+12 Lohse, M., Bolger, A. M., Nagel, A., Fernie, A. R., Lunn,
J. E., Stitt, M. and Usadel, B., RobiNA: a user-friendly, inte- grated software solution for RNA-Seq-based transcriptomics. Nu- cleic Acids Research, 40,Web Server issue(2012), pages W622– 7. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi? artid=3394330\&tool=pmcentrez\&rendertype=abstract.
LD09 Li, H. and Durbin, R., Fast and accurate short read align- ment with Burrows-Wheeler transform. Bioinformatics (Ox- ford, England), 25,14(2009), pages 1754–60. URL http: //www.pubmedcentral.nih.gov/articlerender.fcgi?artid=
2705234\&tool=pmcentrez\&rendertype=abstract.
Lei12 Leipzig, J., When can we expect the last damn microarray paper? Blogspot, January 18, 2012. URL http://jermdemo.blogspot.be/ 2012/01/when-can-we-expect-last-damn-microarray.html. Vis- ited on 3-October-2012.
LFJ11 Li, W., Feng, J. and Jiang, T., IsoLasso: a LASSO regression approach to RNA-Seq based transcriptome assembly. Journal of Computational Biology., 18,11(2011), pages 1693–707. URL http://www.ncbi.nlm. nih.gov/pubmed/21951053.
LGM+11 Lunter, G., Goodson, M., Meader, S., Hillier, L. W. and Locke, D.,
Stampy: a statistical algorithm for sensitive and fast mapping of Il- lumina sequence reads. Genome Research, 21,6(2011), pages 936– 9. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi? artid=3106326\&tool=pmcentrez\&rendertype=abstract.
Lin12 Lindgreen, S., AdapterRemoval: Easy Cleaning of Next Generation Sequencing Reads. BMC Research Notes, 5,1(2012), page 337. URL http://www.ncbi.nlm.nih.gov/pubmed/22748135.
LRD08 Li, H., Ruan, J. and Durbin, R., Mapping short DNA se- quencing reads and calling variants using mapping quality scores. Genome Research, 18,11(2008), pages 1851–8. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= 2577856\&tool=pmcentrez\&rendertype=abstract.
LTPS09 Langmead, B., Trapnell, C., Pop, M. and Salzberg, S. L., Ul- trafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology, 10,3(2009), page R25. URL http://www.pubmedcentral.nih.gov/articlerender. fcgi?artid=2690996\&tool=pmcentrez\&rendertype=abstract. LZW+06 Langowski, J. L., Zhang, X., Wu, L., Mattson, J. D., Chen, T., Smith,
K., Basham, B., McClanahan, T., Kastelein, R. a. and Oft, M., IL-23 promotes tumour incidence and growth. Nature, 442,7101(2006), pages 461–5. URL http://www.ncbi.nlm.nih.gov/pubmed/16688182. LZW+11 Lan, F., Zhang, L., Wu, J., Zhang, J., Zhang, S., Li, K., Qi, Y.
and Lin, P., IL-23/IL-23R: potential mediator of intestinal tumor pro- gression from adenomatous polyps to colorectal carcinoma. Interna- tional Journal of Colorectal Disease, 26,12(2011), pages 1511–8. URL http://www.ncbi.nlm.nih.gov/pubmed/21547355.
Mar08 Mardis, E. R., Next-generation DNA sequencing methods. Annual Review of Genomics and Human Genetics, 9, pages 387–402. URL http://www.ncbi.nlm.nih.gov/pubmed/18576944.
Met10 Metzker, M. L., Sequencing technologies - the next generation. Nature