Nano-MeDIP-seq was developed and used to analyse ageing LT-HSC methylomes in this study. Based on multiple metrics, the data generated is of remarkably high quality (Chapter 3). However, Nano-MeDIP-seq was not compared directly to the original MeDIP method. It is therefore not currently possible to identify the drawbacks of performing MeDIP-seq with lower starting DNA concentrations. These limitations need to be known in order to make informed decisions about suitable methylome analysis methods when designing future experiments. For example, the CpG enrichment scores observed in the data generated using Nano-MeDIP-seq appeared slightly lower than those in existing data from other studies within the 'Medical Genomics' group. A lower CpG enrichment score could have several causes: 1) lower global methylation in HSCs than in these other samples, which are typically cancer cells or tissues; 2) lower global methylation in C57BL/6 mice than in the human samples that are typically studied within the Medical Genomics group; 3) artificially lower CpG enrichment in Nano-MeDIP-seq data compared with the original MeDIP-seq method. To rule out the third scenario, future studies will need to perform MeDIP-seq experiments with both methods, using the same samples. This will allow direct comparison of the methylomes generated by the two methods and reveal any significant differences between them. Nevertheless, CpG coverage for all Nano-MeDIP-seq samples was comparable to that of MeDIP-seq [133], suggesting that Nano-MeDIP-seq is as efficient at detecting DNAm as the original MeDIP-seq method.
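The formula behind the CpG enrichment scores is not given in this section. As a minimal sketch, assuming one common definition (the relative frequency of CpG dinucleotides in the sequenced reads divided by the relative frequency in the reference genome), the score could be computed as follows. The function names and the toy sequences are hypothetical and serve only to illustrate the calculation.

```python
def cpg_relative_frequency(sequences):
    """Return the number of CpG dinucleotides divided by the total number of bases."""
    total_bases = 0
    cpg_count = 0
    for seq in sequences:
        seq = seq.upper()
        total_bases += len(seq)
        # "CG" cannot overlap with itself, so str.count finds every occurrence.
        cpg_count += seq.count("CG")
    if total_bases == 0:
        raise ValueError("no sequence supplied")
    return cpg_count / total_bases


def cpg_enrichment_score(read_sequences, reference_sequences):
    """Ratio of CpG frequency in reads to CpG frequency in the reference (assumed definition)."""
    reference_freq = cpg_relative_frequency(reference_sequences)
    if reference_freq == 0:
        raise ValueError("reference contains no CpG dinucleotides")
    return cpg_relative_frequency(read_sequences) / reference_freq


# Toy input: immunoprecipitated reads that are CpG-rich relative to the reference.
toy_reads = ["ACGCGT", "CGCGCG"]
toy_reference = ["ACGTACGTAAAATTTT"]
toy_score = cpg_enrichment_score(toy_reads, toy_reference)
print(f"CpG enrichment score: {toy_score:.2f}")
```

Under this definition a score above 1 indicates that the reads are enriched for CpGs relative to the genome. A lower score in one data set than another is therefore consistent with any of the three explanations listed above, which is why the same samples would need to be processed with both methods to separate them.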

Third generation sequencing technologies, such as those from Helicos BioSciences [204], Oxford Nanopore Technologies [205] and Pacific Biosciences [206], are capable of single molecule sequencing and are thought to be able to distinguish between cytosine and methylcytosine [205,207]. The advantages of these technologies are immediately apparent. Theoretically, only single molecules are required to generate an entire methylome. This eliminates the need for PCR and removes the amplification bias that comes with it. Additionally, the low input requirement of single molecule sequencing technologies allows the study of rare samples without further optimization. Although the Nano-MeDIP-seq method developed in this thesis was a significant breakthrough for unbiased methylome analysis, and makes the MeDIP-seq method available to studies where samples of interest are limiting, it is expected that third generation sequencing techniques will replace cumbersome bisulfite and enrichment based methods.

7.3 MeDIP-seq Data Analysis

The remarkable advancement in high throughput whole genome analysis technologies, which allow the generation of high resolution methylomes in a relatively short period of time, presents a significant challenge in terms of data analysis. Significant bioinformatics and statistical effort is required to analyse and interpret the enormous data sets that are typically generated for methylome or transcriptome studies. As multiple tools and algorithms are required for data processing steps such as sequence alignment, filtering of clonal reads, data normalisation, read counts and differential read counts, it is often necessary to streamline the data analysis processes into discrete computational pipelines. Unfortunately, these packages are typically modelled around studies of gross abnormalities. This means that thresholds are often set too high to detect subtle deviations from the norm, such as those observed during HSC ageing. The MeDUSA pipeline [134] was used to generate HSC methylomes and to determine aDMRs in this study. Initial analysis with this pipeline yielded no differentially methylated regions, despite several obvious differences in sequence read counts between ageing samples. As a result, the threshold for DMR calling was lowered and rules within the pipeline were modified to allow the detection of subtle but significant changes during HSC ageing. This optimization yielded more DMRs, but also increased the incidence of false positive results, as judged by visual inspection on the UCSC genome browser. Subsequently, a more conservative version of the DESeq Bioconductor package (v2.10.2) was released and utilised. This version provided the highest level of true DMRs and the lowest incidence of false positives, as determined by visual confirmation of aDMRs on the UCSC genome browser.
Nonetheless, the number of aDMRs detected in this thesis remained relatively low compared to most methylome analysis studies, and this low number of aDMRs appears at odds with the obvious decrease in global DNAm that was observed in aged HSCs. While the former could be attributed to the fact that a homogeneous population of FACS purified cells was studied in this thesis, it is possible that the 5% difference that was observed in global DNAm is too small to be detected in discrete regions, and that several true aDMRs were missed as a result of the low sensitivity of the available analysis tools. The MeDUSA pipeline and other similar analysis tools are constantly being developed further; it will be interesting to investigate the effects of these improvements on the data generated in this thesis.
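The trade-off described above, in which a stringent threshold misses subtle changes while a relaxed threshold admits more candidate regions, can be illustrated with a toy sketch. This is not the MeDUSA pipeline or DESeq, which fits a negative binomial model to replicate counts. The function names, the reads-per-million normalisation, the log2 fold-change cut-off and the window counts below are all assumptions made for illustration.

```python
import math


def reads_per_million(counts):
    """Scale per-window read counts to reads per million mapped reads."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample contains no reads")
    return [c * 1_000_000 / total for c in counts]


def call_candidate_dmrs(counts_a, counts_b, min_abs_log2_fc, pseudocount=1.0):
    """Return (window index, log2 fold change) for windows passing the cut-off.

    A toy rule: real DMR callers use replicates and a statistical test,
    not a bare fold-change threshold.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("samples must cover the same windows")
    rpm_a = reads_per_million(counts_a)
    rpm_b = reads_per_million(counts_b)
    hits = []
    for index, (a, b) in enumerate(zip(rpm_a, rpm_b)):
        # The pseudocount avoids division by zero in empty windows.
        log2_fc = math.log2((b + pseudocount) / (a + pseudocount))
        if abs(log2_fc) >= min_abs_log2_fc:
            hits.append((index, log2_fc))
    return hits


# Hypothetical read counts in four genomic windows for two samples.
young_counts = [100, 100, 100, 100]
old_counts = [100, 130, 150, 20]

strict_hits = call_candidate_dmrs(young_counts, old_counts, min_abs_log2_fc=1.0)
relaxed_hits = call_candidate_dmrs(young_counts, old_counts, min_abs_log2_fc=0.5)
print("strict:", [index for index, _ in strict_hits])
print("relaxed:", [index for index, _ in relaxed_hits])
```

With these toy counts the strict cut-off reports only the window with the large change, whereas the relaxed cut-off also reports a window with a moderate change. Whether such additional windows are true aDMRs or false positives cannot be decided from the fold change alone, which is why independent confirmation was needed in this study.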

The MeDIP-seq method can be used to detect other forms of DNAm, such as 5-hydroxymethylcytosine (5-HmC). However, this requires the use of 5-HmC specific antibodies, so 5-HmC must be assayed separately from 5mC MeDIP reactions. Due to limited time and resources, this mark was not studied here, although 5-HmC is an important feature of mammalian embryonic [67,68] and somatic stem cells [66]. The role of 5-HmC DNAm in HSCs is currently unclear; however, Tet2, a member of the Tet protein family, which catalyse the conversion of 5mC to 5-HmC in vivo, has been shown to be important for haematopoiesis. Tet2 is highly expressed in bone marrow haematopoietic cells [208] and mutations in this gene are associated with various myeloproliferative malignancies [208]. Furthermore, Tet2 expression was upregulated during all-trans retinoic acid-induced granulocytic differentiation of the promyelocytic cell line NB4 [208], suggesting that Tet2 is important in myelopoiesis and could therefore be involved in HSC regulation. Future studies should be performed to ascertain the occurrence and relevance of this mark in HSCs.

Non-CG methylation is also thought to be an important feature of mammalian stem cells, and the MeDIP-seq data generated in this thesis is able to detect these forms of 5mC. This data should be analysed further to determine the presence of non-CG methylation and its involvement in HSC regulation and ageing.