Given the breadth of biosamples in the union of ENCODE and Roadmap data, we aspired to build an initial Registry covering a majority of cREs in the genome. The most direct approach to identifying cREs would be to include all relevant epigenetic signals in a comprehensive statistical model and then train the model with experimentally validated regulatory elements. Indeed, such methods have been developed (Rajagopal et al. 2013; Erwin et al. 2014). However, at this time, relatively few enhancers and insulators have been systematically tested across many cell environments with functional assays: without such a “gold standard,” it is not possible to train a general statistical model that remains predictive in new cell types.

Therefore, we pursued a different approach based on just four epigenetic signals that we found to be most predictive of regulatory elements: chromatin accessibility (measured by DNase-seq), the histone modifications H3K4me3 and H3K27ac, and CTCF binding. This selection is supported by earlier work in the field. DNase hypersensitive sites delineate all the main classes of cis-regulatory elements in a cell-type-specific manner, including promoters, enhancers, insulators, and locus control regions (Thurman et al. 2012). H3K4me3 and H3K27ac are the two histone marks most enriched at promoters and enhancers, respectively (Heintzman et al. 2007; Visel et al. 2009). CTCF is the established insulator-binding protein in mammals (Kim et al. 2007), and its binding sites are enriched at interacting chromatin loci (Rao et al. 2014).

To further test our selection, we compared the effectiveness of ten different types of epigenetic signals in predicting enhancers in the corresponding tissue: DNase hypersensitivity, eight histone marks (H3K4me1, H3K4me2, H3K4me3, H3K27ac, H3K9ac, H3K9me3, H3K36me3, and H3K27me3), and DNA methylation. These epigenetic signals were all assayed in specific mouse e11.5 tissues during ENCODE III, and the tissue-specific e11.5 enhancers tested using in vivo transgenic assays were obtained from the VISTA database (Visel et al. 2007). We found that DNase and H3K27ac were the best single features for predicting tissue-specific enhancers. We then used RNA-seq to evaluate the effectiveness of these same epigenetic signals in predicting gene expression levels and found H3K4me3 to be the best single feature. We also found that DNase offers high spatial precision in defining cREs: DHSs are ~350 bp long and typically correspond to the core of regulatory elements. In contrast, the H3K27ac and H3K4me3 signals are more diffuse: they tend to be low at the center of a regulatory element (presumably because of the lack of a nucleosome there) but are elevated at flanking nucleosome positions. DNase, therefore, presents the best localization of a cRE, while H3K27ac and H3K4me3 suggest the recent activity state, and the coincidence of significant signals from at least two assay types increases the overall confidence in the cRE.

To experimentally test the enhancer branch of our predictor, we used the average rank of the DNase and H3K27ac signals to identify previously untested TSS-distal (>2 kb from the nearest TSS) candidate enhancers in the mouse e11.5 hindbrain, midbrain, and limb. The boundaries for the predicted regions were defined using the H3K27ac ChIP-seq peaks called by the MACS2 algorithm (Zhang et al. 2008) (see Figure II-9 on page 95). For each tissue, we tested 20, 15, and 15 new regions around the ranks 1-20, 1500-1520, and 3000-3020, respectively. In total, we tested 151 regions (for results, see online Supplementary Table 4). Representative e11.5 transgenic embryos for the enhancers that validated in the expected tissues are shown in Figure II-10 on page 96. Consistently, higher-ranking regions were more likely than lower-ranking regions to show enhancer activity in their predicted tissue (Figure II-11 on page 97; e.g., 75%, 26.6%, and 20% for the hindbrain). When enhancers were active in multiple tissues, these tissues also had high H3K27ac signals across the predicted enhancer regions (Figure II-12c-e on page 98). For example, a predicted enhancer in the hindbrain was also active in the midbrain and neural tube; accordingly, high H3K27ac signals were observed in all three tissues (Figure II-12d on page 98). In contrast, an enhancer active almost exclusively in the limb (Figure II-12e on page 98) did not show high H3K27ac signals in other tissues assayed. These results suggest that combining DNase and H3K27ac can identify active enhancers in a particular tissue and quantify their tissue selectivity patterns.
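
The average-rank prioritization described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in this work: the `Region` fields, the tie handling (tied values share their mean rank), and the choice to filter TSS-proximal regions before ranking are all assumptions made for the example. Only the 2 kb distal cutoff and the idea of averaging the DNase and H3K27ac ranks come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical container for one candidate region."""
    name: str
    dnase: float        # DNase signal over the region (arbitrary units)
    h3k27ac: float      # H3K27ac signal over the region (arbitrary units)
    tss_distance: int   # distance in bp to the nearest annotated TSS


def ranks_descending(values):
    """Return 1-based ranks where the highest value gets rank 1.

    Tied values share the mean of the ranks they occupy (an assumption).
    """
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def rank_candidate_enhancers(regions, min_tss_distance=2000):
    """Rank TSS-distal regions by the average of their DNase and H3K27ac ranks."""
    distal = [r for r in regions if r.tss_distance > min_tss_distance]
    dnase_ranks = ranks_descending([r.dnase for r in distal])
    k27_ranks = ranks_descending([r.h3k27ac for r in distal])
    scored = [((dnase_ranks[i] + k27_ranks[i]) / 2, r.name) for i, r in enumerate(distal)]
    scored.sort()  # best (lowest) average rank first; name breaks ties
    return [(name, score) for score, name in scored]


regions = [
    Region("A", dnase=10.0, h3k27ac=5.0, tss_distance=5000),
    Region("B", dnase=8.0, h3k27ac=9.0, tss_distance=10000),
    Region("C", dnase=20.0, h3k27ac=20.0, tss_distance=500),   # TSS-proximal, excluded
    Region("D", dnase=2.0, h3k27ac=1.0, tss_distance=3000),
]
ranking = rank_candidate_enhancers(regions)
```

Averaging ranks rather than raw signals keeps the two assays on a common scale, so neither dominates the ordering simply because its signal values are numerically larger.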

In aggregate, our evaluations showed that combining DNase with two histone marks, H3K4me3 and H3K27ac, is an effective way to build a first version of the Registry of candidate promoters and enhancers active in specific cell types. We extended this predictor by adding CTCF, a highly conserved architectural protein that binds to insulators and contributes to the establishment and maintenance of three-dimensional chromatin structure (Ong and Corces 2014). Our final algorithm anchors cREs on a representative set of all DHSs, and then evaluates cRE types and activities based on H3K4me3, H3K27ac, and CTCF signals. To maximize coverage, we applied the algorithm to all cell types interrogated by at least one of these assays, making it possible to include data from 301 human cell types (620 when primary cells or tissues from different donors are counted separately) and 58 mouse cell types (138 with developmental time-points counted separately) with all ENCODE and Roadmap data considered. It is thus important to note that we distinguish two classes of cREs displaying no activity in a given cell type: cREs for which necessary assays are missing in the cell type, and cREs for which the necessary assays are present but the associated signals did not score as significantly positive.
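
The distinction between "not assayed" and "assayed but not significant" can be made explicit with a three-state call per assay. The sketch below is illustrative only: the `Activity` names, the `annotate_cre` helper, and the `cutoff` value are hypothetical and are not taken from the text, which does not state how significance is scored.

```python
from enum import Enum
from typing import Mapping, Optional


class Activity(Enum):
    ACTIVE = "active"
    NOT_SIGNIFICANT = "assayed, not significant"
    NOT_ASSAYED = "assay missing"


def call_activity(signal: Optional[float], cutoff: float) -> Activity:
    """Classify one assay's signal at a cRE in one cell type.

    `signal` is None when the assay was not performed in that cell type.
    """
    if signal is None:
        return Activity.NOT_ASSAYED
    return Activity.ACTIVE if signal > cutoff else Activity.NOT_SIGNIFICANT


def annotate_cre(signals: Mapping[str, Optional[float]], cutoff: float = 1.64):
    """Return a per-assay activity call for a DHS-anchored cRE.

    `cutoff` is a placeholder threshold chosen for the example.
    """
    assays = ("H3K4me3", "H3K27ac", "CTCF")
    return {assay: call_activity(signals.get(assay), cutoff) for assay in assays}


# A cell type with H3K4me3 and H3K27ac data but no CTCF experiment.
calls = annotate_cre({"H3K4me3": 2.5, "H3K27ac": 0.3})
```

Keeping the missing-assay state separate prevents a cRE from being reported as inactive in a cell type where the relevant experiment was never run.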

The first release of the Registry presented here includes 1.31 million human cREs and 0.43 million mouse cREs; future versions will be released periodically, and are already under development. Based on the levels of the four core epigenetic signals and the distance to the nearest annotated TSS, we also classify cREs as those that have promoter-like signatures (PLS) or enhancer-like signatures (ELS) or as those that lack these signatures but are bound by the insulator-binding protein CTCF.
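
As a rough illustration of how such a classification could be organized, the sketch below assigns a label from per-signal significance flags and the distance to the nearest TSS. The exact decision rules and thresholds are not given in this section, so everything here is an assumption: the `classify_cre` function, the order in which the rules are checked, the `proximal_bp` default, and the "unclassified" fallback are hypothetical.

```python
def classify_cre(h3k4me3_high: bool,
                 h3k27ac_high: bool,
                 ctcf_high: bool,
                 tss_distance_bp: int,
                 proximal_bp: int = 2000) -> str:
    """Assign a coarse label to one cRE (illustrative rules only).

    proximal_bp is a placeholder for "close to an annotated TSS".
    """
    if h3k4me3_high and tss_distance_bp <= proximal_bp:
        return "PLS"        # promoter-like signature
    if h3k27ac_high:
        return "ELS"        # enhancer-like signature
    if ctcf_high:
        return "CTCF-only"  # lacks PLS/ELS signatures but is bound by CTCF
    return "unclassified"


labels = [
    classify_cre(True, True, False, tss_distance_bp=100),
    classify_cre(True, True, False, tss_distance_bp=50_000),
    classify_cre(False, False, True, tss_distance_bp=50_000),
    classify_cre(False, False, False, tss_distance_bp=10),
]
```

Checking the promoter-like rule first means that a TSS-proximal element with both H3K4me3 and H3K27ac is labeled PLS rather than ELS; a different precedence would be equally easy to encode.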
