
Here, I annotated cancer-related fusions that fall into one of three categories: 1) known oncogenic fusions that have previously been causally implicated in cancer initiation and development, 2) fusions that have previously been observed in a subset of patient samples, and 3) fusions that involve known cancer driver genes.

Fusions that have previously been causally implicated in cancer development and progression are important tools in my study of functional gene fusions. As their oncogenic function is already established, they serve as valuable positive controls and benchmarks in my downstream analyses. To annotate known oncogenic fusions, I used the manually curated COSMIC gene fusion database (https://cancer.sanger.ac.uk/cosmic/fusion). The version downloaded in May 2018 contains 295 fusions.

Similarly, annotating fusions that have been observed in patient samples helps to put fusions found in cancer cell lines into context, as only fusions that actually occur in the patient population will be relevant for therapeutic applications. It also gives an indication of the proportion of fusions observed in patients that we can functionally assess in our cancer cell lines. For this purpose, I used the recent study by Gao and colleagues, which analysed 9,624 TCGA tumour samples and found 25,664 gene fusions (Gao et al., 2018). A fusion is considered "observed in patients" if the same gene partners are fused in the same position (5'/3') in any of the samples analysed there.

Finally, fusions that involve known cancer driver genes may activate oncogenic function. For example, the fusion of a coiled-coil domain to the kinase domain of ALK leads to constitutive kinase activation. Similarly, fusion genes involving known tumour suppressors can indicate knock-out of tumour suppressor function. To annotate known cancer genes, I used the COSMIC Cancer Gene Census, an expert-curated database that documents genes in which alterations have been causally implicated in cancer (https://cancer.sanger.ac.uk/census). My version of the database was downloaded in August 2017 and contains 609 genes. Of those, 91 were further annotated as tumour suppressor genes, 116 as oncogenes, and 37 as having dual roles.
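The three annotation rules can be illustrated with a minimal sketch. This is not the code used in this study: the `Fusion` type, the `annotate` function and the toy reference sets are hypothetical, and matching COSMIC fusions by ordered partner pair is an assumption (the text only specifies the 5'/3' requirement for the patient data).

```python
from typing import NamedTuple


class Fusion(NamedTuple):
    five_prime: str   # gene symbol of the 5' partner
    three_prime: str  # gene symbol of the 3' partner


def annotate(fusion, cosmic_fusions, patient_fusions, cancer_genes):
    """Return the three annotation flags for a single fusion.

    cosmic_fusions and patient_fusions are sets of (5' gene, 3' gene) tuples;
    cancer_genes is a set of gene symbols. Partner order matters for both
    fusion look-ups (an assumption in the case of the COSMIC look-up).
    """
    key = (fusion.five_prime, fusion.three_prime)
    partners = {fusion.five_prime, fusion.three_prime}
    return {
        "known_oncogenic": key in cosmic_fusions,
        "observed_in_patients": key in patient_fusions,
        "involves_cancer_gene": bool(partners & cancer_genes),
    }


# Toy reference sets for illustration only, not the real database contents.
cosmic_fusions = {("EWSR1", "FLI1")}
patient_fusions = {("TMPRSS2", "ERG")}
cancer_genes = {"EWSR1", "FLI1", "ERG", "ALK"}

flags = annotate(Fusion("TMPRSS2", "ERG"), cosmic_fusions, patient_fusions, cancer_genes)
swapped = annotate(Fusion("ERG", "TMPRSS2"), cosmic_fusions, patient_fusions, cancer_genes)
```

In this toy example the swapped fusion is not flagged as observed in patients, because the partners occupy the opposite positions, but it is still flagged as involving a cancer gene.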

Of the 8,354 fusion events, 14.2% (n = 1,184) contained a cancer gene, 1.2% (n = 98) were known oncogenic fusions and 5.6% (n = 471) had previously been detected in patients.
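As a quick arithmetic check, the reported percentages follow directly from the counts and the total of 8,354 fusion events. The variable names below are illustrative only.

```python
total_fusions = 8354

# Counts taken from the text above.
counts = {
    "contains a cancer gene": 1184,
    "known oncogenic fusion": 98,
    "previously detected in patients": 471,
}

# Percentage of all fusion events, rounded to one decimal place.
percentages = {label: round(100 * n / total_fusions, 1) for label, n in counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```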

Known oncogenic fusions listed in the COSMIC fusion database tend to have a higher recurrence than the rest, with the three most recurrent fusions in the TCGA data set being TMPRSS2-ERG (n = 191), FGFR3-TACC3 (n = 33) and CCDC6-RET (n = 24). Similarly, EWSR1-FLI1, found in 24 cell lines, is one of the most recurrent fusions in the cell lines (Figure 3.5A-B).

Figure 3.5: Recurrence of known oncogenic fusions in (A) cell lines and (B) patient samples (data from Gao et al., 2018).

When looking at the top 40 most recurrent fusions overall, I find that within cell lines, there is less enrichment for COSMIC fusions and for fusions that involve COSMIC cancer genes than in the patient samples (Figure 3.6A-B). This is likely because Gao and colleagues removed any fusion that recurred in 10 or more samples unless it had been reported in previous TCGA studies (Gao et al., 2018). Although the number of fusions removed by this filter is not reported in the manuscript, it is likely to be small, considering that only 40 fusions recurred in 5-9 different samples (Figure 3.4B).
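To make the notion of recurrence and the effect of such a filter concrete, the following is a minimal sketch. It assumes fusion calls are available as (sample, 5' gene, 3' gene) tuples; the function names, the input format and the `max_samples` parameter are hypothetical, and this is neither the pipeline of Gao and colleagues nor the code used in this study.

```python
from collections import Counter


def recurrence(calls):
    """Count the number of distinct samples in which each fusion occurs.

    calls: iterable of (sample_id, five_prime_gene, three_prime_gene).
    Duplicate calls of the same fusion in the same sample count once.
    """
    unique_calls = set(calls)
    return Counter((five, three) for _, five, three in unique_calls)


def top_recurrent(calls, n=40):
    """Return the n most recurrent fusions as (fusion, sample_count) pairs."""
    return recurrence(calls).most_common(n)


def apply_recurrence_filter(counts, previously_reported, max_samples=10):
    """Drop fusions seen in max_samples or more samples unless previously reported."""
    return Counter({
        fusion: count
        for fusion, count in counts.items()
        if count < max_samples or fusion in previously_reported
    })


# Toy input: gene and sample names are placeholders.
calls = [
    ("S1", "A", "B"),
    ("S1", "A", "B"),  # duplicate call within the same sample
    ("S2", "A", "B"),
    ("S3", "C", "D"),
]
counts = recurrence(calls)
top = top_recurrent(calls, n=1)

filtered = apply_recurrence_filter(
    Counter({("X", "Y"): 12, ("P", "Q"): 15, ("A", "B"): 2}),
    previously_reported={("X", "Y")},
)
```

A filter of this kind retains highly recurrent fusions only when they are already known, which would enrich the top of the patient ranking for previously reported fusions.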

In cell lines, two of the top three recurrent fusions, TSHZ2-SLC35A1 and NCOR2-UBC, have been described before but never characterised (Obholzer et al., 2015; Roberts et al., 2012). Some of the top recurrent fusions contain entirely uncharacterised genes (e.g. CTD-2334D19.1 and RP11-120D5.1) and long intergenic non-protein coding RNAs (e.g. LINC01340). These fusions tend not to be tissue-type specific and are often not in-frame, which suggests that they are less likely to have an oncogenic function.

It is notable that of the COSMIC fusions found in our cell lines, only a subset (28%) are observed in Gao's patient samples. Well-known oncogenic fusions that are not observed in Gao's patient samples include EWSR1-FLI1, NPM1-ALK and BRD4-NUTM1. The opposite is also true, e.g. no cell lines show ETV6-NTRK3 fusions, even though those are among the most recurrent in the patient samples. Furthermore, a large number of well-understood driver events occur only at very low frequencies (e.g. KMT2A-MLLT3 being found in only a single patient sample and PML-RARA in only a single cell line). This likely reflects the biases of tissue selection within both data sets, e.g. there being no Ewing's sarcoma samples analysed in the TCGA data set. It also supports the argument that fusions are exceedingly rare events, and that even with almost 10,000 samples sequenced, we are still far from being able to provide a comprehensive landscape of oncogenic fusion transcripts. Similarly, it demonstrates that while recurrence is now commonly used as a predictor for the oncogenic functionality of single-nucleotide variations, with the current resources available, fusion recurrence alone is insufficient for assigning functionality.

Figure 3.6: Top 40 most recurrent fusions in (A) cell lines and (B) patient samples (data from Gao et al., 2018). Red bars indicate that the fusion is listed in the COSMIC fusion database; dark yellow bars indicate that one of the genes is listed in the COSMIC Cancer Gene Census.