
As noted above and in the Introduction, retrocopy transcripts have regulatory potential based on sequence similarity with their parent gene. The specific mechanism depends on whether the retrocopy RNA is complementary to the parent or not. If it is complementary, the retrocopy and parent RNAs can form RNA:RNA duplexes [172]; if not, the retrocopy RNA can act as an miRNA sponge, for example [168, 169]. Establishing a preference for one option or the other would rule out one set of mechanisms in favour of the other.

The relative strandedness of the retrocopy RNA can be used to establish whether it is complementary to its parent transcript. When considering transcribed retrocopies and their relationship with their parent transcripts, there are three levels of strandedness (Figure 6.5):

• The strand of the parent transcript, i.e., the strand from which the parent transcript is transcribed

• The strand of the retrocopy annotation, which reflects the orientation of the insertion

• The strand from which the retrocopy is transcribed

There are therefore eight possible combinations of strands, which can be divided into two groups: those that produce retrocopy RNA (rcRNA) complementary to their parent mRNA, and those that do not (Figure 6.5).
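The enumeration above can be sketched in a few lines. This is a minimal illustration, not the method used in this work: the function name `is_complementary` and the classification rule are assumptions. The sketch assumes that the retrocopy annotation strand is the strand on which the insertion reads like the parent mRNA, so that transcription from the opposite strand yields rcRNA complementary to the parent; other annotation conventions may require a different rule.

```python
from itertools import product

STRANDS = ("+", "-")


def is_complementary(parent_strand, retrocopy_strand, transcript_strand):
    """Return True if the rcRNA is complementary (antisense) to the parent mRNA.

    ASSUMPTION (hypothetical rule, not taken from the text): the retrocopy
    annotation strand is the strand on which the insertion matches the parent
    mRNA, so a transcript from the opposite strand is antisense wrt the parent.
    Under this convention the parent strand does not change the outcome; it is
    kept as an argument so that all three levels of strandedness are explicit.
    """
    return transcript_strand != retrocopy_strand


# All 2 x 2 x 2 = 8 combinations of (parent, retrocopy, transcript) strand.
combinations = list(product(STRANDS, repeat=3))

complementary = [c for c in combinations if is_complementary(*c)]
non_complementary = [c for c in combinations if not is_complementary(*c)]

for combo in combinations:
    label = "antisense" if is_complementary(*combo) else "sense"
    print(combo, label)
```

Under the assumed rule the eight combinations split evenly into four that produce complementary rcRNA and four that do not.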

Each retrocopy transcript is associated with a retrocopy and a parent, and so I assigned each of the expressed retrocopies to one of the eight strand combinations.

This showed that the vast majority of expressed retrocopies are transcribed in such a way as to produce rcRNA complementary to their parent mRNA (Table 6.4).

Figure 6.5: The possible combinations of parent transcript strand, retrocopy strand, and transcript strand. Blue lettered boxes represent exons. The addition of ’ represents the reverse complement. (i) The original transcript in the genome. (ii) The parent mRNA. (iii) The retrocopy insertion and its possible transcription. Sense with respect to (wrt) the parent produces RNA equivalent to the parent. Antisense wrt the parent produces RNA complementary to the parent.

To assess the statistical significance of this result I performed a chi-squared test comparing the observed number in each category to the expected number in each category. To obtain the expected number I counted the number of retrocopies falling into each of the four parent/retrocopy strand combinations. I then used the observed proportions of retrocopy transcript strands to calculate the expected number in each of the eight parent/retrocopy/transcript categories. The expected counts from this calculation are distributed essentially evenly across all eight categories. The chi-squared test indicated that the observed counts are significantly biased towards combinations that produce antisense RNA (Table 6.4). Beyond this antisense enrichment, there is no bias towards any particular strand combination.
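The expected-count calculation and the test statistic can be sketched as follows. This is a minimal sketch under stated assumptions: the function names and all counts are hypothetical toy values chosen for illustration, not the data in Table 6.4, and the toy observed counts simply place more transcripts on the strand opposite the retrocopy annotation.

```python
from itertools import product


def expected_counts(pair_counts, transcript_strand_counts, n_expressed):
    """Expected count for each (parent, retrocopy, transcript) category.

    pair_counts: {(parent_strand, retrocopy_strand): number of retrocopies}
    transcript_strand_counts: {transcript_strand: number of transcripts}
    n_expressed: total number of expressed retrocopies being classified
    """
    total_pairs = sum(pair_counts.values())
    total_transcripts = sum(transcript_strand_counts.values())
    expected = {}
    for (parent, retro), n_pair in pair_counts.items():
        for tx, n_tx in transcript_strand_counts.items():
            # Proportion of this parent/retrocopy pair times the proportion
            # of this transcript strand, scaled to the number of observations.
            expected[(parent, retro, tx)] = (
                n_expressed * (n_pair / total_pairs) * (n_tx / total_transcripts)
            )
    return expected


def chi_squared_statistic(observed, expected):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E over categories."""
    return sum((observed.get(k, 0) - e) ** 2 / e for k, e in expected.items())


# Hypothetical toy data (NOT the values from Table 6.4).
pair_counts = {(p, r): 100 for p, r in product("+-", repeat=2)}
transcript_strand_counts = {"+": 40, "-": 40}
observed = {
    (p, r, t): (18 if t != r else 2) for p, r, t in product("+-", repeat=3)
}

expected = expected_counts(pair_counts, transcript_strand_counts, sum(observed.values()))
statistic = chi_squared_statistic(observed, expected)
print(f"chi-squared = {statistic:.2f} on {len(expected) - 1} degrees of freedom")
```

The p-value then follows from the chi-squared distribution with seven degrees of freedom; if SciPy is available, `scipy.stats.chisquare` can compute both the statistic and the p-value from the same observed and expected counts.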

This suggests that any functional role played by the expressed retrocopies will tend to be based on RNA complementarity, rather than exact sequence identity.

Parent Retrocopy Transcript Sense wrt

Observed total antisense = 988
Observed total sense = 70
Expected total antisense = 529.70
Expected total sense = 528.28

$\chi^2 = 809.67$, $p = 1.52 \times 10^{-170}$

Table 6.4: The number of expressed retrocopies across all BLUEPRINT BL6 samples falling into each strand combination. A chi-squared test shows that there is a very clear enrichment in the categories leading to rcRNA complementary to the parent (highlighted in blue).

However, any such role would rely not just on complementarity from a strand perspective, but also on high sequence identity between the parent transcript DNA and the retrocopy DNA.

6.4 Expressed Retrocopies Have Higher Sequence Identity with Their Parents

The previous section demonstrates that the expressed retrocopies are much more likely to form RNA complementary to that of their parent transcript. If retrocopy expression is regulating parent transcripts, this bias rules out certain regulatory mechanisms (e.g., miRNA sponge) in favour of others (e.g., formation of RNA:RNA duplexes). However, all of these mechanisms rely on a high level of sequence identity between the retrocopy and the parent. If the retrocopy has degraded over time, the sequence identity may be low, in which case it would not be able to fulfill a regulatory role based on sequence identity.

To assess the sequence identity between parents and retrocopies, I compared each retrocopy with its parent transcript and performed a local alignment between the two (see Methods). I assigned each alignment a score based on its length and identity, and plotted the distribution of scores for three sets of retrocopy/parent pairs: expressed retrocopies, non-expressed retrocopies, and a subset of retrocopies with a randomly assigned parent.
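One way to combine alignment length and identity into a single score in which 1.0 is a perfect full-length alignment is sketched below. The actual scoring scheme is defined in the Methods and may differ; the function name `alignment_score`, its parameters, and the choice of the retrocopy length as the reference length are all assumptions made for illustration.

```python
def alignment_score(matches, aligned_length, full_length):
    """Score a local alignment as identity x coverage, in the range [0, 1].

    ASSUMPTION: this formula is a hypothetical stand-in for the score
    described in the Methods.

    matches: number of identical positions in the alignment
    aligned_length: number of aligned positions in the local alignment
    full_length: length of the sequence being covered (here assumed to be
        the retrocopy length)
    """
    if full_length <= 0:
        raise ValueError("full_length must be positive")
    if aligned_length == 0:
        return 0.0  # no alignment found
    if not 0 <= matches <= aligned_length <= full_length:
        raise ValueError("expected 0 <= matches <= aligned_length <= full_length")
    identity = matches / aligned_length       # fraction of identical positions
    coverage = aligned_length / full_length   # fraction of the sequence aligned
    return identity * coverage


# A perfect full-length alignment scores 1.0.
print(alignment_score(matches=100, aligned_length=100, full_length=100))
# A 90% identical alignment covering half of the sequence scores lower.
print(alignment_score(matches=45, aligned_length=50, full_length=100))
```

A score of this kind penalises both short alignments and divergent ones, so a decayed retrocopy scores low whether it has lost sequence or accumulated substitutions.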

Figure 6.6 shows the results of this analysis. The alignment scores across all retrocopies follow a multimodal distribution, which could reflect the distribution of ages across the retrocopies. If so, the leftmost peak would correspond to the oldest retrocopies, which have decayed to the point that the alignment with the parent is no better than the alignment between the retrocopy and a random reference gene. For these retrocopies, the reliability of the parent assignment seems dubious, since the alignment is no better than random; however, the score used here is a summary, and it discards information that could be used to identify a parent.

This leftmost peak could represent a burst of retrocopy activity at a particular time, or a plateau of decay reached by most retrocopies eventually. The peaks to the right of the histogram could also be more recent bursts of retrocopy formation, as reported in [154, 251].

Figure 6.6: Retrocopy/parent alignment scores, where 1.0 represents a perfect full-length alignment (see Methods). ALL: All retrocopies. EXPR: All expressed retrocopies. RANDOM ALN: Negative control where retrocopies are aligned to randomly chosen parent transcripts. Expressed retrocopies are clearly biased towards high scores compared to all retrocopies.

Overall, the expressed retrocopies tend to have higher levels of sequence identity with their parents compared to non-expressed retrocopies. It is therefore possible that the majority of the retrocopy transcripts identified here could interact with their parent transcripts based on shared sequence identity. Given the previously described bias towards complementary rcRNA, I expected to observe a higher sequence identity among retrocopies producing complementary RNA. However, the number of retrocopies producing non-complementary RNA is too small to be reasonably compared with any other category (see Online Resources).

While this finding does not guarantee that expressed rcRNA regulates parent transcripts via a sequence-based mechanism, it allows for the possibility that such a mechanism exists. It should be noted that more recent retrocopies are expected to have a higher sequence identity with their parent, as they will have suffered fewer mutations. Indeed, methods similar to the alignment score used here are used to estimate the age of retrocopies [141]. Assuming that these retrocopy transcripts are functional, it may be that expressed retrocopies have undergone selection to preserve sequence identity with their parent, which is important for said function.

Alternatively, it may be that expressed retrocopies are also younger retrocopies, and remain expressed until the sequence similarity to the parent has decreased to the point where the transcript is no longer effective as a regulatory RNA. If this is the case, then how is their expression regulated? As described by Carelli et al., expressed retrocopies either use pre-existing promoters or evolve one de novo; the former case accounts for a small percentage of retrocopies, and the latter case requires time for such a promoter to evolve.

6.5 Retrocopy Transcription and