As discussed above, mitochondrial content regulates many steps of the gene expression cycle. Its role therefore goes beyond that of a global modulator: there are many potential effects that could modulate the expression of individual transcripts. Our RNA-seq data allow us to quantify expression in the “high” and “low” mitochondria conditions on a transcript-by-transcript basis, so we investigated the effect of mitochondria on relative transcript abundances in HeLa, Jurkat and MRC-5 cells.
First, we noticed significant variation between the gene and transcript levels, as evidenced in figure 2.4A. We found a fraction of genes (15%-20% depending on the cell line) for which at least one transcript had a fold-change significantly different from the gene’s fold-change (meaning $|\log_2 FC_t - \log_2 FC_g| > 1$, with $FC_t$ and $FC_g$ the transcript’s and gene’s fold-changes, respectively). This means that, for many genes, mitochondrial content does not just scale the overall expression but also changes the relative transcript abundances: regardless of whether the total expression of a gene scales according to the expected global trend, each of its transcripts can behave in its own way. For some genes, the expression pattern remains unaltered and all transcripts maintain their relative abundances in the “high” and “low” conditions (fig. 2.4B). In other cases, one or more transcripts do not obey the global scaling (fig. 2.4C).
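The comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used here: the function names, the pseudocount, the choice of summing transcript TPMs to obtain the gene-level expression, and the toy values are all hypothetical.

```python
import math
from collections import defaultdict


def log2_fc(high, low, pseudocount=1.0):
    """log2 fold-change ("high"/"low"); the pseudocount is an assumption to avoid log(0)."""
    return math.log2((high + pseudocount) / (low + pseudocount))


def deviating_transcripts(rows, threshold=1.0):
    """Map each gene to the transcripts with |log2FC_t - log2FC_g| > threshold.

    rows: iterable of (gene_id, transcript_id, tpm_low, tpm_high).
    Assumption: gene-level expression is the sum of its transcripts' TPMs.
    """
    by_gene = defaultdict(list)
    for gene, tx, low, high in rows:
        by_gene[gene].append((tx, low, high))

    result = {}
    for gene, txs in by_gene.items():
        fc_gene = log2_fc(sum(h for _, _, h in txs), sum(l for _, l, _ in txs))
        result[gene] = [tx for tx, low, high in txs
                        if abs(log2_fc(high, low) - fc_gene) > threshold]
    return result


# Toy, made-up TPM values (not data from this work).
toy = [
    ("A", "A1", 10.0, 40.0), ("A", "A2", 5.0, 20.0),     # both scale with the gene
    ("B", "B1", 100.0, 400.0), ("B", "B2", 50.0, 10.0),  # B2 goes against the gene
]
hits = deviating_transcripts(toy)
# Fraction of genes with at least one deviating transcript.
fraction = sum(1 for txs in hits.values() if txs) / len(hits)
print(hits, fraction)
```

In this toy input, gene A keeps its relative transcript abundances, while transcript B2 decreases although gene B as a whole increases.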
To investigate the mechanisms of this non-linear scaling, we studied each gene’s transcript expression pattern (TEP). We classified a transcript as non-linearly scaling if it satisfied (a) adjusted p-value < 0.1 and (b) a fold-change such that $|\log_2 FC_t - \log_2 FC_{global}| > 1$ between the “high” and “low” conditions. $FC_{global}$ refers to the global scaling and is given by the independent poly(T) RNA FISH experiment as the inverse of the factor used to correct TPM values in the “low” condition.
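The two criteria can be written as a small predicate. The sketch below is illustrative only: the function names are hypothetical, and the correction factor of 0.5 is a made-up value chosen to make the arithmetic easy to follow, not the factor measured in the FISH experiment.

```python
import math


def log2_global_fc(low_correction_factor):
    """log2 of FC_global, taken as the inverse of the factor applied to "low" TPM values."""
    return math.log2(1.0 / low_correction_factor)


def is_nonlinearly_scaling(log2_fc_t, padj, log2_fc_global,
                           padj_cutoff=0.1, lfc_cutoff=1.0):
    """True if (a) padj < 0.1 and (b) |log2FC_t - log2FC_global| > 1."""
    return padj < padj_cutoff and abs(log2_fc_t - log2_fc_global) > lfc_cutoff


# Hypothetical correction factor of 0.5, i.e. FC_global = 2 and log2FC_global = 1.
lfc_global = log2_global_fc(0.5)

print(is_nonlinearly_scaling(-0.5, 0.01, lfc_global))  # deviates by 1.5 and is significant
print(is_nonlinearly_scaling(1.4, 0.01, lfc_global))   # deviates by only 0.4
print(is_nonlinearly_scaling(3.0, 0.50, lfc_global))   # deviates by 2.0 but is not significant
```

Only the first toy transcript meets both conditions; the other two fail on effect size and on significance, respectively.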
Transcription start site
One of the determinants of TEP variation could be transcription start site (TSS) choice [169]: as discussed, chromatin organization is highly ATP-dependent, so mitochondrial content could modulate the availability of different polymerase binding sites.
Fig. 2.4. Non-linear mitochondrial regulation of gene expression. A. Fold-change of the expression levels (“high”/“low”) of genes (x axis, log2 FC gene) and their corresponding transcripts (y axis, log2 FC transcript) in HeLa cells (Pearson’s correlation coefficient r = 0.43). The shaded area corresponds to $|x - y| \le 1$. Roughly 82% of the genes have all their transcripts within the shaded region, meaning there is a prominent fraction (∼18%) of genes with at least one transcript whose FC differs significantly from the gene’s FC. B-C. Expression levels (TPM) in the “low” (purple) and “high” (orange) mitochondria subpopulations for the transcripts of two genes (EEF1B2 and MBP) in HeLa cells. Each dot is a transcript. For the first gene, expression increases with mitochondrial mass and the relative abundances of the transcripts are maintained. One of the transcripts of the second gene (indicated by the black arrow), on the other hand, shows an inversion: instead of following the global scaling trend, its expression is decreased in the “high” condition. Error bars are standard deviations across replicates. D. Percentage of shared transcription start sites in HeLa, Jurkat and MRC-5 cells for transcripts of the same gene with different regulation (gray, “Opposite regulation”) and for a null ensemble of transcripts chosen arbitrarily (white, “Control”). Error bars represent standard deviations computed by bootstrapping.

To test this hypothesis, we selected the genes with at least one linearly and one non-linearly scaling transcript. We then looped through those genes, randomly selecting two transcripts from each (one linearly and one non-linearly scaling) and comparing their TSS. We considered two TSS to be the same if they fell within 100 bp of each other, and different otherwise. As a control, we repeated this process, this time looping through all genes and selecting transcripts randomly, creating a “null” ensemble of TSS distances. We did this 20 times per cell line to obtain statistically meaningful results.
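The sampling procedure described above can be sketched as follows. This is a minimal illustration under several assumptions: the data layout and function names are hypothetical, TSS positions are assumed to be coordinates on the same chromosome and strand, “within 100 bp” is read as a distance of at most 100, and the spread is taken over the 20 random repeats rather than computed with a full bootstrap.

```python
import random
import statistics

TSS_WINDOW = 100  # bp; two TSS closer than this are treated as the same site


def share_tss(tss_a, tss_b, window=TSS_WINDOW):
    return abs(tss_a - tss_b) <= window


def shared_fraction_mixed(genes, rng):
    """One linear and one non-linear transcript per gene; fraction of pairs sharing a TSS.

    genes: dict gene_id -> list of (tss_position, is_nonlinear).
    """
    shared = total = 0
    for txs in genes.values():
        linear = [tss for tss, nonlinear in txs if not nonlinear]
        nonlinear = [tss for tss, nl in txs if nl]
        if not linear or not nonlinear:
            continue  # gene lacks one of the two classes
        shared += share_tss(rng.choice(linear), rng.choice(nonlinear))
        total += 1
    return shared / total if total else float("nan")


def shared_fraction_null(genes, rng):
    """Null ensemble: two transcripts per gene, picked regardless of their scaling."""
    shared = total = 0
    for txs in genes.values():
        if len(txs) < 2:
            continue
        (tss_a, _), (tss_b, _) = rng.sample(txs, 2)
        shared += share_tss(tss_a, tss_b)
        total += 1
    return shared / total if total else float("nan")


def repeat(fraction_fn, genes, n_repeats=20, seed=0):
    """Mean and standard deviation of the shared fraction over repeated random draws."""
    rng = random.Random(seed)
    values = [fraction_fn(genes, rng) for _ in range(n_repeats)]
    return statistics.mean(values), statistics.stdev(values)


# Toy, made-up coordinates (not data from this work).
toy_genes = {
    "g1": [(1000, False), (1050, True)],  # TSS 50 bp apart: shared
    "g2": [(5000, False), (9000, True)],  # TSS 4 kb apart: not shared
}
print(repeat(shared_fraction_mixed, toy_genes))
print(repeat(shared_fraction_null, toy_genes))
```

With real annotations, the two functions would differ because the null ensemble also draws pairs of transcripts that scale in the same way.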
We found that, in all three cell lines, roughly 50% of the transcripts of the same gene share their TSS, but this percentage goes down to ∼30% when comparing transcripts with linear and non-linear scaling (fig. 2.4D). This indicates that TSS choice contributes, to a large extent, to the mitochondrial control of TEPs. However, the fact that a fraction of transcripts with different scaling still share the same TSS (or at least have TSS that lie very close to each other) means that there must be additional layers of regulation.
Secondary structure... and transcript length?
One mechanism that has been proposed for mitochondrial control of alternative splicing is the variation of elongation speed leading to different precursor mRNA secondary structures [69, 72] (fig. 2.5A-B). Secondary structure has long been regarded as a key determinant of splice form choice [170, 171].
Addressing whether such a mechanism is plausible requires the systematic computation of pre-mRNA secondary structures. There are many tools that aim to predict RNA folding from the nucleotide sequence [172]. In general, they work under the premise that the spatial configuration of an RNA molecule should minimize the free energy, although newer methods with different approaches and increased accuracy also exist [173]. Formally speaking, genes coding for pre-mRNAs with at least two possible secondary structures of similar free energy should be more prone to regulation through elongation speed variation. If the changes in the kinetic energy of elongating polymerases induced by mitochondrial variability are comparable to the differences in free energy between alternative pre-mRNA folding configurations, the mechanism would be plausible.
Yet addressing this question on a genome-wide scale would require an immense amount of computational resources. Additionally, the tools for secondary structure prediction generally have limited precision. A more convenient way to study this matter is to look at transcript lengths. The number of possible secondary structures that an RNA can adopt grows exponentially with its length [174, 175], so we expect longer pre-mRNAs to have a higher chance of undergoing conformational changes induced by elongation speed variation.
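To give a feel for this growth, the sketch below counts secondary structures with a standard combinatorial recursion: the last base is either unpaired or paired with an earlier base, which splits the chain into two independent segments. It is an illustration under simplifying assumptions, not a calculation taken from the cited works: any two bases are allowed to pair regardless of sequence, pseudoknots are excluded, and the `min_loop` parameter (minimum number of unpaired bases enclosed by a pair) is a hypothetical choice. The result is therefore an upper bound for any real sequence.

```python
def count_secondary_structures(n, min_loop=3):
    """Number of pseudoknot-free pairing patterns on a chain of n bases.

    Sequence-independent upper bound: every pair (i, j) is allowed as long as
    at least `min_loop` unpaired bases lie between i and j.
    """
    counts = [1] * (n + 1)  # chains too short to form a pair have one structure
    for length in range(min_loop + 2, n + 1):
        total = counts[length - 1]  # last base left unpaired
        for k in range(1, length - min_loop):  # last base paired with base k
            # k - 1 bases lie before the pair, length - k - 1 bases inside it
            total += counts[k - 1] * counts[length - k - 1]
        counts[length] = total
    return counts[n]


for n in (10, 20, 40, 80):
    print(n, count_secondary_structures(n))
```

Even for chains of a few tens of bases the count spans many orders of magnitude, which is why exhaustive enumeration for full-length pre-mRNAs is out of reach.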
Fig. 2.5. Effect of gene length on mitochondrial modulation of expression. A-B. Mitochondrial content (A, low; B, high) modulates the elongation speed of polymerases (gray) along DNA strands (red), which is a determinant of mRNA (green) secondary structure. C. Distribution of expressed protein-coding gene lengths (bp) in HeLa. The 10% shortest and longest genes are indicated in dark and light red, respectively. D. Percentage of genes with altered TEP among the shortest (dark red) and longest (light red) genes in HeLa, Jurkat and MRC-5 cells. Similar fractions have their TEPs altered by mitochondria in HeLa and Jurkat. Strikingly, in MRC-5 there is a higher fraction of genes with altered TEPs among the shortest ones, contrary to our expectation based on the secondary structure hypothesis. Error bars were computed by bootstrapping.

We selected two subgroups of protein-coding genes with lengths within the top or bottom 10% of the complete set (fig. 2.5C). We then compared the fraction of genes within each subgroup that showed alterations in their relative transcript abundances (i.e. altered TEP) between the “high” and “low” mitochondria conditions (fig. 2.5D). We found no variation across subgroups in HeLa and Jurkat cells. This does not imply that secondary structure plays no role in these cell lines but rather that, if it does, its effect is not prominent enough to be noticeable by just looking at precursor RNA lengths.
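The subgroup comparison can be sketched as below. This is a hedged illustration rather than the analysis code: the data layout, function names and number of bootstrap resamples are assumptions, and the toy gene lengths are made up.

```python
import random
import statistics


def length_extremes(genes, fraction=0.10):
    """Return the shortest and longest `fraction` of genes.

    genes: list of (length_bp, has_altered_tep).
    """
    ordered = sorted(genes, key=lambda gene: gene[0])
    k = max(1, int(len(ordered) * fraction))
    return ordered[:k], ordered[-k:]


def altered_fraction(group):
    """Fraction of genes in the group whose TEP is altered."""
    return sum(1 for _, altered in group if altered) / len(group)


def bootstrap_sd(group, n_boot=1000, seed=0):
    """Standard deviation of the altered fraction over resamples with replacement."""
    rng = random.Random(seed)
    fractions = [altered_fraction(rng.choices(group, k=len(group)))
                 for _ in range(n_boot)]
    return statistics.stdev(fractions)


# Toy, made-up genes: lengths 1 kb to 20 kb, only the two extremes have an altered TEP.
toy_genes = [(1000 * i, i in (1, 20)) for i in range(1, 21)]
short_genes, long_genes = length_extremes(toy_genes)
for label, group in (("short", short_genes), ("long", long_genes)):
    print(label, altered_fraction(group), bootstrap_sd(group))
```

Comparing the two fractions together with their bootstrap spreads is what indicates whether a difference between short and long genes, such as the one seen in MRC-5, exceeds sampling noise.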
Surprisingly, MRC-5 cells behaved differently: non-linear mitochondrial regulation of transcript abundances was more prominent in shorter genes. It is likely that our assumption connecting pre-mRNA size to secondary structure was too generic in this case: many other layers of energy-dependent regulation could be at play, such as expression cost increasing with length, and neglecting them may be too rough an approximation. In any case, the finding that the relationship between pre-mRNA length and mitochondrial regulation is cell-line-specific is remarkable in itself, and hints at a mechanism that is less general than elongation speed variation. Further experimental work (e.g. including more cell lines or measuring the relative abundances of well-characterized, alternatively folded forms of specific pre-mRNAs) is required to characterize the effect, if any, of elongation speed on secondary structure.