My next objective was to explore whether VSMCs from the AA and DT regions showed heterogeneity within each vascular bed and, if so, which genes were highly variable. Variability in observed gene expression levels between cells from the same vascular bed is due to both technical and biological effects. The low amount of starting material is one of the main technical factors contributing to the resulting cell-to-cell variability (Brennecke et al. 2013). Such technical variability depends on the mean expression level of a gene and decreases for highly
expressed genes, as higher transcript levels are captured more consistently across cells and dropout effects become less dominant. I was interested in genes whose expression varied beyond the level expected from technical noise, given their mean expression level.
To assess which genes were expressed heterogeneously, I adopted an approach for identifying highly variable genes (Lun et al. 2016) based on the relationship between the variance and the mean log expression of profiled genes. To estimate the background technical variation, I fitted a parametric trend to the variance of log expression levels versus the mean log expression levels of all genes. This method assumes that the large majority of genes are not variably expressed between cells, so that the minority of genes that are expressed heterogeneously does not skew the background estimate. The technical background variation was then subtracted from the total variance of each gene. A gene was identified as highly variable if the remaining biological component of variance was significantly greater than zero (adjusted p-value < 0.05; details in Methods; Lun et al. 2016). This resulted in 176 genes being identified as highly variable in the AA region and 120 in the DT region (Figure 3.14), with 65 of them common to both regions. Interestingly, only two of the genes identified as highly variable in at least one region (Wif1 and Lum) were also identified as differentially expressed between AA and DT (Section 3.2.2), suggesting that heterogeneity within each vascular region is largely driven by factors independent of regional identity.
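The variance decomposition described above can be illustrated with a minimal Python sketch. It is not the implementation used for this analysis (see Methods and Lun et al. 2016) and it rests on several assumptions: a running median over genes of similar mean expression stands in for the parametric trend, the significance test is assumed to be a chi-squared test on the ratio of total to technical variance (approximated with the Wilson-Hilferty formula), p-values are adjusted with the Benjamini-Hochberg procedure, and the expression matrix is simulated. All function names and parameter values are hypothetical.

```python
import math
import random
import statistics


def running_median_trend(means, variances, window=51):
    """Technical variance per gene: median variance of the genes with the
    closest mean expression (by rank). Stand-in for a parametric trend."""
    order = sorted(range(len(means)), key=lambda g: means[g])
    half = window // 2
    trend = [0.0] * len(means)
    for rank, g in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        trend[g] = statistics.median(variances[j] for j in order[lo:hi])
    return trend


def chi2_upper_tail(x, df):
    """Approximate P(X >= x) for X ~ chi-squared(df) (Wilson-Hilferty)."""
    c = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [1.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def highly_variable_genes(log_expr, alpha=0.05, window=51):
    """log_expr: one list of log-expression values (across cells) per gene.
    Returns indices of genes whose biological variance component is
    positive and significant at the given adjusted p-value threshold."""
    df = len(log_expr[0]) - 1
    means = [statistics.fmean(g) for g in log_expr]
    variances = [statistics.variance(g) for g in log_expr]
    technical = [max(t, 1e-12) for t in
                 running_median_trend(means, variances, window)]
    biological = [v - t for v, t in zip(variances, technical)]
    pvalues = [chi2_upper_tail(df * v / t, df)
               for v, t in zip(variances, technical)]
    adjusted = benjamini_hochberg(pvalues)
    return [g for g in range(len(log_expr))
            if biological[g] > 0 and adjusted[g] < alpha]


# Simulated data (hypothetical): 300 genes x 100 cells. Technical noise
# shrinks with mean expression; the first 10 genes carry extra variability.
random.seed(1)
n_genes, n_cells, n_spiked = 300, 100, 10
log_expr = []
for g in range(n_genes):
    mu = random.uniform(1.0, 10.0)
    sd = 1.0 / math.sqrt(mu)
    if g < n_spiked:
        sd *= 3.0  # added cell-to-cell heterogeneity
    log_expr.append([random.gauss(mu, sd) for _ in range(n_cells)])

detected = highly_variable_genes(log_expr)
print(sorted(detected))
```

Because the trend is estimated from all genes, the approach only works when heterogeneously expressed genes are a small minority, which is the assumption stated above; the running median used here is similarly robust to a few outlying genes within each window.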
Figure 3.14: Highly variable genes in AA and DT regions.
Scatterplots showing the variance of log2-transformed normalised read counts versus mean log2-transformed normalised read counts, with each black dot representing a gene. Genes identified as highly variable in the AA (red, left) and the DT (yellow, right) are indicated. The blue line represents the estimated technical variance. The figure is from Dobnikar and Taylor et al. 2018.
To assess the nature of the genes identified as highly variable, I used gene ontology analysis. Highly variable genes within the AA region showed enrichment for “positive regulation of vasculature development” (genes Pdgfd, Hspb6, Nras, Hk2, Efnb2, Adm, Anxa1, F3, Rapgef2, Myocd) and “regulation of cell growth” (genes Cd44, Thrb, Fn1, Rab11a, Rtn4, Ddx3x, Tram1, Rbbp7, Gja1, Sdcbp, Sgk1, Myocd, Efna5, Rgs4) among enriched gene ontology terms (Figure 3.15). Highly variable genes within the DT region showed enrichment for the gene ontology term “regulation of cell population proliferation” (genes Cd44, Thrb, Itga1, Stat1, Tes, Wisp2, Pdgfd, Atf3, Rtn4, Anxa2, Dsp, Gpx1, Nfkbia, Fth1, Rgs5, Efnb2, Nupr1, Ctnnb1, Asph, Cdh13, Anxa1, Gja1, Cnbp, Hipk1, Rps6kb1). In addition, a number of genes identified as highly variable were involved in processes that play an important role in VSMC biology in disease, such as proliferation and migration. While the gene ontology terms “regulation of cell proliferation” and “regulation of cell migration” were not significantly enriched among the highly variable genes in the AA or DT regions, a number of highly variable genes were associated with these terms. Forty of the genes identified as highly variable were associated with the gene ontology term “regulation of cell proliferation”, including Rgs5, Gja1, Pdgfd, Irf1, Anxa1, Anxa2, Myocd, Fn1 and Nfkbia. Twenty-one of the highly variable genes in either region mapped to the gene ontology term “regulation of cell migration”, including Pdgfd, Anxa1, Anxa3, Myocd, Fn1, Adamts1, Gja1 and Postn among others. Expression profiles for some of the highly variable genes are shown in Figure 3.16.
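The distinction drawn above, between a term being significantly enriched and a number of genes merely being annotated to it, can be illustrated with a one-sided hypergeometric test, a common basis for gene ontology over-representation analysis. The sketch below is an illustration under stated assumptions rather than the procedure used here (details in Methods): the function name is hypothetical and every count in the example is a made-up placeholder, not a value from this dataset.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn without replacement from a
    background of N genes, K of which are annotated to the term."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return hits / total


# Placeholder counts (hypothetical, for illustration only):
N = 10000  # genes in the background set
K = 1500   # background genes annotated to the term
n = 200    # genes in the list of interest
k = 35     # genes in the list annotated to the term

expected = n * K / N  # overlap expected by chance
p_value = hypergeom_upper_tail(k, K, n, N)
print(f"expected overlap {expected:.1f}, observed {k}, p = {p_value:.3f}")
```

With these placeholder counts the observed overlap is only modestly above the chance expectation, so a sizeable number of annotated genes does not by itself imply significant enrichment; in practice the p-values would also be adjusted for the number of terms tested.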
Figure 3.15: Gene ontology analysis of highly variable genes in the AA and DT regions.
Enrichment for gene ontology terms (biological process) among genes identified as highly variable in the AA (a) and DT (b) regions. Bar plots show the significantly enriched gene ontology terms, ranked by their adjusted p-values (details in Methods).
Several of the genes identified as highly variable within the AA or DT regions have previously been investigated in the context of cardiovascular disease. For example, Anxa1 has been observed to be expressed at lower levels in VSMCs isolated from asymptomatic compared with symptomatic human plaques (Viiri et al. 2013). Pdgfd is an activator of the PDGF receptor β, and overexpression of Pdgfd in transgenic mice was observed to lead to increased VSMC proliferation and vascular remodelling (Pontén et al. 2005). Rgs5 was observed to promote atherosclerotic plaque formation, and its expression was found to be increased during arterial remodelling (Arnold et al. 2014). Rgs5 is also involved in the regulation of blood pressure and VSMC contraction (Gunaje et al. 2011). Overall, this analysis shows that VSMCs from the same vascular region show heterogeneous expression of a range of genes related to biological functions that are important for VSMC biology in healthy arteries as well as in disease.
Figure 3.16: Expression levels of highly variable genes in profiled VSMCs.
Violin plots showing the distribution of log2-transformed normalised read counts of selected highly variable genes across the cells from the AA (red) and the DT (yellow). The median is marked with a grey line.