Mapping eQTLs using total expression may be performed by linear regression, in which expression levels are compared between groups of individuals who have zero, one, or two copies of the putative cis-acting allele. Mapping aeQTLs is more
complex and requires genotypes to be phased. The principles underlying this process are outlined below.
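For contrast with the allelic approach, the total-expression regression mentioned above can be sketched as an ordinary least-squares fit of expression on allele count. This is a minimal illustration only: the function name `dosage_regression` and the data values are hypothetical, and a real analysis would also include covariates and a significance test.

```python
from statistics import mean

def dosage_regression(dosages, expression):
    """Least-squares fit of expression on allele count (0, 1 or 2)."""
    x_bar, y_bar = mean(dosages), mean(expression)
    sxx = sum((x - x_bar) ** 2 for x in dosages)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosages, expression))
    slope = sxy / sxx                  # change in expression per allele copy
    intercept = y_bar - slope * x_bar  # fitted expression with zero copies
    return slope, intercept

# Hypothetical data: copies of the putative cis-acting allele and the
# total expression level measured in each individual.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 1.5, 1.7, 2.0, 2.2]
slope, intercept = dosage_regression(dosages, expression)
```

A slope that differs from zero indicates that expression changes with the number of copies of the allele.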
Chapter 1: Introduction
Once the presence of AEI has been demonstrated at a transcribed polymorphism, providing evidence of cis-acting regulation of expression, the goal is to identify the causative polymorphisms responsible for (or at least predictive of) expression
differences. In this context, AEI is characterised not only by its presence or absence, but also by which of the alleles is over-expressed and by the extent of the differential expression, quantified using the allelic expression ratio (AER). The principle behind the analysis is to compare the observed AER values with those that would be
predicted under different assumptions. The simplest case is a test for association between the transcribed marker itself and AER, in which predictions from two models are compared. The first model assumes that allelic expression levels are independent of which allele is present at the transcribed locus (i.e. there is no
association between AER and genotype), in which case individual AER values would not deviate systematically from a 1:1 ratio. The alternative model is that one of the alleles is preferentially overexpressed, in which case a systematic deviation from a 1:1 ratio would be expected. Any test that compares the mean AER with a 1:1 ratio (such as a t-test) would be suitable to assess association.
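The comparison of the mean AER with a 1:1 ratio can be sketched as a one-sample t statistic on log-transformed ratios, where a 1:1 ratio corresponds to a mean log AER of zero. This is an illustrative sketch: the function name `one_sample_t` and the AER values are hypothetical, and the log transformation is an assumption made so that over- and under-expression are treated symmetrically.

```python
import math
from statistics import mean, stdev

def one_sample_t(aer_values):
    """t statistic for H0: mean log AER = 0, i.e. a 1:1 allelic ratio."""
    logs = [math.log(r) for r in aer_values]
    n = len(logs)
    standard_error = stdev(logs) / math.sqrt(n)
    t = mean(logs) / standard_error
    return t, n - 1  # statistic and its degrees of freedom

# Hypothetical AER values from six heterozygous individuals.
aers = [1.3, 1.5, 1.2, 1.4, 1.6, 1.1]
t_stat, df = one_sample_t(aers)
```

The statistic would then be compared with a t distribution on `df` degrees of freedom. Inverting every ratio changes only the sign of the statistic, which is the reason for working on the log scale.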
A more complicated situation arises when assessing the cis-acting effect of a polymorphism that is not in the transcript. In this situation two polymorphisms need to be considered: the transcribed one and the potential cis-acting one. By
experimental design all individuals are heterozygous at the transcribed SNP. An association between genotype at the cis-acting site and expression can be detected if AER differs between individuals with different genotypes at the cis-acting locus. As above, the first model assumes that AER is independent of which allele is present at the cis-acting locus, hence individual AER values would not deviate from a specific ratio (1:1 if the transcribed marker has no effect). The second model assumes that genotype at the candidate cis-acting locus influences expression. This concept is illustrated in
Figure 1.10. The alleles at the transcribed marker are designated ‘M’ and ‘m’, and the alleles at the putative cis-acting marker as ‘C’ and ‘c’. In this example the transcribed polymorphism has no effect and the ‘C’ allele at the cis-acting locus causes
overexpression of the transcript from the same chromosome. Alleles ‘MC’ and ‘mC’ will be expressed at the same level as each other, and overexpressed compared to ‘Mc’ and ‘mc’ (which are both expressed at the same lower level). Since only relative transcript levels at the transcribed polymorphism are compared, AEI will only be detected in individuals who are heterozygous at the cis-acting site (since genotypes ‘CC’ and ‘cc’ will have the same cis-acting effect on each allele of the transcribed polymorphism and therefore the ratio is 1:1 for both of these homozygotes). If the
cis-acting polymorphism influences expression, the AER in individuals who are
heterozygous for the cis-acting polymorphism deviates from that seen in homozygotes at the cis-acting site, as shown in
Figure 1.10. Since the ‘C’ allele causes overexpression, the ‘MC/mc’ phased genotype would cause relative overexpression of the ‘M’ allele, while the genotype ‘Mc/mC’ would cause the ‘m’ allele to be overexpressed. Therefore, determining the effect of the putative cis-acting polymorphism requires the phase between the cis-acting and transcribed polymorphisms to be estimated (i.e. the probability that individuals who are heterozygous at both sites have the genotype ‘MC/mc’ or ‘Mc/mC’). Phase and effect can be estimated either simultaneously or in separate steps. The analysis is performed as a likelihood ratio test, comparing the likelihood of the observations assuming no effect of the putative cis-acting
polymorphism with the likelihood of the observations assuming that the polymorphism has an effect. It assumes that the variance of the
observations is independent of the genotype, that the ratios follow a log-normal distribution, and that the mean AER effects observed are representative of the true effects. The last of these assumptions means that the sample size has to be large enough to assess the effect of a given genotype accurately, which restricts the number of genotypes that can be reliably analysed for a given sample size.
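The likelihood ratio comparison can be sketched as follows, under the stated assumptions of log-normal ratios and a variance that does not depend on genotype. This sketch makes a simplification that the analysis described above does not: phase is treated as known, and is coded +1 for ‘Mc/mC’, -1 for ‘MC/mc’ and 0 for homozygotes at the cis-acting site. The function names, the coding and the data are all hypothetical.

```python
import math

def fit_rss(x, y, with_effect):
    """Residual sum of squares for log AER = mu0 (+ mu1 * x if with_effect)."""
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    mu1 = sxy / sxx if with_effect else 0.0
    mu0 = y_bar - mu1 * x_bar
    rss = sum((yi - mu0 - mu1 * xi) ** 2 for xi, yi in zip(x, y))
    return rss, mu1

def likelihood_ratio_test(phase_codes, log_aers):
    """Compare 'no cis-acting effect' with 'cis-acting effect' models."""
    n = len(log_aers)
    rss_null, _ = fit_rss(phase_codes, log_aers, with_effect=False)
    rss_alt, mu1 = fit_rss(phase_codes, log_aers, with_effect=True)
    # For normal errors with a common variance, twice the difference in
    # maximised log-likelihoods reduces to n * log(RSS_null / RSS_alt).
    stat = max(n * math.log(rss_null / rss_alt), 0.0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value, mu1

# Hypothetical log(m/M) values: three 'Mc/mC', three 'MC/mc', four homozygotes.
phase_codes = [1, 1, 1, -1, -1, -1, 0, 0, 0, 0]
log_aers = [0.35, 0.42, 0.30, -0.38, -0.29, -0.41, 0.03, -0.05, 0.02, -0.01]
stat, p_value, mu1 = likelihood_ratio_test(phase_codes, log_aers)
```

When phase is uncertain, the likelihood for each doubly heterozygous individual would instead be a mixture over the two possible phased genotypes, weighted by their estimated probabilities.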
Figure 1.10. Effect of cis-acting SNPs on AER at the transcribed marker.
All individuals are heterozygous for the transcribed marker (i.e. ‘M/m’). The four phase-known genotypes and the corresponding three phase-unknown genotypes are represented on the horizontal axis. The vertical diamonds represent the distribution of log AERs for each genotype, with the horizontal bar representing the mean. Cis-acting differences will be seen only in those individuals heterozygous at the cis-acting locus, hence the mean log AER is zero (corresponding to an AER of 1:1) in both ‘CC’ and ‘cc’ homozygous groups. The cis-acting effect is seen as a deviation from the 1:1 ratio in ‘Cc’ heterozygotes, but the direction and magnitude (µ1) of this effect can only be estimated
once the genotypes are phased (when it can be seen that the ‘C’ allele at the cis-acting SNP causes overexpression). Figure adapted from Teare et al. (ref. 254).
[Figure 1.10 axis labels: vertical axis, log AER (m/M ratio); 0 corresponds to a 1:1 ratio, values above 0 to an m/M ratio >1 and values below 0 to an m/M ratio <1.]
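The pattern described in the caption can be reproduced with a short sketch that returns the expected log(m/M) ratio for each phased genotype. The function name `expected_log_aer` and the numerical value of `effect` are hypothetical. `effect` plays the role of the magnitude µ1, and the transcribed marker is assumed to have no effect of its own.

```python
def expected_log_aer(hap_with_M, hap_with_m, effect):
    """Expected log(m/M) ratio in an individual heterozygous 'M/m'.

    hap_with_M, hap_with_m: allele ('C' or 'c') at the cis-acting site on
    the chromosome carrying 'M' and on the chromosome carrying 'm'.
    effect: log fold increase in expression caused by 'C' (hypothetical).
    """
    log_level = {"C": effect, "c": 0.0}
    return log_level[hap_with_m] - log_level[hap_with_M]

effect = 0.4  # hypothetical magnitude of the cis-acting effect
for label, hap_M, hap_m in [("MC/mC", "C", "C"), ("MC/mc", "C", "c"),
                            ("Mc/mC", "c", "C"), ("Mc/mc", "c", "c")]:
    print(label, expected_log_aer(hap_M, hap_m, effect))
```

Both homozygous groups give an expected value of zero, while the two phased heterozygotes give values of equal magnitude and opposite sign. If the two phases are pooled without being distinguished, the direction of the effect is therefore lost.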