Expression studies at the chromosome 9p21 locus included two populations of healthy adult volunteers: the 310 individuals of the SA cohort and the 177 individuals from the NE Caucasian cohort. Preliminary feasibility and optimisation work involving
MLH1 expression was performed using samples from affected cases and unaffected
controls of the CAPP study. Details of these populations and the nucleic acid preparation were presented in Chapter 2.
3.3.2 Selection of transcribed SNPs for allelic expression analysis
Using the NCBI Entrez Gene database [196], transcribed transversion SNPs with
expected heterozygosity >0.2 in Caucasian populations were selected as suitable candidates for assessment of allelic expression. Variants lacking population
frequency data were excluded. Insertion/deletion polymorphisms were also excluded since these would alter the size of the PCR product produced for each allele, which
could bias the AEI estimate. The NCBI [196] and Ensembl [43] databases were also used to
establish the location of transcribed SNPs with respect to reported transcript variants of each gene.
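The selection criteria above can be sketched as a simple filter. This is a minimal illustration only: the `Snp` record, its field names and the example identifiers are hypothetical, and expected heterozygosity is assumed to be computed as 2pq from a biallelic minor allele frequency rather than read directly from the database.

```python
from dataclasses import dataclass
from typing import Optional

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


@dataclass
class Snp:
    """Hypothetical record for a transcribed variant (field names are assumptions)."""
    rsid: str
    ref: str
    alt: str
    minor_allele_freq: Optional[float]  # None when no population frequency data exist


def is_transversion(ref: str, alt: str) -> bool:
    """True for a single-base purine <-> pyrimidine substitution."""
    if len(ref) != 1 or len(alt) != 1:
        return False  # insertion/deletion polymorphisms are excluded
    a, b = ref.upper(), alt.upper()
    return (a in PURINES and b in PYRIMIDINES) or (a in PYRIMIDINES and b in PURINES)


def expected_heterozygosity(maf: float) -> float:
    """Expected heterozygosity 2pq for a biallelic SNP (assumed definition)."""
    return 2.0 * maf * (1.0 - maf)


def is_candidate(snp: Snp, min_het: float = 0.2) -> bool:
    """Apply the three filters: frequency data present, transversion, heterozygosity > min_het."""
    if snp.minor_allele_freq is None:
        return False
    if not is_transversion(snp.ref, snp.alt):
        return False
    return expected_heterozygosity(snp.minor_allele_freq) > min_het


# Hypothetical variants, for illustration only.
variants = [
    Snp("snp_a", "C", "G", 0.20),   # transversion, heterozygosity 0.32
    Snp("snp_b", "C", "T", 0.40),   # transition
    Snp("snp_c", "A", "T", 0.10),   # transversion, heterozygosity 0.18
    Snp("snp_d", "A", "AT", 0.30),  # insertion
    Snp("snp_e", "G", "T", None),   # no frequency data
]
selected = [v.rsid for v in variants if is_candidate(v)]
```

With these hypothetical inputs only `snp_a` passes all three filters.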
Transcribed polymorphisms in ANRIL, which was not annotated in the databases at
the time of the design, were identified by comparing the reported mRNA sequence [195]
with NCBI dbSNP using the BLAST tool. The expected heterozygosity of ANRIL exonic SNPs identified in this way was then checked in dbSNP.
Transcribed SNPs selected using these criteria were: rs3088440 and rs11515 in exon 3 of CDKN2A; rs3217992 and rs1063192 in exon 2 of CDKN2B; and rs10965215 and rs564398 in exon 2 of ANRIL. The two CDKN2A SNPs are also present in ARF, allowing the assessment of cis-acting influences on both of these transcripts. Another SNP, rs10738605 in exon 6 of ANRIL, also satisfied these criteria but was excluded from initial AEI experiments because of extensive skewing of the allelic ratio in genomic DNA.
Chapter 3: Preliminary AEI studies
94
3.3.3 Selection of mapping SNPs
3.3.3.1 SNPs associated with disease
SNPs previously reported to be associated with disease phenotypes were selected for
mapping effects on expression [124, 125, 160, 169-173, 175-182, 312-314]. GWA studies for CAD
showed association for multiple SNPs which were in strong LD in the chromosome 9p21 region. To reduce redundancy of genotyping but ensure inclusion of the most important SNPs, the ‘lead’ SNPs showing the strongest association with disease, subsequent refinement SNPs, and SNPs reported as tagging the risk haplotype were selected from GWA studies. All SNPs reported to have significant associations with disease phenotypes in candidate gene studies (as previously summarised in Table 1.4 on page 33) were included.
3.3.3.2 SNPs altering transcription factor binding sites within regulatory regions
SNPs within previously reported regulatory elements such as the CDKN2A, ARF and
CDKN2B promoters [186-189] or a putative ANRIL promoter region (which was arbitrarily
defined as 1 kb upstream and downstream of the transcription start site for the purposes of this study) were also selected. SNPs in these regions were included if they were reported more than once in NCBI dbSNP, had expected heterozygosity >5%, and had alternative alleles predicted to alter human transcription factor binding sites
using PROMO v.3.0.2 software [246, 247].
3.3.3.3 Tag SNPs to capture common variation in the region
Additional tag SNPs required to capture common variation in the core region of interest (Chr9:21958155-22115505), based on HapMap CEU data, were selected with HaploView 4.0 Tagger software using the following parameters: minimum
minor allele frequency 0.01, pairwise tagging, r² threshold >0.8. Transcribed or
functional SNPs already selected for genotyping were ‘force included’ as tag SNPs in Tagger to reduce redundancy of genotyping. Tagger output SNPs were manually checked for potential problems with Sequenom genotyping (e.g. rs6475608 had multiple deletions/repeats which would be problematic for Sequenom SNP analysis). Tagging was then repeated with SNPs likely to be problematic ‘force excluded’ from
selection, to obtain a final panel of typable SNPs which adequately tagged the region. The SNPs selected for genotyping and reasons for including them are shown in the genotyping results summary table in the next chapter (Table 4.2 on page 130).
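The logic of pairwise tagging with forced inclusions and exclusions can be illustrated with a greedy sketch. This is not Tagger's own implementation: the function name, the dictionary of pairwise r² values and the SNP labels are hypothetical, and the greedy "pick the SNP that captures most remaining SNPs" rule is an assumption made for illustration.

```python
def greedy_pairwise_tags(snps, r2, threshold=0.8, force_include=(), force_exclude=()):
    """Greedy pairwise tag selection (illustrative sketch, not Tagger itself).

    snps: ordered list of SNP identifiers to be captured.
    r2: dict mapping frozenset({snp_a, snp_b}) to their pairwise r^2.
    Returns (tags, uncaptured), where uncaptured lists SNPs that no allowed tag captures.
    """
    excluded = set(force_exclude)

    def captures(tag, target):
        # A SNP always captures itself; otherwise require pairwise r^2 above the threshold.
        return tag == target or r2.get(frozenset((tag, target)), 0.0) > threshold

    # Force-included SNPs (e.g. those already chosen for genotyping) become tags first.
    tags = [s for s in force_include if s not in excluded]
    candidates = [s for s in snps if s not in excluded and s not in tags]
    uncaptured = {s for s in snps if not any(captures(t, s) for t in tags)}

    while uncaptured and candidates:
        # Pick the candidate that captures the most still-uncaptured SNPs.
        best = max(candidates, key=lambda c: sum(captures(c, s) for s in uncaptured))
        newly = {s for s in uncaptured if captures(best, s)}
        if not newly:
            break  # the remaining SNPs cannot be captured by any allowed candidate
        tags.append(best)
        candidates.remove(best)
        uncaptured -= newly
    return tags, sorted(uncaptured)


# Hypothetical four-SNP region: s1-s2 and s2-s3 are in strong LD, s4 is independent.
snps = ["s1", "s2", "s3", "s4"]
r2 = {
    frozenset(("s1", "s2")): 0.95,
    frozenset(("s2", "s3")): 0.90,
    frozenset(("s3", "s4")): 0.10,
}
default_tags, _ = greedy_pairwise_tags(snps, r2)
retagged, _ = greedy_pairwise_tags(snps, r2, force_exclude=["s2"])
```

In this toy example, excluding `s2` (as if it were untypable) forces both `s1` and `s3` into the panel, which mirrors why tagging was repeated after problematic SNPs were force excluded.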
3.3.4 Genotyping
Multiplex SNP genotyping was performed using Sequenom methodology as described in Chapter 2. The 56 SNPs were genotyped in five separate reactions (W1-W5).
3.3.5 AEI assays
PCR primers for the selected transcribed SNPs were designed manually and using
Primer3 (v.0.4.0) software [289]. For AEI analysis, PCR primers were designed to anneal
across exon boundaries, thus being specific for cDNA and not binding to genomic DNA. Primers for genomic DNA normalisation were designed to produce products as similar as possible in size and sequence to those produced by the cDNA-specific primers, to minimise differences in reaction kinetics. CDKN2A primers span exons 3-4 and include both transcribed SNPs (rs3088440 and rs11515) in the same amplicon.
ANRIL primers span exons 1-2 and include both transcribed SNPs (rs10965215 and
rs564398) in the same amplicon. For CDKN2B, the distance of the transcribed SNPs from the exon boundary meant that amplicons would be more than 1 kb in size, so separate primer pairs for transcribed SNPs rs1063192 and rs3217992 were designed entirely within exon 2. These CDKN2B primers were therefore not cDNA specific, and analysis was performed using cDNA from DNase-treated RNA. Measurements were performed in four replicates using 50 ng of cDNA template.
3.3.6 Genomic DNA normalisation assays
Genomic DNA normalisation reactions for CDKN2B used the same PCR primers as used for cDNA, but for CDKN2A and ANRIL (where primers were cDNA-specific) separate assays designed to be as close as possible in size and location to the cDNA primers were used. Genomic normalisation samples were checked both in uniplex reactions and multiplex combinations when these were used for the AEI analysis.
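One common way to apply a genomic DNA normalisation is sketched below. The exact calculation used in this work is the one described in Chapter 2; here it is assumed, for illustration only, that replicate allele ratios are averaged on the log scale and that the cDNA ratio is divided by the genomic DNA ratio, where the two alleles are expected to be present 1:1. The function names and signal values are hypothetical.

```python
import math
from statistics import mean


def log2_ratio(signal_a: float, signal_b: float) -> float:
    """log2 of the allele A to allele B signal ratio for one measurement."""
    return math.log2(signal_a / signal_b)


def normalised_aer(cdna_replicates, gdna_replicates) -> float:
    """Allelic expression ratio corrected for assay bias (assumed calculation).

    Each argument is a list of (signal_a, signal_b) tuples, one per replicate.
    Heterozygous genomic DNA should give a 1:1 ratio, so any deviation is treated
    as assay bias and removed from the cDNA ratio.
    """
    cdna_log = mean(log2_ratio(a, b) for a, b in cdna_replicates)
    gdna_log = mean(log2_ratio(a, b) for a, b in gdna_replicates)
    return 2.0 ** (cdna_log - gdna_log)


# Hypothetical signals: four cDNA replicates and four genomic DNA replicates.
cdna = [(2.4, 1.0), (2.4, 1.0), (2.4, 1.0), (2.4, 1.0)]
gdna = [(1.2, 1.0), (1.2, 1.0), (1.2, 1.0), (1.2, 1.0)]
aer = normalised_aer(cdna, gdna)
```

In this example the raw cDNA ratio of 2.4 is reduced to about 2.0 once the 1.2-fold bias seen in genomic DNA is removed.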
The appropriateness of genomic normalisation ratios and linearity of the AER response were checked by mixing PCR products in varying ratios from individuals
homozygous for the minor and major alleles at each SNP, and using these as template for the allelic expression assays. To do this, PCR was performed from cDNA samples homozygous for each transcribed SNP allele, using standard uniplex AEI protocols but with only 40 cycles. The concentration of the PCR product from each
homozygous sample was quantified in three replicates using PicoGreen. For each pair of alleles at each SNP, the product with the higher concentration was diluted with 1× TE to obtain the same concentration as the less concentrated product. Cycles of PicoGreen measurement and dilution were repeated until the
measured concentration of each sample was the same. Volumes of product from each allele pair were then mixed in known ratios (8:1, 4:1, 1:1, 1:4, 1:8), and for each mixture serial dilutions were performed with 1× TE until the molarity with respect to
the DNA amplicons roughly equated to the molarity of genomic DNA in a 20 ng/µL
solution. The resulting allele mixes were then used as template for PCR and AEI analysis following the standard protocol.
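The arithmetic behind this mixing experiment can be sketched as follows. This is an illustrative sketch under stated assumptions: linearity is assessed here by an ordinary least-squares fit of log2 observed ratio against log2 mixed ratio (a slope near 1 indicating a proportional response), which is one reasonable check and not necessarily the one applied in this work. The function names and example values are hypothetical.

```python
import math

# Allele mixing ratios stated in the text (allele 1 : allele 2).
MIX_RATIOS = [(8, 1), (4, 1), (1, 1), (1, 4), (1, 8)]


def diluent_volume(conc_high: float, conc_low: float, volume_high: float) -> float:
    """Volume of 1x TE to add to the more concentrated product so that both
    products reach the same concentration (from C1 * V1 = C2 * V2)."""
    return volume_high * (conc_high / conc_low - 1.0)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def linearity(observed_ratios):
    """Fit log2(observed ratio) against log2(mixed ratio) across the five mixes."""
    expected_log = [math.log2(a / b) for a, b in MIX_RATIOS]
    observed_log = [math.log2(r) for r in observed_ratios]
    return fit_line(expected_log, observed_log)


# Hypothetical example: 5 uL of a 30 ng/uL product matched to a 10 ng/uL product.
te_to_add = diluent_volume(30.0, 10.0, 5.0)

# Hypothetical observed ratios carrying a constant 1.2-fold assay bias.
observed = [1.2 * a / b for a, b in MIX_RATIOS]
slope, intercept = linearity(observed)
```

A constant assay bias shifts the intercept (here log2 of 1.2) but leaves the slope at 1, which is the situation a single genomic normalisation factor can correct.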
3.3.7 Statistical analyses
Allelic expression analysis was performed using the pre-phasing methodology described in Chapter 2. A novel approach of combining allelic ratios from the two transcribed markers in each gene was used to increase the number of informative heterozygotes: AERs measured at both transcribed SNPs were included in the analysis of each gene. For individuals who were heterozygous for both transcribed SNPs, information from both was included, allowing for different variances at each transcribed marker. The best-fitting parameters were determined using likelihood maximisation algorithms.
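The idea of combining two transcribed markers while allowing each its own variance can be illustrated with a deliberately simplified model. The sketch below is not the model of Chapter 2, which relates AERs to phased mapping-SNP genotypes. It assumes only that log AERs at each marker are independent and normally distributed around a shared mean with a marker-specific variance, and maximises that likelihood by alternating closed-form updates. The function name, marker labels and values are hypothetical.

```python
from statistics import mean


def fit_shared_mean(log_aer_by_marker, n_iter=200, tol=1e-12):
    """Maximum-likelihood fit of a shared mean with marker-specific variances.

    log_aer_by_marker: dict mapping marker name to a list of log AER values.
    An individual heterozygous at both markers contributes one value to each list.
    Returns (mu, variances).
    """
    all_values = [v for values in log_aer_by_marker.values() for v in values]
    mu = mean(all_values)
    variances = {}
    for _ in range(n_iter):
        # Given mu, the ML variance of each marker is its mean squared residual
        # (floored to avoid division by zero).
        variances = {
            marker: max(sum((v - mu) ** 2 for v in values) / len(values), 1e-12)
            for marker, values in log_aer_by_marker.items()
        }
        # Given the variances, the ML mean is the inverse-variance weighted mean.
        numerator = sum(sum(values) / variances[m] for m, values in log_aer_by_marker.items())
        denominator = sum(len(values) / variances[m] for m, values in log_aer_by_marker.items())
        new_mu = numerator / denominator
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu, variances


# Hypothetical log AERs at two transcribed markers of one gene.
data = {
    "marker_1": [0.4, 0.6],  # less noisy assay
    "marker_2": [0.3, 0.7],  # noisier assay
}
mu_hat, var_hat = fit_shared_mean(data)
```

The noisier marker receives a larger fitted variance and therefore less weight in the shared estimate, which is the practical effect of allowing different variances at each transcribed marker.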