Participants were from the Avon Longitudinal Study of Parents and Children (ALSPAC), a population-based birth cohort study (Boyd et al., 2013). After exclusion of participants who had withdrawn consent, the sample for this current investigation consisted of 8,935 individuals with genotype data, of whom 4,577 were males and 4,358 females. Of these individuals, 4,070 had measures of MSE at age 15 years, obtained through non-cycloplegic autorefraction (Table 3.1).
Table 3.1: Summary of refractive error in the XWAS analysis sample.

                                  Males            Females          Total
N total (%)                       4,577 (51.2%)    4,358 (48.8%)    8,935
N with MSE at age 15 years (%)    1,924 (47.3%)    2,146 (52.7%)    4,070
Mean MSE (SD) (D)                 -0.370 (1.30)    -0.392 (1.26)    -0.382 (1.28)
Phenotyping, genotyping and imputation were carried out by members of the ALSPAC team, as outlined in Section 2.1.
After imputation, genotypes were available for 1,250,218 variants across chromosome X. All variants were mapped to NCBI build 37 genome coordinates.
3.2.1 Quality Control
To maintain a high level of data quality, quality control was performed on all variants for all genotyped individuals. Individuals missing information for 10% or more of the variants were excluded from the analysis. Variants were excluded if they were not genotyped successfully in at least 95% of individuals, their imputation quality (IMPUTE2 INFO) metric was < 0.5, their MAF was < 0.01 or their HWE p-value in females was < 1 × 10⁻⁶. The comparatively lenient HWE threshold was chosen to take into account the very high number of variants tested; a similar value has been used in other investigations, e.g. Davidson et al. (2014). Furthermore, if variants are in HWE in females and allele frequencies are equal between males and females, then allele frequencies should remain constant through future generations (Zheng et al., 2007). To verify this assumption, the correlation in MAFs between males and females was confirmed to be high, as shown in Figure 3.1 (r = 0.998; 95% CI: 0.998-0.998).
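The exclusion rules above can be sketched as simple predicates. This is an illustrative sketch only, not the pipeline used in the study: the `Variant` record, its field names and the two function names are hypothetical, and only the thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical per-variant summary record (field names are assumptions)."""
    variant_id: str
    call_rate: float      # proportion of individuals genotyped successfully
    info: float           # IMPUTE2 INFO imputation quality metric
    maf: float            # minor allele frequency
    hwe_p_female: float   # HWE p-value calculated in females only


def variant_passes_qc(v: Variant) -> bool:
    """Apply the four variant-level thresholds stated in the text."""
    return (
        v.call_rate >= 0.95
        and v.info >= 0.5
        and v.maf >= 0.01
        and v.hwe_p_female >= 1e-6
    )


def individual_passes_qc(missing_rate: float) -> bool:
    """Individuals missing 10% or more of variants are excluded."""
    return missing_rate < 0.10
```

A variant is retained only if every condition holds, so failing any single threshold is sufficient for exclusion.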
3.2.2 Single Marker Tests
For all variants passing quality control, tests for association were performed separately for males and females using a frequentist linear regression model implemented in SNPTEST v2.5, with expected genotype counts used in the presence of missing genotypes. Genotypes for males were coded as 0, 2 (AA, BB), whereas genotypes for females were coded as 0, 1, 2 (AA, AB, BB). As the ALSPAC cohort of children was age matched by design, and non-ancestry matched individuals were excluded by ALSPAC researchers, additional covariates were not included in analyses.
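As a minimal illustration of the genotype coding described above, the expected allele count (dosage) can be derived from genotype probabilities. The function names and the probability inputs are hypothetical, and whether SNPTEST computes expected counts in exactly this way internally is an assumption.

```python
def female_dosage(p_aa: float, p_ab: float, p_bb: float) -> float:
    """Expected B-allele count for a female, on the 0/1/2 coding."""
    return 1.0 * p_ab + 2.0 * p_bb


def male_dosage(p_a: float, p_b: float) -> float:
    """Expected coded value for a hemizygous male, on the 0/2 coding."""
    return 2.0 * p_b
```

With certain genotypes the functions return the hard-coded values (0, 1 or 2 for females; 0 or 2 for males); with uncertain genotypes they return a probability-weighted value in between.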
The separate association test results for males and females were then combined in a fixed effects, standard error-weighted meta-analysis using METAL (Willer et al., 2010). Manhattan and quantile-quantile (QQ) plots were produced from the summary statistics generated at each stage of analysis.
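A fixed effects, standard error-weighted meta-analysis of two strata reduces to inverse-variance weighting. The sketch below shows that calculation for a single variant; the function name is hypothetical and it is intended as an illustration of the method rather than a reproduction of METAL itself.

```python
import math
from statistics import NormalDist


def ivw_meta(beta_male, se_male, beta_female, se_female):
    """Inverse-variance weighted fixed-effects estimate for one variant.

    Returns (pooled beta, pooled standard error, Z-score, two-sided p-value).
    """
    w_male = 1.0 / se_male ** 2
    w_female = 1.0 / se_female ** 2
    w_total = w_male + w_female

    beta = (w_male * beta_male + w_female * beta_female) / w_total
    se = math.sqrt(1.0 / w_total)
    z = beta / se
    p = 2.0 * NormalDist().cdf(-abs(z))
    return beta, se, z, p
```

The stratum with the smaller standard error receives the larger weight, so the pooled estimate lies closer to the more precisely estimated effect.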
Figure 3.1: Plot of minor allele frequencies (MAFs) between males and females.
Red line = line of unity (Females MAF = Males MAF). N = 4,070 across 267,185 variants.
3.2.3 Sex-Specific Effects
In order to evaluate sex-specific effects, the separate male and female association test results were further analysed using the R package EasyStrata (Winkler et al., 2015). This evaluation took the beta estimates and their standard errors from each SNPTEST analysis to generate a Z-statistic (Equation 3.1). From this test statistic, a p-value was obtained for each locus, based on a standard normal distribution, testing the null hypothesis that the beta estimates were identical in males and females. Lower p-values indicated stronger evidence that the beta estimates differed between the sexes. A false discovery rate of 5% was applied to define whether a sex-specific effect was present after accounting for multiple testing.
\[
Z = \frac{\beta_{\text{male}} - \beta_{\text{female}}}{\sqrt{SE_{\text{male}}^{2} + SE_{\text{female}}^{2}}}
\]
Equation 3.1: Calculating the test statistic determining differences in beta estimates between males and females (Z). β = beta (effect) estimate; SE = standard error from association test summary statistics.
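The test statistic in Equation 3.1 and its two-sided p-value can be sketched as follows. The function name is hypothetical, and the sketch assumes the male and female estimates are independent, as the equation implies; it is not the EasyStrata implementation.

```python
import math
from statistics import NormalDist


def sex_difference_test(beta_male, se_male, beta_female, se_female):
    """Z-statistic and two-sided p-value for a difference in beta estimates."""
    z = (beta_male - beta_female) / math.sqrt(se_male ** 2 + se_female ** 2)
    # Two-sided p-value under a standard normal distribution
    p = 2.0 * NormalDist().cdf(-abs(z))
    return z, p
```

Identical beta estimates give Z = 0 and a p-value of 1, while larger differences relative to the combined standard error give smaller p-values.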
3.2.4 Gene-based and Gene-set Analyses
Gene-based tests were subsequently performed for this data set using VEGAS2 and MAGMA (outlined in Section 2.3.5). In both instances, LD patterns were estimated using European ancestry reference panels, specifically reference files composed of data for the 379 unrelated individuals of European ancestry from Phase 1, Version 3 of the 1000 Genomes Project (The 1000 Genomes Project Consortium et al., 2012).
Potential functional properties of X-chromosome genes associated with myopia in the MAGMA analysis were further investigated using competitive gene-set analysis in MAGMA (as outlined in Section 2.3.5.2). Adjustment for multiple testing was applied using a false discovery rate of 5% for these gene-based and gene-set test results.
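The text does not state which procedure was used to control the false discovery rate. As an assumption for illustration, the sketch below uses the widely used Benjamini-Hochberg step-up procedure; the function name is hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which tests are declared significant.

    Benjamini-Hochberg step-up: find the largest rank k such that
    p_(k) <= (k / m) * fdr, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    largest_passing_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= (rank / m) * fdr:
            largest_passing_rank = rank

    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_passing_rank:
            significant[idx] = True
    return significant
```

Because the procedure is step-up, a p-value that misses its own rank threshold is still declared significant when a larger p-value passes at a higher rank.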
3.2.5 Power Calculation
To determine the statistical power of the current investigation, a power calculation was performed using QUANTO (Gauderman, 2002). The following assumptions were applied: a type I error rate (α) of 5 × 10⁻⁸, an effect size of 0.10 D, a MAF range between 0.01 and 0.50, a sample size of 4,070 individuals, and a normal distribution for MSE with a mean (SD) of -0.382 (1.28) D. The type I error rate was selected to match the p-value required to declare significance for the association test after adjustment for multiple testing and accounting for LD. The effect size estimate was based on the average effect size from a meta-analysis for refractive error undertaken by the CREAM consortium (Verhoeven et al., 2013). Sample size and MSE estimates were obtained from the ALSPAC sample used for this investigation.
An additional power calculation was performed to determine the minimum sample size required to have 60-80% power to detect an effect size of 0.10 D. As above, the population MSE estimate was based on the ALSPAC sample mean (SD) of -0.382 (1.28) D.
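The logic of such a calculation can be sketched with a normal approximation for an additive quantitative trait model. This is not QUANTO's method and is not expected to reproduce its output. The function names are hypothetical, and the variance explained by a variant is assumed to take the autosomal additive form 2p(1 - p)β²/σ², which ignores the 0/2 coding of male genotypes on chromosome X. Default argument values are the assumptions listed in the text.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def variance_explained(maf, beta, trait_sd):
    """Proportion of trait variance explained under an additive model."""
    return 2.0 * maf * (1.0 - maf) * beta ** 2 / trait_sd ** 2


def approx_power(n, maf, beta=0.10, trait_sd=1.28, alpha=5e-8):
    """Approximate two-sided power for a sample of n individuals."""
    r2 = variance_explained(maf, beta, trait_sd)
    # Square root of the non-centrality parameter of the 1-df test
    ncp_root = math.sqrt(n * r2 / (1.0 - r2))
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf(ncp_root - z_crit) + _STD_NORMAL.cdf(-ncp_root - z_crit)


def approx_required_n(target_power, maf, beta=0.10, trait_sd=1.28, alpha=5e-8):
    """Approximate minimum sample size for the target power.

    The negligible lower-tail rejection probability is ignored here.
    """
    r2 = variance_explained(maf, beta, trait_sd)
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_power = _STD_NORMAL.inv_cdf(target_power)
    return math.ceil((z_crit + z_power) ** 2 * (1.0 - r2) / r2)
```

For example, `approx_power(4070, maf)` can be evaluated over the MAF range 0.01 to 0.50, and `approx_required_n(0.6, maf)` and `approx_required_n(0.8, maf)` bracket the 60-80% power range described above.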