Chapter 1. Drunkenness in the pre-Hispanic and colonial periods
2.1 Social composition of the corregimiento of Toluca
2.1.1 Economic activities of the corregimiento of Toluca
[Figure: two panels, labelled "Haplotypes: CD or cd" and "Haplotypes: Cd or cD".]
A θ of 50% indicates unlinked loci, located either far apart on the same chromosome or on separate chromosomes, which assort independently. A θ of less than 50% indicates that the loci are linked, with a small physical distance between them.
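As a rough illustration of how θ is read, the sketch below estimates it as the proportion of recombinant haplotypes among those scored. It assumes that phase is known and that CD and cd are the non-recombinant haplotypes, so that Cd and cD are recombinants. That assignment, the function name and the counts are illustrative assumptions, not data from this study.

```python
from collections import Counter

# Assumption for illustration: the parental phase is CD / cd,
# so Cd and cD can only arise through recombination.
RECOMBINANT = {"Cd", "cD"}
NON_RECOMBINANT = {"CD", "cd"}


def recombination_fraction(haplotypes):
    """Estimate theta as recombinants / total scored haplotypes."""
    counts = Counter(haplotypes)
    unknown = set(counts) - RECOMBINANT - NON_RECOMBINANT
    if unknown:
        raise ValueError(f"unexpected haplotype labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no haplotypes supplied")
    recombinants = sum(counts[h] for h in RECOMBINANT)
    return recombinants / total


# Hypothetical counts: 18 non-recombinant and 2 recombinant haplotypes.
observed = ["CD"] * 10 + ["cd"] * 8 + ["Cd"] * 1 + ["cD"] * 1
theta = recombination_fraction(observed)
print(f"theta = {theta:.2f}")
print("consistent with linkage" if theta < 0.5 else "consistent with no linkage")
```

A point estimate below 0.5 is not by itself evidence of linkage; with few scored haplotypes it can fall below 0.5 by chance, which is why a formal test statistic is reported in practice.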
4.1.2 Linkage models in complex disease
For genetically complex traits, there is rarely clear evidence of a single-locus effect with a specific mode of inheritance, so conventional linkage analysis is unreliable and distinct, model-free strategies are needed (Curtis and Sham 1995). For a dichotomous trait, genetic complexity reflects heterogeneity, the involvement of multiple genetic loci and environmental exposures, and so requires a method in which the precise mechanism of disease can remain unknown. Affected sibling pair methods were developed on this basis, followed by likelihood-based methods in which the likelihood of the observed data at a given θ between two loci is compared with that under no linkage (θ of 50%) and reported as a logarithm (base 10) of odds (LOD) score. The LOD score method was first defined in parametric models, which require allele frequencies, penetrances and inter-marker genetic distances (for multipoint analyses) (Kruglyak, Daly et al. 1996). This model is successful where a clear inheritance pattern is observed, for example autosomal dominant or recessive. A non-parametric linkage model (NPL, or model-free) is used for disorders of unclear inheritance and variable penetrance. It assumes that affected relative pairs share more alleles identical by descent (IBD) at a marker locus linked to the disease locus than expected by chance alone (Risch 1990). The test looks for chromosomal segments shared between affected individuals by distinguishing IBD alleles from identical by state (IBS) alleles. IBS alleles look identical but cannot be shown to derive from a common ancestor, whereas IBD alleles can (Ott and Bhat 1999). Only genotypic information from the affected individuals is used for analysis, on the assumption that the affected phenotype is more likely to be associated with the presence of the disease allele. Unaffected individuals are used to provide genotypic information on any untyped parents. NPL requires the pedigree relationships, allele frequencies and genotypes of all individuals.
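The idea of excess IBD sharing can be made concrete with the classical mean test for affected sib pairs. This is a simplified stand-in and not the NPL statistic computed on general pedigrees in this chapter. The sketch assumes that the IBD status of every pair is known without ambiguity; the function name and the counts are hypothetical.

```python
import math


def mean_ibd_test(n0, n1, n2):
    """Mean test for affected sib pairs.

    n0, n1, n2: numbers of pairs sharing 0, 1 and 2 alleles IBD.
    Returns (proportion of alleles shared IBD, z statistic).
    """
    n_pairs = n0 + n1 + n2
    if n_pairs == 0:
        raise ValueError("at least one sib pair is required")
    # Each pair shares 0, 0.5 or 1 of its two alleles IBD.
    mean_sharing = (0.5 * n1 + 1.0 * n2) / n_pairs
    # Under no linkage, sibs share 0/1/2 alleles IBD with probability
    # 1/4, 1/2, 1/4: expected proportion 0.5, variance per pair 1/8.
    z = (mean_sharing - 0.5) / math.sqrt(0.125 / n_pairs)
    return mean_sharing, z


# Hypothetical counts for 100 affected sib pairs.
sharing, z = mean_ibd_test(n0=15, n1=50, n2=35)
print(f"mean IBD sharing = {sharing:.3f}, z = {z:.2f}")
```

With the null proportions (25, 50 and 25 pairs) the statistic is zero; sharing above 0.5 gives a positive z, which is the direction of interest when testing for linkage.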
The linkage experiment in this chapter differs from other studies of monogenic disease that combine exome or whole-genome sequencing with linkage. For monogenic (Mendelian) diseases, researchers have applied exome sequencing to pedigrees in which positive linkage had been identified in order to find the causal gene (Louis-Dit-Picard, Barc et al. 2012), and this has been successful for many autosomal dominant (Johnson, Mandrioli et al. 2010; Wang, Yang et al. 2010), recessive (Bilguvar, Ozturk et al. 2010) and quantitative traits (Bowden, An et al. 2010). Linkage has also been used in complex traits. Recently, rare missense mutations in CARD14 were implicated in psoriasis using a variety of strategies, some similar to those taken in this study. Linkage was identified in the PSORS2 chromosomal region, which contains CARD14. Target capture and sequencing then identified gain-of-function mutations in CARD14 segregating with disease (Bertin, Wang et al. 2001; Jordan, Cao et al. 2012). For a complete mutation profile of the gene, exon 4 of CARD14 (where most of the clustering rare variants were located) was sequenced in 1,856 cases and 882 controls of European ancestry, and the entire case-control cohort (6,000 cases and 4,000 controls) was genotyped to establish the allele frequencies of all rare CARD14 variants. The study succeeded in finding a potential causal gene through linkage, in establishing that mutations in the gene also occur outside families with a genetic predisposition, and in showing that carrying these rare variants is likely to confer a high risk for the phenotype (Jordan, Cao et al. 2012). It is a successful example for complex disease and mirrors a monogenic disease finding, in that rare variants in a single gene may raise disease risk, although other genes also contribute to psoriasis susceptibility. Furthermore, monogenic forms of complex disease provide further evidence that linkage can identify a single gene responsible for disease susceptibility; in cutaneous lupus erythematosus, for example, an autosomal dominant inheritance pattern was described from a genome-wide linkage search in a large kindred (Lee-Kirsch, Gong et al. 2006).
In this chapter, linkage was performed in large multigenerational coeliac pedigrees using a set of common and rare SNP markers from the Immunochip array. To link this information with the exome dataset of 75 CD subjects, genomic regions under the linkage peaks were then inspected for rare variants in the exome sequencing dataset (data from Chapter 3, phase two), in the hope of finding segregating variants in one or more genes to take forward for targeted resequencing.
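The step of inspecting regions under linkage peaks for rare exome variants can be sketched as an interval filter. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the `Region` and `Variant` types, the 1% frequency cut-off, the requirement that every affected member carries the variant, and all coordinates, gene labels and individual IDs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int  # inclusive, base pairs
    end: int    # inclusive, base pairs


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    gene: str
    maf: float           # population minor allele frequency
    carriers: frozenset  # IDs of individuals carrying the variant


def variants_under_peaks(variants, peaks, affected, max_maf=0.01):
    """Keep rare variants inside a peak that every affected member carries."""
    affected = frozenset(affected)
    kept = []
    for v in variants:
        in_peak = any(
            p.chrom == v.chrom and p.start <= v.pos <= p.end for p in peaks
        )
        # max_maf is an assumed threshold for "rare", chosen for illustration.
        if in_peak and v.maf <= max_maf and affected <= v.carriers:
            kept.append(v)
    return kept


# Hypothetical pedigree, linkage peak and variant calls.
affected = {"II-1", "II-3", "III-2"}
peaks = [Region("chr2", 1_000_000, 5_000_000)]
variants = [
    Variant("chr2", 2_500_000, "GENE_A", 0.002, frozenset(affected)),
    Variant("chr2", 3_000_000, "GENE_B", 0.200, frozenset(affected)),  # too common
    Variant("chr2", 4_000_000, "GENE_C", 0.001, frozenset({"II-1"})),  # not shared
    Variant("chr7", 2_500_000, "GENE_D", 0.001, frozenset(affected)),  # outside peak
]
candidates = variants_under_peaks(variants, peaks, affected)
print([v.gene for v in candidates])
```

Requiring every affected member to carry the variant is a strict choice; in a complex disease with heterogeneity or phenocopies, a looser rule might be preferred.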
4.2 Aims and hypotheses
The specific aims and hypotheses for the three analyses are outlined below:
i) Linkage analysis – the aim is to perform NPL analysis on multiply affected families using 196,524 Immunochip SNP markers, to infer whether a rare coding disease-risk variant under a linkage peak is shared by affected individuals in the same pedigree. The linkage information identifies shared chromosomal regions, which can be linked to the exome sequencing data to search for rare segregating variants that carry disease risk. The hypothesis is that if common SNPs segregate with disease more often than expected by chance, rare functional variants will also lie under that peak, and the rare variant will be IBD in affected individuals.