The allele sizes of all markers for each genotype were used to compute basic statistics using PowerMarker version 3.25 (Liu and Muse, 2005). The summary statistics include the polymorphic information content (PIC), allelic richness as determined by the total number of alleles detected, the number of alleles per locus, the occurrence of unique, rare, common and most frequent alleles, gene diversity and heterozygosity (%).

3.7.4.1 Polymorphic information content (PIC)

The polymorphic information content (PIC) was estimated as below (Botstein et al., 1980).

$$\widehat{PIC}_l = 1 - \sum_{u=1}^{k} \tilde{P}_{lu}^{2} - \sum_{u=1}^{k-1} \sum_{v=u+1}^{k} 2\,\tilde{P}_{lu}^{2}\,\tilde{P}_{lv}^{2}$$

3.7.4.2 Gene diversity

Gene diversity, often referred to as expected heterozygosity, is defined as the probability that two randomly chosen alleles from the population are different. An unbiased estimator of gene diversity at the lth locus is

$$\hat{D}_l = \left(1 - \sum_{u=1}^{k} \tilde{P}_{lu}^{2}\right) \Big/ \left(1 - \frac{1+f}{n}\right)$$

3.7.4.3 Heterozygosity

Heterozygosity is simply the proportion of heterozygous individuals in the population. At a single locus, it was estimated as

$$\hat{H}_l = 1 - \sum_{u=1}^{k} \tilde{P}_{luu}$$

3.7.4.4 Allele and genotype frequencies

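The sample allele frequencies are calculated as $\tilde{P}_u = n_u/(2n)$, with the variance estimated as

$$\mathrm{Var}(\tilde{P}_u) \mathrel{\hat{=}} \frac{1}{2n}\left(\tilde{P}_u + \tilde{P}_{uu} - 2\tilde{P}_u^{2}\right)$$

where $\hat{=}$ means “estimated by”.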

The sample genotype frequencies $\tilde{P}_{uv}$ are calculated as $n_{uv}/n$. Both $\tilde{P}_u$ and $\tilde{P}_{uv}$ are unbiased maximum likelihood estimates of the population frequencies. Confidence intervals for allele and genotype frequencies were formed by resampling individuals from the data set.
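The estimators in sections 3.7.4.1 to 3.7.4.4 can be illustrated with a short Python sketch for a single locus. It is not the PowerMarker implementation; it only mirrors the formulas above. The function name `locus_summary`, the input layout (one allele pair per diploid individual, no missing calls) and the default `f = 0.0` are assumptions made for illustration, with n taken as the number of individuals and f as the inbreeding coefficient.

```python
from collections import Counter


def locus_summary(genotypes, f=0.0):
    """Summary statistics for one locus (illustrative sketch).

    genotypes: list of (allele, allele) tuples, one per diploid individual,
               with no missing calls (an assumption of this sketch).
    f:         inbreeding coefficient used in the gene diversity correction
               (assumed known; 0.0 is only a placeholder default).
    Requires at least two individuals.
    """
    n = len(genotypes)
    allele_counts = Counter(a for pair in genotypes for a in pair)
    homozygotes = Counter(a for a, b in genotypes if a == b)

    # Allele frequencies: P_u = n_u / (2n)
    freq = {a: c / (2 * n) for a, c in allele_counts.items()}
    p = list(freq.values())
    sum_sq = sum(x * x for x in p)

    # PIC = 1 - sum_u P_u^2 - sum_{u<v} 2 P_u^2 P_v^2
    cross = sum(2 * p[u] ** 2 * p[v] ** 2
                for u in range(len(p)) for v in range(u + 1, len(p)))
    pic = 1 - sum_sq - cross

    # Gene diversity = (1 - sum_u P_u^2) / (1 - (1 + f) / n)
    gene_diversity = (1 - sum_sq) / (1 - (1 + f) / n)

    # Heterozygosity = 1 - sum_u P_uu (P_uu = homozygote frequency)
    heterozygosity = 1 - sum(homozygotes.values()) / n

    # Var(P_u) estimated by (P_u + P_uu - 2 P_u^2) / (2n)
    variance = {a: (freq[a] + homozygotes[a] / n - 2 * freq[a] ** 2) / (2 * n)
                for a in freq}

    return {
        "allele_frequencies": freq,
        "allele_frequency_variance": variance,
        "pic": pic,
        "gene_diversity": gene_diversity,
        "heterozygosity": heterozygosity,
    }
```

Calling the function once per SSR locus and averaging `pic`, `gene_diversity` and `heterozygosity` over loci would give panel-level summaries comparable in form to those reported by the software.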

3.7.4.5 Unique, rare and common alleles

Unique alleles are those present in one accession or one group of accessions but absent in the other accessions or groups of accessions. Rare alleles are those with a frequency of ≤ 1 percent in the investigated materials. Common alleles are those with a frequency between 1 and 20 percent, while those with a frequency of >20 percent were classified as most frequent alleles (Upadhyaya et al., 2008).
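The frequency classes above translate directly into a small classification rule. The sketch below is illustrative only: the function names and the input layout are hypothetical, frequencies are expressed as proportions (0.01 = 1 percent), and uniqueness is checked at the level of single accessions rather than groups of accessions.

```python
def classify_by_frequency(freq):
    """Assign an allele to a frequency class from its proportion (0 to 1)."""
    if freq <= 0.01:
        return "rare"            # frequency <= 1 percent
    if freq <= 0.20:
        return "common"          # frequency above 1 and up to 20 percent
    return "most frequent"       # frequency > 20 percent


def unique_alleles(alleles_by_accession):
    """Return {allele: accession} for alleles carried by exactly one accession.

    alleles_by_accession: dict mapping an accession name to the alleles
    observed in it at one locus (hypothetical input layout).
    """
    carriers = {}
    for accession, alleles in alleles_by_accession.items():
        for allele in set(alleles):
            carriers.setdefault(allele, []).append(accession)
    return {allele: names[0]
            for allele, names in carriers.items() if len(names) == 1}
```

Treating exactly 20 percent as "common" follows the ">20 percent" threshold stated for the most frequent class.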

3.7.4.6 Clustering

The unweighted neighbor-joining tree was constructed from the simple matching dissimilarity matrix of 14 SSR markers genotyped in 336 genotypes of the GSP using the DARwin 5.0.156 program (Perrier and Jacquemoud-Collet, 2006).
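As an illustration of the input to the tree construction, the sketch below computes a simple matching dissimilarity between multilocus SSR profiles. It assumes the allelic-data form d_ij = 1 - (1/L) Σ_l (m_l / π), where L is the number of loci, m_l the number of matching alleles at locus l and π the ploidy; whether this matches the exact option used in DARwin for this study is an assumption. Function names and the data layout are hypothetical, and missing data are not handled.

```python
from collections import Counter


def simple_matching_dissimilarity(profile_a, profile_b, ploidy=2):
    """Simple matching dissimilarity between two multilocus profiles.

    Each profile is a list with one tuple of allele sizes per locus,
    in the same locus order (assumed layout, no missing calls).
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    shared = 0.0
    for alleles_a, alleles_b in zip(profile_a, profile_b):
        # Multiset intersection counts the alleles the two individuals share.
        matches = sum((Counter(alleles_a) & Counter(alleles_b)).values())
        shared += matches / ploidy
    return 1 - shared / len(profile_a)


def dissimilarity_matrix(profiles):
    """Pairwise matrix for a dict {genotype name: multilocus profile}."""
    names = list(profiles)
    matrix = [[simple_matching_dissimilarity(profiles[a], profiles[b])
               for b in names] for a in names]
    return names, matrix
```

The resulting symmetric matrix, with zeros on the diagonal, is the kind of input a neighbor-joining routine expects.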


3.7.4.7 Principal coordinate analysis (PCoA)

The PCoA of the GSP genotypes was performed on the genetic distance values of the DARwin distance matrix using GenAlEx 6.41 (Peakall and Smouse, 2006).
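For readers unfamiliar with PCoA, the following sketch shows the classical procedure (Gower double-centring of the squared distances, followed by extraction of the leading eigenvector) in plain Python. It is not necessarily the variant that GenAlEx applies, and the function names are hypothetical. Power iteration is used only to keep the example free of dependencies; it returns the first axis only and assumes the dominant eigenvalue is positive, which holds for Euclidean distances.

```python
import math


def double_center(dist):
    """Gower-centred matrix B built from a symmetric distance matrix."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]
    grand_mean = sum(row_mean) / n
    return [[a[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def leading_axis(b, iterations=200):
    """First principal coordinate of B by power iteration.

    Returns (eigenvalue, coordinates); the sign of the axis is arbitrary.
    """
    n = len(b)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("all distances are zero; no axis to extract")
        v = [x / norm for x in w]
    bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
    scale = math.sqrt(max(eigenvalue, 0.0))
    return eigenvalue, [scale * x for x in v]
```

Each genotype's coordinate on the first axis is the corresponding eigenvector entry scaled by the square root of the eigenvalue; further axes would require deflation or a full eigendecomposition.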

3.7.4.8 Analysis of molecular variance (AMOVA)

Analysis of molecular variance was performed to partition the molecular variance within and among the subspecies of cultivated groundnut and the populations identified by the cluster analysis, based on 999 permutations, using the software GenAlEx 6.41 (Peakall and Smouse, 2006).

3.7.4.9 Population structure analysis

A set of 14 SSR markers linked to rust resistance, LLS resistance and nutritional quality traits was used to assess population structure and admixture for the targeted genomic regions. To infer the population structure of the GSP for these regions precisely, only molecular data were used; pre-existing information on diversity based on botanical classification or geographical origin was not considered in the analysis. The analysis was performed using the software package STRUCTURE 2.3.4, which implements a model-based clustering method for inferring population structure from genotype data consisting of unlinked markers. The program identifies k clusters and then assigns each individual genotype to them. The method was introduced by Pritchard et al. (2000) and extended by Falush et al. (2003, 2007). To determine the most appropriate value of k, the burn-in period was set to 10,000 Markov Chain Monte Carlo (MCMC) replications and data were collected over 100,000 MCMC replications in each run. Ten independent runs were performed with the number of populations (k) set from 2 to 10, using a model allowing for no admixture and correlated allele frequencies. This kind of clustering allocates individual genotypes to k clusters in such a way that Hardy-Weinberg equilibrium and linkage equilibrium hold within clusters but not between clusters. The k value was determined from LnP(D) in the STRUCTURE output and from the ad hoc statistic Δk, which is based on the rate of change in LnP(D) between successive k values (Evanno et al., 2005). The final subpopulations were determined based on the rate of change in LnP(D) between successive k values and the stability of the grouping pattern across five runs.
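The Δk criterion can be sketched as follows. The code assumes the commonly used form in which the absolute second-order difference of the mean LnP(D) across runs is divided by the standard deviation of LnP(D) at that k; the function name `delta_k` and the input layout are hypothetical, and this is not output from or code belonging to STRUCTURE itself.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Delta k for each k that has both neighbours k - 1 and k + 1.

    lnp_by_k: dict mapping k to a list of LnP(D) values, one per
              independent run (at least two runs per k).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            # |L(k+1) - 2 L(k) + L(k-1)| on the run means
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            spread = stdev(lnp_by_k[k])
            if spread > 0:  # delta k is undefined when runs are identical
                result[k] = second_diff / spread
    return result
```

With runs for k = 2 to 10, Δk is defined only for k = 3 to 9, because the statistic needs LnP(D) at both neighbouring values; the k with the largest Δk is the usual candidate.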

3.7.4.10 Marker-trait association

Association of SSR marker data with the traits of interest was tested using the general linear mixed model (GLM) described by Yu et al. (2006), as implemented in TASSEL 2.1. This method simultaneously takes into account both gross-level population structure (Q) and finer-scale phenotypic data. The statistical model can be described in Henderson’s notation (Henderson, 1975) as follows:

$$y = X\beta + Zu + e$$

where

$y$ = the vector of observations

$\beta$ = unknown vector containing fixed effects, including genetic marker and population structure (Q)

$u$ = unknown vector of random additive genetic effects from multiple background QTL for individuals or lines

$X$ and $Z$ = the known design matrices

$e$ = unobserved vector of random residuals.

The population structure analysis was conducted by running STRUCTURE, and the population structure matrix (Q) was constructed at K=3. BLUPs were determined for each accession for all quantitative traits, for individual environments and pooled across environments, and were used as the phenotypic input for the association analysis. For disease severity, the higher score of the two replications was taken as the final score of a genotype at a given location, and the mean across all three locations was used as the pooled disease score. For yield and nutritional quality traits, data from four environments were used for marker-trait association analysis, both for individual environments and pooled across environments. The SSR markers associated with a trait of interest were identified based on the P value of the marker, which indicates whether the marker is associated with the trait. The R² (marker) value indicates the fraction of the total phenotypic variation explained by the marker. Only markers with P ≤ 0.05 were considered significantly associated with the trait of interest.
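To make the roles of the marker test and R² (marker) concrete, the sketch below fits a deliberately simplified, fixed-effects-only version of the model: the trait is regressed on the Q covariates with and without the marker, and the two fits are compared. The random term Zu is omitted, so this is not the TASSEL procedure. All function names are hypothetical, the marker is assumed to be coded as numeric indicator columns, and only K - 1 columns of Q should be supplied because the K membership coefficients sum to one.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def residual_ss(columns, y):
    """Residual sum of squares of the least-squares fit of y on columns."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j]))
            for j in range(k)] for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[i] * columns[i][r] for i in range(k))
              for r in range(len(y))]
    return sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))


def marker_test(y, q_columns, marker_columns):
    """F statistic and R^2 (marker) for adding the marker to a Q-only model.

    Assumes the full model does not fit perfectly (residual SS > 0).
    """
    n = len(y)
    reduced = [[1.0] * n] + q_columns          # intercept + Q
    full = reduced + marker_columns            # intercept + Q + marker
    rss_reduced = residual_ss(reduced, y)
    rss_full = residual_ss(full, y)
    mean_y = sum(y) / n
    total_ss = sum((v - mean_y) ** 2 for v in y)
    df_marker = len(marker_columns)
    df_resid = n - len(full)
    f_stat = ((rss_reduced - rss_full) / df_marker) / (rss_full / df_resid)
    r2_marker = (rss_reduced - rss_full) / total_ss
    return f_stat, r2_marker
```

Under this simplification the P value would come from the F distribution with (`df_marker`, `df_resid`) degrees of freedom, and the marker would be retained if P ≤ 0.05.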


CHAPTER IV

Results

Phenotyping of the Genomic Selection Panel (GSP) was carried out for yield traits, resistance to foliar fungal diseases (LLS and rust) and nutritional quality traits in four environments. Rainy season trials were conducted at Aliyarnagar (Tamil Nadu) and Jalgaon (Maharashtra) under natural disease epiphytotic conditions, whereas the trial at ICRISAT, Patancheru (Telangana) was conducted under artificial disease pressure created through the infector row technique. In addition to the rainy season experiments, another trial was conducted at ICRISAT, Patancheru during the post-rainy season of 2015-16 to evaluate all genotypes under disease-free conditions. The phenotypic data from each individual environment, as well as the pooled data, were used to assess the genetic diversity present in the GSP for the traits evaluated and to identify stable sources of disease resistance, yield and nutritional quality traits through the GGE biplot technique of stability analysis. The genotypes of the GSP were subjected to molecular diversity analysis using data from 14 SSR markers linked to rust, LLS and nutritional quality traits to assess the molecular diversity and allelic richness present in the GSP at the targeted loci. Marker-trait association analysis was carried out to evaluate the association between SSR genomic regions and phenotypes across environments. The experimental results of the present investigation are presented under the following headings.