The aim of this project was to study sequence variation at five genes (Pto, Prf, Fen, Pfi and
Rin4), which are involved in one disease resistance pathway in the wild tomato species
S. peruvianum. For this purpose, individuals from one population of this species (LA2744),
which had been shown to harbour a high proportion of all reported variation in this species (Rose et al. 2005; Rose et al. 2007), were chosen for analysis. All five genes were amplified,
cloned and sequenced from ten individuals in this population. Both alleles were recovered from every individual.
2.3.1 Plant materials and sequencing
Plants of S. peruvianum were grown from seed collected from a single, large population in Tarapaca, Chile by Dr. Charles R. Rick. Seeds were stored at the Tomato Genetics Resource Center (TGRC; http://tgrc.ucdavis.edu) until 1996, at which time 10 seeds from different field-collected plants were grown under standard greenhouse conditions in Davis, CA. DNA was isolated using the CTAB method (Doyle & Doyle 1987) from 2 g of leaf tissue collected from each plant. The DNA was resuspended in 300 to 1000 µl TE, depending on yield. For outgroup comparisons, individuals of the following species were used: S. hirsutum from
Ancash, Peru (LA1775), S. pennellii from Arequipa, Peru (LA3791) and S. lycopersicoides
from Tarapaca, Chile (LA2951). Plant growth conditions and DNA extraction for these outgroups were identical to those used for S. peruvianum.
PCR amplification, cloning and sequencing strategies differed slightly for each gene. However, the entire coding region of each gene was amplified using a proofreading polymerase, either Pfu polymerase (Stratagene, La Jolla, CA) or Phusion (Finnzymes, Espoo, Finland). PCR fragments were cloned into pCR-Blunt or Zero Blunt TOPO (Invitrogen, Carlsbad, CA). Direct sequencing of PCR products and sequencing of miniprepped plasmid DNA from clones (Big Dye Terminator v 1.1, Applied Biosystems) were conducted in parallel for each gene. Multiple clones per gene per individual were sequenced, and ambiguous positions were compared to the direct sequences from the original PCR products. When necessary, independent rounds of PCR, cloning and sequencing were conducted to resolve ambiguities. The specific amplification and cloning strategies for each gene are described below.
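The comparison of cloned alleles against the direct sequence of the PCR product can be illustrated with a minimal sketch. It assumes, purely for illustration, that heterozygous positions in the direct sequence were recorded as two-base IUPAC ambiguity codes and that all three sequences are already aligned; the function name `inconsistent_sites` is hypothetical and does not come from the original workflow.

```python
# IUPAC nucleotide codes mapped to the set of bases each one represents.
# Only unambiguous and two-base codes are listed (an assumption of this sketch).
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"}, "W": {"A", "T"},
    "K": {"G", "T"}, "M": {"A", "C"},
}

def inconsistent_sites(direct, allele1, allele2):
    """Return 0-based positions where the two cloned alleles do not
    reproduce the base call of the direct sequence.

    Positions with any other symbol in the direct sequence (N, gaps)
    are also reported, so that they can be inspected by hand.
    """
    if not (len(direct) == len(allele1) == len(allele2)):
        raise ValueError("sequences must be aligned to equal length")
    flagged = []
    for pos, (d, a, b) in enumerate(
        zip(direct.upper(), allele1.upper(), allele2.upper())
    ):
        # A heterozygous call such as R (A/G) must be explained by one
        # A allele and one G allele; a homozygous call by two equal bases.
        if IUPAC.get(d) != {a, b}:
            flagged.append(pos)
    return flagged
```

Positions returned by such a check would correspond to the ambiguities that required an independent round of PCR, cloning and sequencing.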
Sequencing of Pto
The primers SSP17 and JCP32 were initially used to amplify alleles of Pto. These primers
also amplify to a lesser degree two paralogs of Pto, namely Pth3 and Pth5. Plasmids
containing Pto were discriminated from the other paralogs by restriction digest. The
restriction enzyme BstXI specifically digests alleles of Pth3 and Pth5, but not Pto. To
avoid co-amplification of these paralogs and to facilitate direct sequencing of Pto
for confirmation of homozygosity or heterozygosity, two Pto-specific primers in the
upstream region of Pto were developed. These primers, FromPth5A and FromPth5B, were
used in combination with the JCP32 primer, which anneals at the 3' end of Pto.
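The restriction-based discrimination between Pto and its paralogs was done at the bench, but the same logic can be sketched in silico for a sequenced insert. The sketch below scans for the BstXI recognition sequence (CCANNNNNNTGG); the helper `looks_like_pto` is a hypothetical name, and the check applies to the insert sequence only, since the cloning vector may carry BstXI sites of its own.

```python
import re

# BstXI recognition sequence: CCA, six unspecified bases, TGG.
# The site is its own reverse complement, so one strand is enough.
# The lookahead lets overlapping sites be reported as well.
BSTXI_SITE = re.compile(r"(?=CCA[ACGT]{6}TGG)")

def bstxi_sites(seq):
    """Return 0-based start positions of BstXI sites in a DNA sequence."""
    return [m.start() for m in BSTXI_SITE.finditer(seq.upper())]

def looks_like_pto(insert_seq):
    """Hypothetical classifier: an insert without a BstXI site is treated
    as Pto, an insert with a site as a paralog (Pth3 or Pth5)."""
    return not bstxi_sites(insert_seq)
```

For example, `bstxi_sites("GG" + "CCA" + "AAAAAA" + "TGG" + "TT")` reports a single site starting at position 2.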
Sequencing of Prf
Prf is a large gene (5587 bp from start to stop codon), so it was divided into two overlapping
halves for PCR and these were sequenced separately. The first half of Prf is known to be
recalcitrant to cloning, so a direct sequencing strategy, combined with allele-specific primers to resolve phase, was used for this half. Both direct sequencing of PCR products and cloning were employed to generate the data for the second half of the gene (approximately 58% of the gene). A large number of primers (>90) were designed for sequencing and allele-specific amplification.
Sequencing of Pfi
Pfi is also a large gene (5428 bp from start to stop codon), so a sequencing strategy similar to that used for Prf was applied to Pfi. The gene was divided into two to three overlapping fragments
for PCR and these were sequenced and cloned separately. Primers were designed based upon the GenBank mRNA sequence AY662518 from S. lycopersicum cv. Rio Grande 76R and can
be found in Table A2 in the appendix.
Sequencing of Fen
The primers SSP17 and SSP19 were used initially to amplify alleles of Fen. Cloning of these
PCR products revealed that these primers did not specifically amplify alleles of Fen.
Ultimately two additional Fen-specific primers were designed, one upstream of Fen and one
downstream of Fen, based upon the GenBank sequence AF220602 of this region from the Rio
Grande 76R haplotype. These two intergenic primers, FenFor and FenRev, were used in combination with each other or with SSP19 or SSP17, respectively.
Sequencing of Rin4
Rin4 was originally described and cloned from A. thaliana (Mackey et al. 2002). To identify
the putative tomato Rin4 homolog, BLAST was used to search the tomato BAC database on
the SOL Genomics Network website (http://www.sgn.cornell.edu). The gene prediction program GeneMark (http://exon.gatech.edu/GeneMark) was used to predict the open reading frame of the putative tomato Rin4 gene. Primers were designed based upon the tomato
genomic sequence and the incorporated gene prediction information. Two primers (Rin4For3 and Rin4Rev5) were used to amplify the entire coding sequence of Rin4. A combination of
internal primers was used for sequencing of miniprepped clones.
Reference loci
For comparisons between loci, the sequences of alleles of 14 other loci (CT066, CT093, CT099, CT114, CT143, CT148, CT166, CT179, CT189, CT198, CT208, CT251, CT268 and sucr) were obtained from Baudry et al. (2001) and Roselius et al. (2005) (Table 2.1). These
reference genes were amplified from five of the same individuals of this population of S. peruvianum. These genes are single-copy cDNA markers previously developed and mapped
by Tanksley et al. (1992).
Table 2.1. Reference loci used in this study and their predicted gene products.
Locus   Chromosome   Length [bp]   Putative encoded protein
CT066   10           1346          Arginine decarboxylase
CT093   5            1415          S-adenosylmethionine decarboxylase proenzyme
CT099   12           1354          Copper binding protein
CT114   7            1169          Phospho-glycerate kinase
CT143   9            1821          Sterol C-14 reductase
CT148   8            1497          Copper/zinc superoxide dismutase
CT166   2            2673          Ferredoxin-NADP reductase
CT179   3            995           Tonoplast intrinsic protein ∆-type
CT189   12           1463          40S ribosomal protein S19
CT198   9            779           Submergence induced protein 2-like
CT208   9            1767          Alcohol dehydrogenase, class III
CT251   2            1779          At5g37260-like gene (transcription factor involved in circadian regulation)
CT268   1            1887          Receptor-like protein kinase
2.3.2 Population genetic analysis of the pathway
Standard summary statistics, including π, Tajima’s D, Fu and Li’s D and Fu’s F, were calculated using DnaSP v. 5.10 (Librado & Rozas 2009). Coalescent simulations were used to examine whether the pattern of substitutions at synonymous and nonsynonymous sites at the resistance genes differed from that at the 14 other genes from these same individuals. For synonymous sites, the arithmetic mean of π of the 14 non-R genes was used as the estimate of theta for the simulations. A total of 1,000 simulations were executed in DnaSP, and it was then determined whether the value of π observed at the resistance genes fell within the 95% confidence interval of the simulations based on theta estimated from the 14 non-R genes. For these simulations, no recombination was assumed, which is the most conservative assumption. The same approach was also used to test whether π at nonsynonymous sites (πa) differed between the resistance genes and the arithmetic mean across these 14 non-R genes.
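The logic of the simulation-based comparison can be sketched as follows. This is not DnaSP's implementation; it is a minimal illustration assuming a standard neutral coalescent with constant population size, an infinite-sites mutation model and no recombination. All function names and the parameter values in the usage note are hypothetical.

```python
import random
from itertools import combinations

def pi_per_site(seqs):
    """Nucleotide diversity: mean pairwise differences per site."""
    n, length = len(seqs), len(seqs[0])
    diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    return diffs / (n * (n - 1) / 2) / length

def poisson(lam, rng):
    """Poisson draw: count unit-rate exponential arrivals before lam."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

def simulate_pi(n, theta_site, length, rng):
    """One coalescent replicate without recombination; returns pi per site."""
    theta = theta_site * length            # locus-wide theta (4*N*mu*L)
    desc = [1] * n                         # samples descending from each lineage
    pair_diffs = 0
    while len(desc) > 1:
        k = len(desc)
        t = rng.expovariate(k * (k - 1))   # waiting time in units of 4N generations
        for d in desc:
            # each mutation on this branch separates d samples from n - d
            pair_diffs += poisson(theta * t, rng) * d * (n - d)
        i, j = rng.sample(range(k), 2)     # two lineages coalesce
        merged = desc[i] + desc[j]
        desc = [d for idx, d in enumerate(desc) if idx not in (i, j)] + [merged]
    return pair_diffs / (n * (n - 1) / 2) / length

def pi_interval(n, theta_site, length, reps=1000, seed=1):
    """Central 95% interval of pi across simulated replicates."""
    rng = random.Random(seed)
    sims = sorted(simulate_pi(n, theta_site, length, rng) for _ in range(reps))
    return sims[int(0.025 * reps)], sims[int(0.975 * reps) - 1]
```

With ten individuals and both alleles recovered, `n` would be 20; a call such as `pi_interval(n=20, theta_site=0.01, length=500)` uses placeholder values for theta and for the number of sites. An observed π outside the returned interval would count as a departure from the reference loci under these assumptions.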
McDonald-Kreitman tests and the sliding window analyses were also conducted using DnaSP. Linkage disequilibrium was evaluated using TASSEL v. 2.1 (http://www.maizegenetics.net/).
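The McDonald-Kreitman test compares counts of nonsynonymous and synonymous changes that are fixed between species with those that are polymorphic within a species, arranged as a 2x2 table. One common way to assess significance is Fisher's exact test, sketched below in plain Python. The function name is hypothetical, the sketch is independent of DnaSP, and the counts in the usage note are invented for illustration only.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    For a McDonald-Kreitman table, one layout is:
        a = fixed nonsynonymous,  b = polymorphic nonsynonymous,
        c = fixed synonymous,     d = polymorphic synonymous.
    """
    row1, col1, total = a + b, a + c, a + b + c + d

    def prob(x):
        # hypergeometric probability of x in the top-left cell,
        # given fixed row and column totals
        return comb(col1, x) * comb(total - col1, row1 - x) / comb(total, row1)

    observed = prob(a)
    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    # sum the probabilities of all tables at least as extreme as the observed one
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
```

A call such as `fisher_exact_two_sided(8, 3, 10, 25)` (hypothetical counts) returns the probability of a table at least as unbalanced as the observed one when the ratio of nonsynonymous to synonymous changes is the same for fixed differences and polymorphisms.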
2.3.3 Phylogenetic inference
To evaluate the genes’ phylogenetic relationships within the population sample and between the sample and other Solanum species, phylogenies were reconstructed for each gene. This analysis included sequence data from the S. peruvianum population as well as sequence data
from other Solanum species: S. chmielewskii LA3653, S. lycopersicum LA3343,
S. lycopersicum LA1221, S. hirsutum LA1777, S. pennellii LA0716 and S. pimpinellifolium
LA0400. Phylogenetic analyses were completed using PAUP v. 4.0b10 (Swofford 1999). The phylogenetic relationships between these sequences were determined using maximum parsimony (MP) and neighbor-joining (NJ); the two methods yielded similar topologies.
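The maximum parsimony criterion mentioned above can be illustrated with Fitch's algorithm, which counts the minimum number of character changes a given tree requires. The sketch below only scores a tree that is supplied by hand; it does not search tree space as PAUP does, and the tree encoding (nested 2-tuples of leaf names) and function names are assumptions made for this example.

```python
def fitch_score(tree, states):
    """Minimum number of changes for one site on a rooted binary tree.

    tree   -- nested 2-tuples with leaf names as strings
    states -- dict mapping each leaf name to its character state
    """
    def visit(node):
        if isinstance(node, str):          # leaf: its state, no changes yet
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        shared = left_set & right_set
        if shared:                         # children agree: no extra change
            return shared, left_cost + right_cost
        # children disagree: take the union and count one change
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]

def tree_length(tree, alignment):
    """Sum of Fitch scores over all columns of an alignment {name: sequence}."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_score(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )
```

Under this criterion, the tree with the smallest `tree_length` for a given alignment is preferred; comparing two candidate topologies for the same alignment shows which one requires fewer changes.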