Mutation hotspots can be defined as regions of the genome where mutations occur frequently (273). Hotspots are thought to result from some inherent signature or instability in the DNA sequence itself that renders the region more susceptible to mutation. For instance, DNA sequences such as repetitive sequences may contain motifs that are sites for homologous recombination (274, 275) and can be putative mutation hotspots. Pathogenicity islands and prophages can integrate into chromosomes via homologous recombination in tRNA genes and are also frequently found to be mutation hotspots (276). DNA structure may also play a role, as chromosome folding may leave some regions more exposed and thus more prone to mutation (274).
In this study, across all 48 MA lineages, 34 genes were mutated more than once. In most of these genes, parallel mutations (identical mutations that have arisen independently) did not occur. However, in some instances identical mutations were found in independent lineages; these regions could be mutation hotspots. Putative hotspots were identified for three BPS, two indel and 12 GCR events (Table 4.3).
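The grouping step described above can be illustrated with a minimal sketch. It is not the procedure used in this study; it simply assumes that each mutation is stored as a (type, locus, position, lineage) record and that the lineage prefixes AE and AN denote aerobic and anaerobic lineages, as the lineage names in Table 4.3 suggest. The function names and the single-lineage placeholder record are hypothetical.

```python
from collections import defaultdict

# A small subset of Table 4.3, plus one made-up single-lineage record
# ("hypothetical_gene") to show that loci hit only once are filtered out.
records = [
    ("BPS", "citF", 622489, "AN-144-16"),
    ("BPS", "citF", 622489, "AN-144-20"),
    ("Indel", "umuD & ycgN", 1504657, "AE-180-38"),
    ("Indel", "umuD & ycgN", 1504657, "AN-144-50"),
    ("GCR", "menC", 2295162, "AE-180-08"),
    ("GCR", "menC", 2295162, "AE-180-40"),
    ("BPS", "hypothetical_gene", 100, "AE-180-02"),
]


def condition(lineage):
    """Assumed naming scheme: AE-* lineages are aerobic, AN-* are anaerobic."""
    return "aerobic" if lineage.startswith("AE") else "anaerobic"


def putative_hotspots(records, min_lineages=2):
    """Map (type, locus) to the growth condition(s) of the lineages hit,
    keeping only loci mutated in at least min_lineages distinct lineages."""
    lineages_by_locus = defaultdict(set)
    for mut_type, locus, _position, lineage in records:
        lineages_by_locus[(mut_type, locus)].add(lineage)

    hotspots = {}
    for key, lineages in lineages_by_locus.items():
        if len(lineages) >= min_lineages:
            conditions = {condition(name) for name in lineages}
            hotspots[key] = "both" if len(conditions) == 2 else conditions.pop()
    return hotspots


for (mut_type, locus), where in putative_hotspots(records).items():
    print(mut_type, locus, where)
```

Grouping by locus rather than by exact position is a deliberate choice here, because several of the GCR events in Table 4.3 fall a few base pairs apart within the same region.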
For putative BPS hotspots, two were found solely under anaerobic growth conditions, while one was identified under both aerobic and anaerobic growth conditions. Potential BPS hotspots were found in the intergenic region between tdcA and tdcR, the intergenic region between yieP and rrsC, and in citF (Table 4.3). The citF mutation was non-synonymous and found only in anaerobic lineages, suggesting that there may be some selective pressures acting on the lineages in this study. Potential indel hotspots, most likely DNA regions prone to slippage during DNA replication, were found under both aerobic and anaerobic growth conditions. They included a run of eight adenine nucleotides in the intergenic region between the umuD and ycgN genes and a pair of guanine nucleotides in the intergenic region between the trkD and insJ-5 genes (Table 4.3).
For putative GCR hotspots, two were found solely under aerobic growth conditions, five were found only under anaerobic growth conditions and five were found under both aerobic and anaerobic growth conditions. Three of the possible hotspots were specifically for deletions (Table 4.3). Regions between the ECB_01527 and ECB_01510 genes were deleted in four separate lineages, while a 27-gene deletion between the yegR and yegQ genes occurred twice in anaerobic lineages. In two separate aerobic lineages, regions between the ybcQ and ompT genes were deleted. All of the
deleted regions involved either putative pathogenicity island genes or prophage genes, consistent with commonly encountered hotspots. The other nine potential GCR hotspots specifically involved IS150 transposition (Table 4.3). The intergenic region between trg and mokB was the most frequent site, with IS150 insertions observed in 10 lineages. This region appears to have an affinity for IS150 insertion as
this mutation has been observed in other studies in E. coli as well (35, 188, 205). For the remaining possible hotspots, there were three parallel instances of an IS150 insertion in the intergenic region between mokC and nhaA, while ynjI, pflB, menC, yfcC, rhaS, cycA and the intergenic region between nupC and yfeA had two instances of IS150 transposition each (Table 4.3).
It is possible that the multiple occurrences of certain mutations do not reflect hotspots but rather indicate selective pressures acting on the lineages in this study. However, based on the calculated ratio of synonymous to non-synonymous BPSs, selection in the MA study appears to have been minimal (section 4.2.1.1.3). Nonetheless, whether a particular mutation is subject to selection could be determined by recreating the mutation in the ancestral strain and using competitive fitness assays (section 2.2.13) to test whether the mutation contributes to fitness. Additionally, in silico prediction techniques could be used to search for motifs that are known to be hotspots for mutations, such as repeat regions or genomic islands (274). On the whole, these putative hotspots do not appear to be very strong, as they are found in only a small number of lineages and are not localised to specific regions of the genome.
Table 4.3. Mutations identified in multiple MA lineages.

Mutation Type  REL4536 Reference Genes†  Reference Position 1*  Mutation              Lineage
BPS            citF                      622,489                A → G                 AN-144-16
BPS            citF                      622,489                A → G                 AN-144-20
BPS            tdcA & tdcR               3,172,999              C → A                 AN-144-28
BPS            tdcA & tdcR               3,172,999              C → A                 AN-144-38
BPS            trkD & insJ-5             3,869,337              A → G                 AE-180-26
BPS            trkD & insJ-5             3,869,337              A → G                 AN-144-24
Indel          umuD & ycgN               1,504,657              A(8) → A(9)           AE-180-38
Indel          umuD & ycgN               1,504,657              A(8) → A(9)           AN-144-50
Indel          trkD & insJ-5             3,866,357              G(2) → G(3)           AE-180-22
Indel          trkD & insJ-5             3,866,357              G(2) → G(1)           AE-180-26
Indel          trkD & insJ-5             3,866,357              G(2) → G(3)           AN-144-46
Indel          trkD & insJ-5             3,866,358              G(2) → G(1)           AE-180-12
Indel          trkD & insJ-5             3,866,358              G(2) → G(1)           AE-180-44
Indel          trkD & insJ-5             3,866,358              G(2) → G(1)           AN-144-16
GCR            mokC & nhaA               16,972                 IS150 insertion       AN-144-04
GCR            mokC & nhaA               16,972                 IS150 insertion       AN-144-40
GCR            mokC & nhaA               16,992                 IS150 insertion       AE-180-04
GCR            ybcQ & ompT               547,700                Deletion of 5 genes   AE-180-34
GCR            ybcQ & ompT               547,702                Deletion of 5 genes   AE-180-22
GCR            yegR & yegQ               632,692                Deletion of 27 genes  AN-144-46
GCR            yegR & yegQ               632,699                Deletion of 27 genes  AN-144-50
GCR            ynjI                      910,345                IS150 insertion       AN-144-24
GCR            ynjI                      910,345                IS150 insertion       AN-144-32
GCR            ECB_01527 & ECB_01510     1,117,789              Deletion of 13 genes  AE-180-04
GCR            ECB_01527 & ECB_01510     1,117,789              Deletion of 13 genes  AE-180-14
GCR            ECB_01527 & ECB_01510     1,117,802              Deletion of 13 genes  AN-144-50
GCR            ECB_01527 & ECB_01510     1,117,803              Deletion of 10 genes  AE-180-30
GCR            trg & mokB                1,272,399              IS150 insertion       AN-144-50
GCR            trg & mokB                1,272,453              IS150 insertion       AE-180-24
GCR            trg & mokB                1,272,453              IS150 insertion       AN-144-12
GCR            trg & mokB                1,272,453              IS150 insertion       AN-144-34
GCR            trg & mokB                1,272,455              IS150 insertion       AN-144-46
GCR            trg & mokB                1,272,467              IS150 insertion       AN-144-44
GCR            trg & mokB                1,272,468              IS150 insertion       AE-180-30
GCR            trg & mokB                1,272,470              IS150 insertion       AN-144-20
GCR            trg & mokB                1,272,470              IS150 insertion       AN-144-30
GCR            trg & mokB                1,272,470              IS150 insertion       AN-144-40
GCR            pflB                      1,764,888              IS150 deletion        AN-144-38
GCR            pflB                      1,764,888              IS150 deletion        AN-144-48
GCR            menC                      2,295,162              IS150 insertion       AE-180-08
GCR            menC                      2,295,162              IS150 insertion       AE-180-40
GCR            yfcC                      2,334,210              IS150 insertion       AN-144-04
GCR            yfcC                      2,334,210              IS150 insertion       AN-144-08
GCR            nupC & yfeA               2,421,315              IS150 insertion       AN-144-08
GCR            nupC & yfeA               2,421,323              IS150 insertion       AE-180-06
GCR            rhaS                      4,043,793              IS150 insertion       AN-144-14
GCR            rhaS                      4,043,794              IS150 insertion       AN-144-24
GCR            cycA                      4,381,583              IS150 insertion       AE-180-06
GCR            cycA                      4,381,587              IS150 insertion       AN-144-40