As functional metagenomics relies on a change in host phenotype to identify genes of interest, the cloned genes must be expressed in the host. Successful expression requires host transcription and translation factors to recognise signals in the cloned DNA. Additionally, the host must have compatible codon usage, produce the correct cofactors, provide conditions for proper folding, secrete the protein if necessary and not be inhibited by the gene products. Thus, it is not surprising that functional metagenomic surveys frequently have low hit rates, with fewer than 2 hits per 10,000 screened clones (Vester et al., 2015b).
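To illustrate what such hit rates imply for screening effort, the following minimal Python sketch treats each clone as an independent trial with the same probability of being a hit. This is a simplifying assumption and is not taken from the cited study. The function names and the 95 % target are illustrative choices, and the rate of 2 per 10,000 is used only as the upper bound quoted above.

```python
import math


def expected_hits(n_clones: int, hit_rate: float) -> float:
    """Mean number of positive clones if each clone is a hit with probability hit_rate."""
    return n_clones * hit_rate


def prob_at_least_one_hit(n_clones: int, hit_rate: float) -> float:
    """Probability of recovering one or more hits from n_clones independent clones."""
    return 1.0 - (1.0 - hit_rate) ** n_clones


def clones_needed(hit_rate: float, target_prob: float = 0.95) -> int:
    """Smallest library size giving at least target_prob of one or more hits."""
    return math.ceil(math.log(1.0 - target_prob) / math.log(1.0 - hit_rate))


if __name__ == "__main__":
    rate = 2 / 10_000  # illustrative upper bound on the hit rate (assumption)
    print(expected_hits(10_000, rate))
    print(round(prob_at_least_one_hit(10_000, rate), 3))
    print(clones_needed(rate))
```

Under these assumptions, a library of roughly 15,000 clones would be needed for a 95 % chance of recovering at least one hit, and real hit rates below the upper bound would require proportionally more.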
1.5.2.2.1.1 Expression Vectors
When creating a metagenomic library, the desired insert size is an important consideration when choosing a vector. Plasmids are used to maintain small inserts (≤15 Kb), while larger inserts can be obtained using fosmids and cosmids (≤40 Kb) and bacterial artificial chromosomes (BACs; up to 200 Kb) (Ekkers et al., 2012; Leis et al., 2013). The choice of insert size strongly influences the outcome of a functional metagenomic screen.
Small inserts contain only a few genes, which can be overexpressed from inducible promoters on multicopy plasmids, increasing the chance of their identification. However, these small inserts will not contain full biosynthetic gene clusters and will also lack the genetic context that helps to identify the likely source of the cloned DNA (Taupp et al., 2011). Additionally, more clones must be screened to achieve coverage comparable to that of larger insert libraries.
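The relationship between insert size and the number of clones required can be sketched with the widely used Clarke-Carbon estimate, N = ln(1 − P) / ln(1 − f), where P is the desired probability that a given sequence is represented and f is the ratio of insert size to genome size. The Python sketch below applies it to a single hypothetical 4,000 Kb genome. The genome size, the 99 % target and the function name are assumptions for illustration only, and a real metagenome contains many genomes of uneven abundance, so the true numbers would be far larger.

```python
import math


def clones_for_coverage(insert_kb: float, genome_kb: float, prob: float = 0.99) -> int:
    """Clarke-Carbon estimate of the clones needed so that any given
    sequence is present in the library with probability `prob`."""
    fraction = insert_kb / genome_kb  # share of the genome carried by one clone
    return math.ceil(math.log(1.0 - prob) / math.log(1.0 - fraction))


if __name__ == "__main__":
    genome_kb = 4_000  # hypothetical single genome (assumption)
    # Insert sizes are illustrative values within the vector limits quoted in the text.
    for vector, insert_kb in [("plasmid", 5), ("fosmid/cosmid", 40), ("BAC", 200)]:
        print(vector, insert_kb, clones_for_coverage(insert_kb, genome_kb))
```

For this hypothetical genome, a 5 Kb plasmid library needs thousands of clones where a 40 Kb fosmid library needs hundreds, which is the trade-off described above.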
Larger inserts can contain full biosynthetic gene clusters, and their greater genetic context may provide more confidence in the source of the cloned DNA and in whether genes of interest are located on MGEs. However, the genes present on large inserts likely rely on their native promoter sequences for transcription, as the promoters on the vectors will not be able to drive expression of all genes on the insert owing to differences in gene orientation and the presence of premature transcription termination signals (Liebl et al., 2014).
Vectors and cassettes utilising T7 RNA polymerase (RNAP)-promoter systems, anti-termination proteins and convergent promoters have been developed to increase the expression of genes in large inserts (Liebl et al., 2014; Terrón-González et al., 2013). Additionally, shuttle vectors that replicate in more than one host have been developed and used to increase the hit rate of functional metagenomic screens (Liebl et al., 2014).
Table 1.9 is a non-exhaustive list of vectors that have been used in metagenomic studies or show promise as tools for functional metagenomic studies.
Table 1.9 Vectors and Cassettes as Tools for Functional Metagenomic Studies.
| Vector/Cassette | Hosts | Traits | Reference/Company |
|---|---|---|---|
| pCC1BAC™ | E. coli | Single copy, inducible to multicopy, ≤200 Kb, T7 promoter*. | Epicentre® |
| pCC1FOS™ | E. coli | Single copy, inducible to multicopy, ≤40 Kb, transfection, T7 promoter*. | Epicentre® |
| pWEB™ Cosmid | E. coli | Single copy, inducible to multicopy, ≤40 Kb, transfection, T7 promoters*. | Epicentre® |
| pMPO579 | E. coli | pCC1FOS™ based shuttle vector, T7 promoter, oriT for conjugative transfer, nutL site for phage transcription anti-termination protein N assembly. Promoterless gfp for substrate induced gene expression (SIGEX). | (Terrón-González et al., 2013) |
| pJOE930 | E. coli | Convergent inducible lac promoters flanking cloning site. | (Lämmle et al., 2007) |
| pHT01 | E. coli, B. subtilis | ColE1 replicon, inducible groE promoter. | MoBiTec GmbH |
| pRS44 | E. coli, Pseudomonas fluorescens, Xanthomonas campestris | pCC1FOS™ based shuttle vector, oriT for conjugative transfer, parDE stabilising elements. | (Aakvik et al., 2009) |
| pCT3FK | E. coli, Thermus thermophilus | pCC1FOS™ based shuttle vector, contains pyrF and hyp genes of T. thermophilus for chromosomal integration. | (Leis et al., 2015) |
| pJWC1 | Agrobacterium tumefaciens, Burkholderia graminis, Caulobacter vibrioides, E. coli, Pseudomonas putida, Ralstonia metallidurans | RK2 replicon, pTR101 derived cosmid. | (Craig et al., 2010) |
| Transfer and Expression of biosynthetic pathways (TREX) | E. coli, P. putida, Rhodobacter capsulatus | Two cassette system that labels and mediates conjugative transfer of gene clusters to non-E. coli hosts. Convergent T7 promoters. | (Loeschcke et al., 2013) |
1.5.2.2.1.2 Metagenomic Library Hosts
E. coli is the most utilised host for creating metagenomic libraries as it is well characterised and amenable to genetic manipulation. However, using a single host will not allow for the expression of every cloned gene. Indeed, Gabor et al. (2004) estimated that, of the genes from 32 different genera, E. coli would be able to express only 40 %. This figure is likely to be an overestimate, as the authors only looked at expression signals and did not account for differences in cofactors, chaperones and similar host factors.
The expression of alternative sigma factors and chaperones in E. coli has been demonstrated to increase the expression of cloned genes. For example, E. coli EPI300 T1R harbouring additional sigma factors from Clostridium and Streptomyces spp. gave a 20 – 30 % higher hit rate in functional metagenomic screens for hydrolytic enzymes (Liebl et al., 2014). In addition to altered E. coli hosts, alternative hosts may also be used to create metagenomic libraries (Table 1.10). In fact, alternative hosts may be required when screening for certain functions, such as cold-active enzymes, as E. coli does not grow well at low temperatures (Strocchi et al., 2006).
Table 1.10 Alternative Hosts Used in Functional Metagenomic Surveys.
| Host | Metagenome | Novel Genes/Phenotypes | References |
|---|---|---|---|
| T. thermophilus | Hot spring sediment and water | Esterases | (Leis et al., 2015) |
| B. subtilis | Deciduous forest soil | Antimicrobial production | (Biver et al., 2013) |
| P. putida | Wheat soil | Polyhydroxyalkanoate synthases | (Cheng and Charles, 2016) |
| S. lividans | Alaskan soil | Sigma factors and haemolysins | (McMahon et al., 2012) |
| A. tumefaciens | Pennsylvanian soil | Pigmentation | (Craig et al., 2010) |
| Rhizobium leguminosarum | Anaerobic sludge | Alcohol/aldehyde dehydrogenase | (Wexler et al., 2005) |
| B. graminis | Pennsylvanian soil | Antimicrobial production | (Craig et al., 2010) |
| R. metallidurans | Pennsylvanian soil | Pigmentation, antimicrobial production | (Craig et al., 2010) |
1.5.2.2.1.3 Functional Screen
Following the creation of a metagenomic library, appropriate functional screens to identify clones of interest must be designed. Functional screens can be grouped into three strategies: phenotypic insertion detection, modulated detection and reporter-based screens.
Phenotypic insertion detection involves the observation of a phenotype of interest. These phenotypes may be direct, such as altered colony morphology or pigmentation, or indirect, arising from the interaction of a clone’s gene product(s) with a substrate or indicator organism. For example, protease activity can be detected by culturing metagenomic libraries on agar containing skimmed milk, while antimicrobial-producing clones can be identified by halos in overlays of a sensitive indicator organism (Arivaradarajan et al., 2015; Waschkowitz et al., 2009). Such screens are typically low-tech and not very sensitive, as poorly expressed genes may be missed. However, coupling such screens with microfluidic approaches can increase throughput and sensitivity. Scanlon et al. (2014) developed an antimicrobial screen in which a metagenomic clone and an indicator organism are immobilised together in a gel droplet. Staining the droplets with a fluorescent dye to identify lysed cells allowed > 5 million clones to be screened in one day.
Modulated detection identifies genes of interest by their ability, when expressed, to allow the host cell to survive under certain conditions. For example, a cold-sensitive E. coli mutant unable to grow at temperatures below 20°C has been used to identify DNA polymerases from a glacial ice metagenome that allowed the mutant to survive low temperatures (Simon et al., 2009). Similar functional screens have been designed to identify genes that allow their heterologous host to survive on exogenous lysine and to utilise ethanol (EtOH) as a carbon and energy source (Chen et al., 2009; Wexler et al., 2005). The addition of antimicrobials to the growth medium also allows for the discovery of clones expressing antibiotic resistance genes (Card et al., 2014).
Reporter-based screens utilise a reporter gene such as gfp (green fluorescent protein) coupled to the activity of a cloned gene promoter or to a cloned gene’s product. For example, substrate-induced gene expression (SIGEX) involves the cloning of genes upstream of a promoter-less gfp, linking gfp expression to that of the cloned genes. Following induction by a substrate, clones carrying induced genes can be identified by fluorescence-activated cell sorting (FACS). SIGEX has been used to identify many genes, including those induced by hydrocarbons (Meier et al., 2015). Similarly, product-induced gene expression (PIGEX) utilises a system in which gfp transcription is under the control of a promoter sensitive to a gene product of interest. PIGEX has been used to identify a novel amidase enzyme involved in benzoate synthesis (Uchiyama and Miyazaki, 2010).