purine pathways. The pyrimidine pathway leads to the production of dTTP and dCTP, whereas the purine metabolism pathway leads to the production of dGTP and dATP (Fig 2B). T4 encodes a number of enzymes that form the dNTP synthase complex (DSC), a protein complex responsible for the efficient synthesis of 5-hmC (Moen et al., 1988; Reddy et al., 1977; Shen et al., 2004). The DSC consists of at least eight phage-encoded proteins and two host-encoded proteins (Supplementary table 1). The phage-encoded dCTPase gp56 converts cellular dCDP and dCTP to dCMP. The hydroxymethylase gp42 converts dCMP to 5-hydroxymethyl-deoxycytidine-monophosphate (5-hmCMP), which is further phosphorylated to 5-hmdCTP by gp1 and incorporated into de novo synthesized DNA (Fig 2C). Post-replicative glucosylation by β-glucosyltransferase (β-gt) results in 5-ghmC.
T4 β-glucosyltransferase has recently been applied in the quantification of 5-hmC in eukaryotic DNA (Li et al., 2012; Nifker et al., 2015; Shahal et al., 2016). In these methods, 5-hmC-containing DNA is converted to 6-azide-β-glucosyl-5-hydroxymethylcytosine (N3-5-ghmC) using uridine-diphosphate-6-azide-glucose (UDP-6-N3-Gluc), after which a fluorescent label is linked to the azide group using click chemistry, yielding fluorescently labeled DNA. This possibility to couple functional groups to 5-hmC-containing DNA inspired us to transfer the 5-hmC synthesis pathway of T4 to E. coli.
To exploit the ability to couple click-chemistry compatible functional groups to large DNA molecules such as plasmids, we engineered the DNA modification pathway of bacteriophage T4 into E. coli by introducing four T4 genes.
Results
Expression of 5-ghmC synthesis operon in E. coli
To engineer the 5-ghmC synthesis pathway in E. coli, we constructed a plasmid (pGhmC) carrying the following genes from bacteriophage T4: gp42, gp1, gp56, and β-gt, encoding a dCMP hydroxymethylase, dNMP kinase, dCTPase, and β-glucosyltransferase, respectively. Expression of gp42, gp1, and β-gt was controlled by a rhamnose-inducible promoter. To control potential cytotoxicity of dCTPase expression and to uncouple it from the expression of gp42, gp1, and β-gt, we reversed the orientation of gp56 and transcriptionally isolated the gene between two rho-independent transcriptional terminators (Fig 2A). Expression of the four T4 genes encoded on pGhmC replaces the production of cytosine with the production of 5-ghmC (Fig 2B & 2C). E. coli cells were transformed with pGhmC, and plasmid DNA was isolated and digested with enzymes that are either sensitive to or selective for 5-hmC or 5-ghmC DNA (Fig 3). AbaSI is a DNA modification-dependent endonuclease that recognizes 5-ghmC in double-stranded DNA. AbaSI also recognizes 5-hmC DNA, but at a lower efficiency than 5-ghmC. When the pGhmC plasmid was incubated with AbaSI, we observed almost complete DNA degradation, demonstrating the presence of 5-ghmC. McrBC is an endonuclease that cleaves 5-hmC-containing DNA. When the pHmC plasmid was incubated with McrBC, we observed DNA degradation, demonstrating the presence of 5-hmC.
Figure 2. Metabolic pathway for DNA synthesis in E. coli. (A) Four T4 genes encoded on plasmid pGhmC are responsible for the synthesis of 5-ghmC DNA. (B) DNA synthesis pathway in E. coli. (C) DNA synthesis in E. coli after introduction of the 5-ghmC synthesis pathway. Three T4 genes are required for the conversion of dCDP and dCTP to 5-hydroxymethyl-2’-deoxycytidine triphosphate (5-hmdCTP). T4 β-glucosyltransferase glucosylates 5-hmC in a post-replicative process.
Figure 3. Modification-dependent degradation of plasmid DNA. Degradation of pGhmC DNA by AbaSI demonstrates the presence of 5-ghmC. Degradation of pHmC DNA by McrBC demonstrates the presence of 5-hmC. Unmodified DNA of pGG0 is degraded by neither AbaSI nor McrBC.
To validate that all four genes are essential for the production of 5-ghmC, we constructed plasmids lacking either gp42, gp1, gp56, or β-gt. Restriction digest analysis indeed showed that all four genes are required for efficient production of 5-ghmC (Fig S1). As expected, deletion of β-gt results in 5-hmC DNA. Deletion of gp1, gp42, or gp56 results in cytosine-containing DNA, showing that these genes are essential. We observed that when cells containing pGhmC were cultured for longer than one day, the isolated plasmid DNA could no longer be cleaved by AbaSI, suggesting that expression of the construct is unstable. To further investigate the toxicity of expressing the 5-ghmC synthesis operon, we performed growth experiments in which we monitored the optical density of the cultures. Cells containing pGhmC or pHmC showed a slower growth rate in the exponential phase and a lower cell density in the stationary phase compared with cells containing the control plasmid pMK0 (Fig S2).
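The outcome of the deletion experiments can be restated as a small lookup. The Python sketch below is only an illustration: the function name and string labels are hypothetical, and the logic encodes nothing beyond the results reported above (Fig S1).

```python
# Illustrative summary of the deletion results reported in the text (Fig S1).
# The function name and the labels are hypothetical, chosen for this sketch.

UPSTREAM_GENES = {"gp42", "gp1", "gp56"}  # needed to supply 5-hmC nucleotides


def expected_modification(genes_present):
    """Return the cytosine form expected in plasmid DNA for a set of T4 genes."""
    genes = set(genes_present)
    if not UPSTREAM_GENES <= genes:
        # Deleting gp42, gp1 or gp56 gave cytosine-containing DNA.
        return "C"
    if "beta-gt" not in genes:
        # Without the glucosyltransferase, 5-hmC is not glucosylated.
        return "5-hmC"
    return "5-ghmC"


ALL_GENES = {"gp42", "gp1", "gp56", "beta-gt"}
for deleted in [None, "beta-gt", "gp42", "gp1", "gp56"]:
    remaining = ALL_GENES - {deleted}
    print(f"deleted={deleted}: {expected_modification(remaining)}")
```

The lookup deliberately ignores expression level and culture time, both of which affect the observed modification according to the surrounding text.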
In an attempt to further increase the substitution of cytosine by 5-hmC or 5-ghmC by enlarging the dCMP pool, we removed the transcriptional terminator upstream of the gp56 dCTPase gene. Surprisingly, the isolated plasmid DNA showed no degradation by AbaSI, indicating the absence of 5-ghmC in the DNA. This suggests that dCTPase expression levels are too high after removal of the transcriptional terminator, leading to cytotoxicity.
Determination of substitution levels
To determine the substitution levels of cytosine by 5-hmC or 5-ghmC, plasmids pMK0, pHmC, and pGhmC were digested and analyzed using high-performance liquid chromatography with ultraviolet detection (HPLC-UV). 5-hmdC levels were calculated by comparing the peak areas of the plasmid digests with those of a 5-hmdC standard. Under the tested conditions, approximately 13-15% of the cytosine in pHmC DNA was substituted by 5-hmdC (Figure 4). Enzymatically digested DNA was further analyzed by liquid chromatography mass spectrometry (LC-MS) (Fig S4). Using this method, we confirmed the presence of 5-ghmdC by identification of its [M+H]+ ion (m/z 420.2) (Figure 5). The [M+H]+ ion of 5-ghmC (m/z 304.2) is attributed to the elimination of a 2-deoxyribose moiety from 5-ghmdC caused by cleavage of the N-glycosidic linkage.
Figure 4. Quantification of 5-hmC levels of nucleosides from digested plasmid DNA. (A) HPLC analysis of nucleoside standards. (B) HPLC analysis of digested pHmC DNA.
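The peak-area comparison used for the HPLC quantification can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: a single-point external standard for each nucleoside (the text mentions only a 5-hmdC standard; the dC standard is assumed here), and a substitution level defined as 5-hmdC / (5-hmdC + dC). All numbers are placeholders, not measurements from this study.

```python
# Sketch of a single-point external-standard calculation. All numbers are
# placeholders for illustration, not measurements from this study.


def amount_from_area(sample_area, standard_area, standard_amount):
    """Scale a sample peak area to an amount using one standard injection."""
    return sample_area / standard_area * standard_amount


def substitution_percent(hmdc_amount, dc_amount):
    """Percentage of cytosine positions occupied by 5-hmdC (assumed definition)."""
    return 100.0 * hmdc_amount / (hmdc_amount + dc_amount)


# Hypothetical peak areas (arbitrary units) and standard amounts (pmol).
hmdc = amount_from_area(sample_area=280.0, standard_area=2000.0, standard_amount=100.0)
dc = amount_from_area(sample_area=1548.0, standard_area=1800.0, standard_amount=100.0)

print(f"5-hmdC: {hmdc:.1f} pmol, dC: {dc:.1f} pmol")
print(f"substitution: {substitution_percent(hmdc, dc):.1f}%")
```

A multi-point calibration curve would be more robust than a single standard injection; the single-point form is used here only to keep the arithmetic visible.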
Figure 5. LC-MS analysis of 5-ghmC levels of nucleosides from digested 5-ghmC DNA. Presence of 5-ghmC in plasmid pGhmC was confirmed by identification of the [M+H]+ ions of 5-ghmdC (m/z 420.2). Identification of [M+H]+ ions of 5-ghmC (m/z 304.2) is attributed to the elimination of a 2-deoxyribose moiety from 5-ghmdC caused by cleavage of the N-glycosidic linkage.
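The two reported m/z values can be checked against the expected monoisotopic masses. The sketch below assumes the elemental compositions C16H25N3O10 for 5-ghmdC (5-hmdC, C10H15N3O5, plus one glucosyl unit, C6H10O5) and C11H17N3O7 for the 5-ghmC base; these formulas are not stated in the text. The calculated values agree with the reported ones to within about 0.1 m/z units.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, and the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647


def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a simple formula such as 'C16H25N3O10'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mz_protonated(formula):
    """m/z of the singly charged [M+H]+ ion."""
    return monoisotopic_mass(formula) + PROTON


# Assumed elemental compositions (not given in the text).
nucleoside = mz_protonated("C16H25N3O10")  # 5-ghmdC
base = mz_protonated("C11H17N3O7")         # 5-ghmC after loss of the sugar

print(f"[M+H]+ 5-ghmdC: {nucleoside:.2f}")
print(f"[M+H]+ 5-ghmC:  {base:.2f}")
print(f"neutral loss:   {nucleoside - base:.2f}")
```

The difference between the two ions corresponds to C5H8O3, the neutral fragment lost when the N-glycosidic bond of a 2'-deoxynucleoside is cleaved.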
Coupling of functional groups to 5-hmC containing plasmid DNA
To label plasmid DNA with a fluorophore, selective tagging by copper-free click chemistry was used (Baskin et al., 2007). T4 β-glucosyltransferase was used to catalyze the transfer of an azido-sugar from chemically synthesized UDP-6-deoxy-6-azido-glucose (6-N3-UDPG) to the allylic hydroxyl group of 5-hmC. The plasmids were then fluorescently labeled with dibenzocyclooctyne-Cy5 (DBCO-Cy5), and Cy5 fluorescence emission at 670 nm was used to assess the degree of labeling. Based on the relative fluorescence units (RFU) of the spectrum between 590 and 720 nm, we estimated that there are on average 4-5 fluorophores per plasmid molecule (Fig S3). To assess the functionality of the labeling, we immobilized the plasmid DNA on glass slides and imaged the samples using TIRF microscopy. By depositing samples of purified and washed pHmC-Cy5 on glass slides, single fluorescent spots could be visualized. By analyzing the point-spread function, the number of fluorophores and their intensities could be compared with those of the pMK0-Cy5 and pGhmC-Cy5 negative control plasmids (Fig 6). The pHmC-Cy5 sample contained approximately 2.1 × 10² fluorescent molecules per mm², which is approximately 20 times higher than the pMK0-Cy5 sample and approximately 9 times higher than the pGhmC-Cy5 sample. This demonstrates successful labeling of the 5-hmC-containing plasmid DNA (Fig 6). The low number of localizations when imaging pGhmC-Cy5 plasmids suggests that almost all 5-hmC nucleobases are glucosylated in vivo by β-glucosyltransferase.
Figure 6. Fluorescence microscopy imaging of plasmids labeled with Cy5. Labeling of plasmid DNA with Cy5 is specific to pHmC.
Single plasmid photobleaching analysis
Illumination of Cy5 with the 642 nm excitation laser causes photobleaching and switching of the fluorophores to the non-fluorescent OFF-state. Time traces of individual fluorescent spots show step-wise decreases or increases in fluorescence intensity (Fig 7). The number of steps corresponds to the number of fluorophores attached to the plasmid. To confirm that the signal originates from Cy5, we induced recovery to the fluorescent ON-state using a 405 nm laser. Fluorescence of the spots was recovered, indicating the presence of Cy5.
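Counting steps in a time trace can be sketched as follows. This is not the analysis used in this work, only a simple stand-in: it assumes a background-subtracted trace and a known single-fluorophore intensity, smooths the trace with a median filter, and counts changes in the rounded intensity level. The trace is synthetic and all parameter values are hypothetical.

```python
import random
from statistics import median


def median_filter(trace, window=5):
    """Edge-preserving smoothing; the window is truncated at the trace ends."""
    half = window // 2
    return [median(trace[max(0, i - half): i + half + 1]) for i in range(len(trace))]


def count_steps(trace, unit_intensity, window=5):
    """Return (downward, upward) step counts in units of one fluorophore."""
    smooth = median_filter(trace, window)
    levels = [round(value / unit_intensity) for value in smooth]
    pairs = list(zip(levels, levels[1:]))
    down = sum(max(0, before - after) for before, after in pairs)
    up = sum(max(0, after - before) for before, after in pairs)
    return down, up


# Synthetic trace: three fluorophores that bleach at frames 40, 90 and 150.
rng = random.Random(0)
bleach_frames = [40, 90, 150]
trace = []
for frame in range(200):
    active = sum(1 for b in bleach_frames if frame < b)
    trace.append(active * 1.0 + rng.gauss(0.0, 0.08))

down, up = count_steps(trace, unit_intensity=1.0)
print(f"bleaching steps: {down}, recovery steps: {up}")
```

Upward steps are reported separately because recovery to the ON-state would otherwise inflate the fluorophore count. Real traces with lower signal-to-noise ratios generally call for a dedicated step-fitting method rather than this rounding approach.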