Faculty of Sciences
Department of Molecular Biology
Structure-function analysis of human PrimPol
Patricia A. Calvo Fernández
Madrid, 2019
Patricia A. Calvo Fernández BSc, Biology
Supervisors: Prof. Luis Blanco Dávila and Dr. María Isabel Martínez Jiménez
Centro de Biología Molecular “Severo Ochoa” (CBMSO) Madrid, 2019
The work presented in this Doctoral Thesis has been carried out at the Centro de Biología Molecular “Severo Ochoa” (CBMSO) under the direction and supervision of Prof. Luis Blanco Dávila and Dr. María Isabel Martínez Jiménez. This Thesis was supported by a “Formación de personal investigador” (FPI) fellowship (Ministerio de Economía y Competitividad).
To my friend Marina
To my cousin Ainoa
To my siblings Lorena and Adrián
To my parents Toñy and Andrés
“To be surprised, to wonder, is to begin to understand.”
José Ortega y Gasset
“Everything a person does not know does not exist for them. That is why each person's universe comes down to the size of their knowledge.”
Albert Einstein
“We must not be afraid of making mistakes; even planets collide, and from chaos stars are born.”
Charles Chaplin
After so many years immersed in one of the most enriching and transformative journeys of my life, I would like to thank everyone who has offered me their valuable company throughout the great adventure that the development of my doctoral thesis has been.
First of all, I would like to wholeheartedly thank Professor Luis Blanco, the great teacher who gave me the opportunity to enter the world of scientific research by supervising my doctoral thesis. Thank you, Luis, for opening the doors of your laboratory to me and for showing me science with all the passion and devotion with which a great researcher like you lives his vocation for knowledge.
I would also like to express my immense thanks to Doctor María I. Martínez for her co-supervision, which was so essential throughout the development of my thesis. Thank you, María, not only for your valuable advice and encouragement as a great researcher from the very first moment, but also for your sincere friendship, which has given me so much and enriched me as a person.
I thank both of my thesis supervisors once again, jointly, for the full confidence they place in my worth as a researcher and in my ability to face all kinds of challenges. Your respect, understanding and unconditional support have been key to overcoming and completing this great work, the fruit of many years of effort and much-treasured learning.
I would like to thank all my colleagues in laboratory 403, current and former, who have accompanied me through the different stages of this adventure. Thank you Sandra, Estefanía, Rubén and Estrella for your help drawn from experience. Thank you Sara, Ana Aza, Ana Gómez and Guillermo for showing me the path of the thesis. Thank you Alberto, Nieves and Antonio for your support, and Cristina and Ana for your encouragement in this tense final phase of the thesis. Thank you Gustavo and Fabiana for your empathy and closeness as companions and accomplices on this journey, and special thanks to Susana for believing in me, for being with me in the good moments and for encouraging me in the most difficult ones.
I also want to thank Doctor Miguel de Vega for giving me the opportunity to continue and advance my research career with this new project.
My thanks also go to Margarita Salas and the members of her laboratory, Modesto, Alicia, Carlos, Ana, Magdalena and Pablo, for welcoming me so warmly and making me feel very comfortable from the very first moment.
In addition, I thank Doctor Sjoerd Wanrooij and the members of his laboratory for hosting me and giving me the opportunity to rediscover myself as a scientist and as a person during my stay in a city as magical as Umeå, Sweden.
I would like to thank my volleyball friends, especially Johana, Patricia, Naiara and Miriam, for motivating me, cheering me on and giving me strength on so many occasions.
I would like to thank my childhood friends Bea, Laura and Elena for accompanying me in this chapter, as in all the other chapters of my life, for being by my side and for being themselves examples of perseverance and constancy. I feel especially deep gratitude towards Marina, my faithful partner in adventures. Thank you, Marina, for being a vital part of this great achievement in my life; I feel pure admiration for you, and without your affection and friendship I could not have finished this period so renewed and full of courage.
Thank you to my friends, and especially to Johny, Samir, Íker and Hugo, who have supported, accompanied and encouraged me with affection throughout this whole period of the thesis.
I would like to thank Félix with all my heart, a crucial companion in this great adventure. Thank you for believing in me, and for supporting and motivating me at all times with so much love and respect. Thank you for teaching me to be patient, above all with myself; you have been a key piece and a fundamental pillar in this chapter of my life.
I thank all my family, my siblings Lorena and Adrián, my cousin Ainoa, my godmother Maribel, my grandmother and my grandfather, for each being an example of struggle, sacrifice and perseverance as well as an inexhaustible source of love, shelter and understanding in every phase of my life. Thanks to you, I have learned to accept with resilience the hardest battles that we have faced.
Finally, I am eternally grateful to my parents, Toñy and Andrés. There is no way to express the immense gratitude, admiration and love I feel for you. Thank you for being my guides, my light in the darkest moments and my main driving force in every adventure of my life. Thank you for teaching me to face my fears and for helping me up with so much compassion every time I stumble, explaining to me that all experiences are lessons. Thank you for raising me while always making it clear that with respect, determination and courage there are no unattainable dreams. Thank you for showing me that my life is bigger than my fears and that my strength is greater than my doubts. Without you, none of this would have been possible.
Thank you all for offering me, every day, something to learn from and much to be thankful for!
RESUMEN
SUMMARY
ABBREVIATIONS
INTRODUCTION
1. DNA Replication. The initiation problem
2. Primases
2.1. Prokaryotic primases (DnaGs)
2.2. Archaeo-eukaryotic primases (AEPs)
2.2.1. AEP active site
2.2.2. AEP Zn-Finger domain
2.2.3. Classification of the AEP superfamily
2.2.3.1. The AEP proper clade
2.2.3.2. PrimPol-like clade
2.2.3.3. The NCLDV-herpesvirus primase clade
2.2.4. Specific properties of PrimPols
3. Primer synthesis
3.1. A two-metal ion mechanism for the phosphoryl transfer reaction is common to DNA polymerases and primases
4. Limited similarities between primases and DNA polymerases
5. Human PrimPol
5.1. Active site and Zinc-finger domain
5.2. Cellular context
5.3. Mechanism of action
OBJECTIVES
MATERIALS AND METHODS
1. Expression and purification of PrimPol, site-directed mutants and ΔZnFD mutant
2. Oligonucleotides and nucleotides
3. Primary sequence alignments and 3D-visualization
4. EMSA for enzyme:ssDNA binary complex
5. EMSA for enzyme:ssDNA:dNTP pre-ternary complex
6. DNA primase assay
7. DNA polymerization assays
RESULTS
1. CHAPTER 1: Sequential steps of the human PrimPol primase activity and the role of the ZnFD at each step of the process
1.1. PrimPol interaction with ssDNA does not require the Zn-finger domain and activating metal ions
1.2. PrimPol forms a stable pre-ternary complex in the presence of Mn ions, which does not involve the Zn-finger domain
1.3. The cryptic G in the template recognition sequence does not determine formation of binary and pre-ternary complexes
1.4. The cryptic G-mediated stimulation of dimer synthesis and elongation requires the Zn-finger domain
1.5. The Zn-finger domain is not required to elongate synthetic primers of minimal size
1.6. The Zn-finger domain is required to use a triphosphate as the preferred initiating 5´-nucleotide
1.7. A 5´-terminal triphosphate facilitates primer elongation
2. CHAPTER 2: Biochemical study of the human PrimPol metal ligands
2.1. Metal ligands of eukaryotic PrimPols involved in polymerization
2.2. Human PrimPol Glu116 enhances Mn2+-dependent dislocation reactions
2.3. Human PrimPol Glu116 is required for efficient primer synthesis
2.4. Human PrimPol metal ligands are irrelevant for template binding
2.5. The three metal ligands of human PrimPol are required to form a stable enzyme:ssDNA:dNTP pre-ternary complex. A change of Glu116 to aspartate has a negative impact on this step
2.6. Human PrimPol Glu116 is required for priming in the presence of two different metal ions
3. CHAPTER 3: Characterization of human PrimPol WFYY conserved motif
3.1. Trp87 and Tyr90 residues of human PrimPol WFYY motif are essential for its DNA primase activity
3.2. Trp87 and Tyr90 residues of human PrimPol WFYY motif are crucial for a stable pre-ternary complex formation
3.3. Trp87 and Tyr90 residues of human PrimPol WFYY motif are essential for its DNA polymerase activity
3.4. The human polymorphic variant F88L is defective in alternative TLS reactions on UV-damaged DNA
DISCUSSION
1. The Zn-finger domain of human PrimPol is required to stabilize the initiating nucleotide during DNA priming
1.1. Sequential steps during primer synthesis by human PrimPol
1.2. Is the role of the ZnFD unique for PrimPols?
2. The invariant glutamate of human PrimPol DxE motif is critical for its Mn2+-dependent distinctive activities
2.1. How do the catalytic carboxylates of human PrimPol interact with the metal ions?
2.2. Which metal ion is coordinated by the invariant glutamate of human PrimPol DxE motif?
2.3. How does Mn2+ regulate the distinctive activities of PrimPol?
3. Motif WFYY of human PrimPol is crucial to stabilize the incoming 3´-nucleotide during replication fork restart
3.1. Structural basis for the defective 3´-nucleotide binding associated with mutations at the WFYY motif
3.2. Mutation Y89D in human PrimPol is irrelevant, and not supportive of a correlation with High Myopia
4. Concluding remarks
CONCLUSIONS
CONCLUSIONES
BIBLIOGRAPHY
APPENDIX
1. Publications included in the results of this thesis
2. Other publications
PrimPol is the second primase in human cells, and the first with the ability to initiate the synthesis of DNA chains with dNTPs. PrimPol contributes to DNA damage tolerance by restarting DNA synthesis beyond the lesion, acting as a TLS primase, skipping unreadable lesions such as abasic sites, or "reading" certain lesions such as 8-oxo-dG.
PrimPol contains an Archaeo-Eukaryotic Primase (AEP) core followed by a C-terminal domain containing a zinc finger (ZnFD), which is required exclusively for primer formation and for PrimPol function in vivo. This thesis describes the sequence of interactions of human PrimPol during primer synthesis and the relevance of the ZnFD at each step. Both the formation of the PrimPol:ssDNA binary complex and the subsequent interaction with the 3´-nucleotide (pre-ternary complex) remain intact when PrimPol lacks the ZnFD. By contrast, the ZnFD was needed for the subsequent binding and selection of the 5´-nucleotide, probably because it interacts with its γ-phosphate.
In addition, we demonstrate that the ZnFD contributes to recognition of the cryptic G in the preferred initiation sequence 3´GTC5´, and that it is also essential in the successive translocation/elongation events during DNA primer synthesis.
By site-directed mutagenesis of human PrimPol, we confirmed the catalytic relevance of the metal ligands Asp114, Glu116 and Asp280, and we identified Glu116 as a relevant enhancer of the distinctive reactions of PrimPol, which are highly dependent on Mn2+. Furthermore, we showed that Glu116 is important for events mediated by primer/template dislocations, and crucial for achieving optimal primase activity, processes in which Mn2+ is strongly preferred. Analysis of the pre-ternary complex revealed the critical role of each metal ligand and a significant impairment when Glu116 was changed to a conventional aspartate. These data suggest that the PrimPol active site requires the specific motif A (DxE) to favor the use of Mn2+, which is essential for optimal stabilization of the incoming nucleotide, especially during primer synthesis.
Amino acid sequence alignments of eukaryotic PrimPols allowed us to identify a highly conserved WxxY motif located very close to the invariant motif A. We carried out mutagenesis of the amino acids Trp87, Phe88, Tyr89 and Tyr90, which form the WFYY motif in human PrimPol. We show that a specific mutation at Tyr89 (Y89D) of the WFYY motif of human PrimPol (purportedly related to High Myopia) turned out to be irrelevant for the in vitro enzymatic functions of PrimPol. On the other hand, the human polymorphic variant F88L was affected in its primase/polymerase activities, especially during TLS reactions. Finally, the invariant residues Trp87 and Tyr90 proved to be crucial for the primase and polymerase activities of PrimPol, owing to their indirect role in stabilizing the incoming nucleotide at the 3´ position.
PrimPol is the second primase in human cells, and the first with the ability to start DNA chains with dNTPs. PrimPol contributes to DNA damage tolerance by restarting DNA synthesis beyond the lesion, acting as a TLS primase, skipping unreadable lesions such as abasic sites, or "reading" certain lesions such as 8-oxo-dG.
PrimPol contains an Archaeal-Eukaryotic Primase (AEP) core followed by a C-terminal Zn finger-containing domain (ZnFD) that is required exclusively for primer formation and for PrimPol function in vivo. The present thesis describes the sequential substrate interactions of human PrimPol during primer synthesis, and the relevance of the ZnFD at each individual step.
Both the formation of a PrimPol:ssDNA binary complex and the subsequent interaction with the 3´-nucleotide (pre-ternary complex) remained intact in the absence of the ZnFD. Conversely, the ZnFD was required for the subsequent binding and selection of the 5´-nucleotide, most likely by interacting with its γ-phosphate moiety. We demonstrate that the ZnFD contributes to recognition of the cryptic G in the preferred priming sequence 3´GTC5´ and is also essential for the subsequent translocation/elongation events during DNA primer synthesis.
By site-directed mutagenesis of human PrimPol, we confirmed the catalytic relevance of the metal ligands Asp114, Glu116 and Asp280, and identified Glu116 as a relevant enhancer of distinctive PrimPol reactions, which are highly dependent on Mn2+. We showed that Glu116 was important for events mediated by primer/template dislocations, and crucial for achieving optimal primase activity, processes in which Mn2+ is largely preferred. Pre-ternary complex analysis indicated a critical role for each metal ligand, and a significant impairment when Glu116 was changed to a conventional aspartate. These data suggest that the PrimPol active site requires a specific motif A (DxE) to favor the use of Mn2+, which is needed for optimal stabilization of the incoming nucleotide, especially during primer synthesis.
Amino acid sequence alignment of eukaryotic PrimPols allowed us to identify a highly conserved motif, WxxY, in close proximity to the invariant motif A. We performed mutagenesis of Trp87, Phe88, Tyr89 and Tyr90, which form the WFYY motif in human PrimPol. We showed that a specific mutation at Tyr89 of the WFYY motif (Y89D, claimed to be related to High Myopia) was irrelevant for the in vitro enzymatic functions of PrimPol. The human polymorphic variant F88L was affected in its primase/polymerase activities, especially during TLS reactions, whereas the invariant Trp87 and Tyr90 residues are crucial for both the primase and polymerase activities of PrimPol, mainly owing to their indirect role in stabilizing the incoming 3´-nucleotide.
6-4PPs pyrimidine(6-4)pyrimidone photoproducts
8oxodG 8-oxo-2'-deoxyguanosine
AEP Archaeal-Eukaryotic Primases
AP Apurinic/apyrimidinic site
BSA Bovine Serum Albumin
DSBs Double-Strand Breaks
DTT Dithiothreitol
EDTA Ethylenediaminetetraacetic acid
EMSA Electrophoretic Mobility Shift Assay
NCLDV Nucleo-Cytoplasmic Large DNA Viruses
NHEJ Nonhomologous end-joining
nt nucleotide
p phosphate
RPA Replication Protein A
RRM RNA Recognition Motif
ssDNA Single-stranded DNA
TLS Translesion synthesis
TOPRIM Topoisomerase-primase
ZnFD Zn finger-containing domain
1. DNA Replication. The initiation problem
The general process of DNA replication, which ensures the transmission of genetic information through generations, is conserved in all living organisms, but mechanistic details, molecular partners and interactions have diverged between bacteria and eukaryotes.
Moreover, the replication machinery of archaea shares properties with those of both bacteria and eukaryotes, and is highly homologous to that of higher eukaryotes (Liu et al., 2001).
In all cells, DNA replication involves a large number of coordinated events that are carried out by a wide variety of proteins. The process usually begins with the binding of origin recognition proteins to specific initiation sites, which opens the DNA locally and allows helicase loading. Once the initiation machinery is assembled at the replication origin, separation of the anti-parallel parental DNA strands by a hexameric replicative DNA helicase establishes the directionality of the replication fork. The resulting single-stranded DNA (ssDNA) is protected by single-stranded DNA-binding proteins (SSB) and acts as a template on which a primase synthesizes a short RNA primer that is subsequently extended by a DNA polymerase (Figure 1). Topoisomerase stabilizes the region directly ahead of the replication fork by breaking the strands, turning them, and rejoining them to relieve the torsional (twisting) strain created by the unwinding of the double helix (Figure 1).
Figure 1. Prokaryotic DNA replication. Schematic representation of the prokaryotic replisome complex and its associated proteins. Synthesis of the leading and lagging strands of DNA involves coordination of multiple proteins at the replication fork. Adapted from (McHenry, 2003).
Most DNA polymerases are unable to initiate DNA synthesis, as they cannot use a dNTP as the initiating/priming source of the hydroxyl group required to transfer elongating nucleotides during DNA replication (Kornberg & Baker 1992; Lodish et al., 1999). Nature has developed different solutions to meet this requirement:
· Retroelements use an endogenous cellular tRNA as a primer, which provides a free 3'OH group to be elongated by the reverse transcriptase (Lodish et al., 1999).
· In adenoviruses and the Φ29 family of bacteriophages, the free 3'OH group is provided by an amino acid of the terminal protein, onto which the DNA polymerase can polymerize dNTPs (Salas, 1991).
· In several families of DNA viruses, and in some phages and plasmids that replicate by a rolling circle mechanism, an endonuclease makes a nick in one of the strands, the 5' end is transferred to a tyrosine in the nuclease and the 3'OH group is elongated by the DNA polymerase (Koonin and Ilyina, 1992, Noirot-Gros and Ehrlich, 1996).
· The mitochondrial RNA polymerase synthesizes abortive transcripts, which are subsequently elongated by Polγ (Wanrooij et al., 2008).
· Finally, the most common mechanism to resolve this issue relies on a specialized RNA polymerase, a primase, which is able to initiate synthesis from a single ribonucleotide triphosphate (NTP) to produce short RNA primers whose 3’OH is elongated by DNA polymerases (Kornberg & Baker 1992; Frick & Richardson 2001; Kuchta & Stengel 2010).
2. Primases
Primases play a crucial role at the replication fork as they are the only enzymes capable of initiating synthesis de novo on ssDNA. Primers are synthesized from ribonucleotide triphosphates and are four to fifteen nucleotides long. All primases share many properties, but they are classified into two main families that differ in their structure as well as in their interactions with other proteins of the replication complex.
Based on their structure, primases can be divided into two groups: the DnaG superfamily with a TOPRIM (topoisomerase-primase) fold domain (Aravind et al., 1998), and the AEP superfamily, which contains an RRM (RNA recognition motif) fold domain (Augustin et al., 2001, Aravind et al., 2002). The first class contains bacterial and bacteriophage primases.
The second major class comprises heterodimeric archaeal and eukaryotic primases. This dichotomy between primases from the bacterial and archaeo-eukaryotic lineages mirrors that of other DNA synthesis enzymes, such as polymerases and helicases; nevertheless, other components of the replication machinery, such as DNA ligases, topoisomerase IA and RNase HII, are homologous between bacteria and archaeo-eukaryotes (Leipe et al., 1999, Forterre, 1999).
2.1. Prokaryotic primases (DnaGs)
In all bacteria and phages, the primase responsible for DNA replication belongs to the DnaG superfamily, whose catalytic domain presents a TOPRIM fold that contains an α/β "core" with four strands in a Rossmann-type topology (Aravind et al., 1998). Primases from bacteria and phages are derived from a common ancestor, have similar functional characteristics and are closely associated with a DNA helicase (Ilyina and Koonin, 1992). These primases are composed of three main regions: an amino-terminal domain with a Zn-ribbon motif involved in DNA binding, a middle RNA polymerase domain, and a carboxyl-terminal region that either is a DNA helicase itself or interacts with a DNA helicase (Figure 2A, B) (Washington et al., 1996).
The central TOPRIM domain constitutes the core of the primase activity of prokaryotic primases. The domain consists of ~100 amino acids and contains five conserved motifs, 2-6 (Figure 2A). Motif 3 is positively charged and similar to a motif involved in RNA synthesis that is present in several RNA polymerase large subunits of both prokaryotes and eukaryotes (Mustaev and Godson, 1995). Motif 4 possesses a conserved glutamate that acts as a general base in nucleotide polymerization, while motif 5 contains the conserved DxD motif that coordinates the Mg2+ cations required for the activity of all TOPRIM-containing enzymes (Aravind et al., 1998). Additionally, motif 6, which is also involved in the metal coordination necessary for catalysis, usually contains a DxD motif (in bacterial DnaG) or only a D motif (in the phage primases) (Figure 2A).
[Figure 2 labels. Panel A: domains Zinc Binding, RNA Polymerase and Helicase; motifs 1-6; consensus sequences EGxxD and DxDxxGxxA; TOPRIM region. Panel B: Zn and Cys; Mg and Asp.]
Figure 2. Structure of a prokaryotic DnaG-like DNA primase. (A) Schematic diagram of the domain structure and the conserved residues in DnaG-like primases. The six consensus sequences are listed in gray boxes. The TOPRIM domain is indicated with a square bracket. (B) A DnaG primase is exemplified by the structure of the gp4 protein of T7 bacteriophage, colored as in part A: the zinc-binding domain (violet) with the coordinating cysteines; the primase-type DnaG domain (green) with the aspartate residues responsible for the activity and the two Mg2+ metal ions; and the C-terminal end (gray), which is the helicase domain (N-terminal, PDB id: 1NUI; C-terminal, PDB id: 1Q57). Adapted from (Frick and Richardson, 2001).
All prokaryotic primases share the N-terminal Zn-finger and the RNA polymerase domain but differ in their C-terminal domain, as it is not evolutionarily conserved. In the bacterial primases and in certain phage primases, the only known function of the C-terminal domain is to interact with helicases or with other proteins at the replication fork.
2.2. Archaeo-eukaryotic primases (AEPs)
The AEP superfamily is highly heterogeneous, as it encompasses conventional primases, PrimPols, and even RNA polymerases specialized in nonhomologous end-joining (NHEJ) of double-strand breaks (DSBs) (Frick and Richardson, 2001, Iyer et al., 2005, Lipps et al., 2003, Brissett et al., 2007, Pitcher et al., 2007b). In the archaeo-eukaryotic lineage, the primase structure has the characteristic fold of the RNA recognition motif (RRM), but comparative analyses of phyletic profiles suggest that AEPs were not present in the last universal common ancestor (LUCA) (Augustin et al., 2001, Aravind et al., 2002). AEP primases are suggested to have been recruited at the base of the archaeo-eukaryotic lineage, with subsequent acquisition by bacteria via horizontal gene transfer (Iyer et al., 2005).
Unlike monomeric bacterial primases, eukaryotic primases are heterodimers of a catalytic (p49) and a regulatory (p58) subunit. This primase forms a heterotetrameric complex with DNA polymerase α and its B subunit, with an apparent mass of more than 300 kDa (Figure 3). The largest subunit, typically 165–180 kDa (DNA polymerase alpha, Polα), contains the active site for DNA synthesis, whereas the smallest, typically about 49 kDa, contains the active site for oligoribonucleotide synthesis. The 70-kDa subunit (Polα B subunit) likely regulates polymerase activity, whereas the 58-kDa subunit is suggested to assist the catalytic primase subunit. The small primase subunit (p49), which contains the active site for primer synthesis, is often referred to as Pri1. The large primase subunit (p58), which may coordinate primase and polymerase activities, is named Pri2 (Figure 3) (Frick and Richardson, 2001, Iyer et al., 2005, Baranovskiy et al., 2016).
Figure 3. Cartoon of the eukaryotic Polα/primase complex. The Polα DNA polymerase (red) and its B subunit (blue) form a complex with the large primase subunit Pri2 (yellow), and the small primase subunit Pri1 (purple). Reproduced from (Frick and Richardson, 2001).
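As a quick consistency check on the subunit masses quoted above, the expected mass of the Polα/primase heterotetramer can be obtained by simple addition. The sketch below is only illustrative: it assumes one copy of each subunit, as implied by the term heterotetramer, and it uses the approximate values given in the text; the dictionary and function names are hypothetical.

```python
# Approximate subunit masses (kDa) as quoted in the text; the catalytic
# Pol alpha subunit is given as a range, the others as single values.
SUBUNITS_KDA = {
    "Pol alpha catalytic subunit": (165, 180),
    "Pol alpha B subunit": (70, 70),
    "Pri2 (p58)": (58, 58),
    "Pri1 (p49)": (49, 49),
}

def complex_mass_range(subunits):
    """Sum lower and upper mass bounds, assuming one copy of each subunit."""
    low = sum(lower for lower, _ in subunits.values())
    high = sum(upper for _, upper in subunits.values())
    return low, high

low, high = complex_mass_range(SUBUNITS_KDA)
print(f"Expected heterotetramer mass: {low}-{high} kDa")
```

With these values the sum falls between 342 and 357 kDa, in agreement with the apparent mass of more than 300 kDa mentioned for the complex.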
Human primase (Pri1+Pri2) forms a complex with Polα (A and B subunits), and this tetramer synthesizes the primer which initiates synthesis of the leading strand at each replication origin, and the multiple primers required to synthesize the Okazaki fragments at the lagging strand during nuclear DNA replication (Arezi et al., 1999). The large essential subunit (p58) participates in primer synthesis, counts the number of nucleotides in a primer, assists the release of the primer-template from primase and transfers it to the Polα active site (Baranovskiy et al., 2016, Agarkar et al., 2011). After a slow dinucleotide formation step, the Polα-primase complex rapidly synthesizes RNA primers 7–10 nucleotides long. Extension of these RNA oligonucleotides into RNA–DNA primers of ~30 nucleotides by Polα is required before their transfer to the active site of the more processive DNA polymerases delta (Polδ) and epsilon (Polε) (Figure 4).
Figure 4. Eukaryotic DNA replication. Synthesis of the leading and lagging strands of DNA involves coordination of multiple proteins at the replication fork. Eukaryotic replisome complex and associated proteins are indicated. Adapted from (Burgers and Kunkel, 2017).
[Figure 4 labels: leading and lagging strands; Mcm2-7, Cdc45, GINS, Polε, Polδ, Polα-primase, PCNA, RPA.]
2.2.1. AEP active site
The catalytic subunit of AEPs contains two structural modules: the N-terminal module has no structural equivalent, while the C-terminal module (“core”) contains the RRM-fold domain also found in the palm domain of other polymerases. Primases and DNA polymerases often require three carboxylates within the active site to coordinate the two metal cofactors and form a metal bridge between the enzyme, the primer and the incoming nucleotide (Steitz, 1999, Yang et al., 2006). These metal ligands are generally aspartates and two of them are localized close together forming what is called a DxD motif (motif A) universally conserved in DNA polymerases, and also in primases (motif 5 in DnaG-type primases; see Figure 2A).
There are three main conserved motifs in the AEP superfamily at the “core” region:
hhhDhD (motif A), where “h” corresponds to a hydrophobic residue, which includes two of the catalytic aspartates; sxH (motif B), where “s” is a small residue; and hD/E (motif C), where "D/E" is the putative third catalytic metal ligand (Lipps, 2004). The three motifs (A, B and C) are located in proximity, at the center of the RRM-fold domain, which supports the idea that they participate together in catalysis. Figure 5 shows the active-center conformation of several AEPs: the AEP from the Sulfolobus islandicus archaeal plasmid pRN1 (Figure 5, left panel); the bacterial NHEJ AEP Mycobacterium tuberculosis PolDom (Mt-PolDom), specialized in the repair of double-strand breaks (DSBs) (Figure 5, central panel); and the catalytic subunit of an archaeal primase from Pyrococcus horikoshii (Figure 5, right panel).
Figure 5. Representative examples of active-center crystal structures of different AEPs. Ribbon representation of a close-up view of several AEP active sites, highlighting the catalytic residues (red) and their specific interactions with the nucleotide (yellow) and the metal cofactor (grey sphere). Left panel: AEP domain of ORF904 encoded by the Sulfolobus islandicus plasmid pRN1. Central panel: NHEJ repair polymerase (PolDom/LigD-Pol) from Mycobacterium tuberculosis. Right panel: primase small catalytic subunit (PriS/Prim1) from the archaeal species Pyrococcus horikoshii.
[Figure 5 panel titles: Si/pRN1, MtPolDom, PhoPriS. Individual residue labels are not reproduced here.]
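The consensus patterns of motifs A, B and C described in this section can be written as regular expressions and searched for in a protein sequence. The following is a minimal sketch under stated assumptions: the residue sets counted as "hydrophobic" (h) and "small" (s) are an illustrative choice and are not taken from Lipps (2004), the function name is hypothetical, and the test sequence is a made-up toy string rather than a real primase.

```python
import re

# Assumed residue classes (illustrative choices, not from the cited work).
HYDROPHOBIC = "AVILMFWYC"
SMALL = "GASCT"

# Consensus motifs from the text, written as regular expressions.
MOTIFS = {
    "A (hhhDhD)": f"[{HYDROPHOBIC}]{{3}}D[{HYDROPHOBIC}]D",
    "B (sxH)": f"[{SMALL}].H",
    "C (hD/E)": f"[{HYDROPHOBIC}][DE]",
}

def find_motifs(sequence):
    """Return {motif name: [(1-based start, matched text), ...]}."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # The lookahead wrapper reports overlapping matches as well.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]
    return hits

toy = "MKTAVLDIDGSKHRRLE"  # hypothetical sequence, for illustration only
for name, found in find_motifs(toy).items():
    print(name, found)
```

Because motifs B and C are very short, a pattern search of this kind matches many unrelated positions in real proteins; in practice such motifs are assigned from multiple sequence alignments and structural context rather than from pattern matching alone.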
2.2.2. AEP Zn-Finger domain
Eukaryotic primases are distinguished by structural elements outside the “core” region.
Most DNA primases contain a metal-binding domain composed of four conserved Cys or His residues that could potentially coordinate a zinc atom. The most notable structural difference between the eukaryotic and the archaeal catalytic subunits lies in the small helical domain that contains the zinc-binding motif, which is conserved neither in structure nor in sequence. In prokaryotic primases, this metal-binding site is at the N-terminus of the polypeptide. In contrast, Zn-binding sites are located at the C-terminus of primases from viral systems, and centrally in the catalytic (small) subunits of dimeric eukaryotic primases.
2.2.3. Classification of the AEP superfamily
Iyer and coworkers (Iyer et al., 2005) have described more than 10 AEP families, including a novel family termed PrimPol (primase-polymerase), which we now know includes human PrimPol, the subject of this doctoral thesis. Based on amino acid sequence similarity, AEPs were classified into three higher-order clades, as described below (Figure 6).
Figure 6. Inferred evolutionary history of the AEP superfamily. The phyletic distribution is shown in brackets: B, Bacteria; A, Archaea; E, Eukaryotes; V, Viruses; > represents a proposed lateral transfer. The ellipses indicate large assemblages within which individual lineages show a generic relationship. Broken lines indicate an uncertainty with respect to the exact point of origin of a lineage. Archaeal and eukaryotic (including viral) branches (blue), bacterial branches (green), branches from plasmids, phages and mobile elements (red), ancestral branches and branches outside the AEP superfamily (black). Reproduced from (Iyer et al., 2005).
2.2.3.1. The AEP proper clade. This clade encompasses three families: the small subunits of archaeal and eukaryotic primases (classical AEPs), the baculovirus Lef-1 primases, and the bacterial NHEJ polymerases. Classical AEPs and the Lef-1 family form a functional complex with their respective large, non-catalytic subunit (Evans et al., 1997). On the other hand, bacterial AEPs specialized in NHEJ appear to be fused to a DNA ligase and a nuclease. This last family is a special case of primase-like enzymes that lack the ability to initiate de novo synthesis and are instead specialized in processing the ends of double-strand DNA breaks, thus contributing to the prokaryotic NHEJ system (Pitcher et al., 2007a).
2.2.3.2. PrimPol-like clade. This clade includes seven distinct families of primases from bacteria, archaea, viruses, bacteriophages and plasmids. Primases in this clade often have one of the following two domains: Primase-C Terminal-1 or Primase-C Terminal-2 (PriCT-1, PriCT-2).
The first described PrimPol was ORF904 from the Sulfolobus islandicus archaeal plasmid pRN1, a highly compact multifunctional enzyme with ATPase, primase and DNA polymerase activities in its N-terminal region and a helicase activity in its C-terminal region (Lipps, 2004). This protein synthesizes a primer of approximately seven deoxyribonucleotides and possesses high sequence specificity for the initiation site (Beck and Lipps, 2007). The structure of the ORF904 PrimPol domain does not resemble any known DNA polymerase structure, and its Zn-finger domain lies between the N- and C-terminal domains, close to the active center (Lipps, 2004).
Another PrimPol family is typified by the primase-DNA polymerase domain of the crenarchaeal RepA-like protein with members found only in prokaryotes and their viruses. The ColE2 Rep-like family is found in plasmids that are mainly present in proteobacteria, a few actinobacteria and Thermus (Iyer et al., 2005). Some recently characterized members of the PrimPol-like clade include BcMCM from Bacillus cereus (Sanchez-Berrondo et al., 2012) and TthPrimPol from Thermus thermophilus (Picher et al., 2016).
2.2.3.3. The NCLDV-herpesvirus primase clade. This clade includes predicted primases present in large nucleo-cytoplasmic DNA viruses (NCLDV), kinetoplastids, and a new subfamily of eukaryotic primases termed EukPrim2. The common characteristic of this clade is the presence of a highly conserved C-terminal domain containing a Zn-finger motif, and two highly conserved residues, close to the active site and possibly implicated in substrate interaction: a glutamate, present in an "Exb" motif (b: big residue, mostly hydrophobic), and a lysine.
Based on the structural domain organization, this clade can be subdivided into two families: the iridovirus primase family and the herpes-poxvirus primase family. A third family of this clade includes a putative second human primase (as predicted by in silico analysis) initially named Eukprim2 (Iyer et al., 2005). This protein was later renamed PrimPol (the subject of this thesis), as it has both DNA primase and DNA polymerase activities, as initially described for pRN1 PrimPol. A more extensive analysis of the primary sequence of human PrimPol indicated that it belongs to the archaeo-eukaryotic primase (AEP) superfamily, and served to define 14 highly conserved motifs among the members of this new PrimPol family (Figure 7) (Garcia-Gomez et al., 2013). The Eukprim2/PrimPol proteins are widely distributed throughout the kingdoms of fungi, animals and plants. The gene encoding this new primase appears to have been acquired early in eukaryotic evolution and subsequently lost on several occasions in the fungal and animal kingdoms, for example in Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster.
Figure 7. The Eukaryotic ccdc111/PrimPol Family. Multiple amino acid sequence alignment of ccdc111 orthologues. Numbers between slashes indicate the amino acid position relative to the N-terminus, and numbers in parentheses indicate the total number of amino acid residues. Invariant or highly conserved residues (red letters) and other identities conserved in most sequences (blue letters) are indicated. The alignment defines 14 conserved regions (boxed) among the ccdc111 family, including the highly conserved motifs A, B and C, characteristic of AEP-like primases, and the Zn-finger motif, characteristic of some viral primases (Iyer et al., 2005) and other AEP-related enzymes. Dots at the top of the aligned sequences indicate invariant residues acting as metal (red), nucleotide (purple) or Zn (blue) ligands. Adapted from (Garcia-Gomez et al., 2013).
[Figure 7 alignment panel, not reproduced here: 29 ccdc111/PrimPol orthologues, from Homo (560 aa) to Apis (361 aa), with conserved regions 1 to 14 boxed, including motif A, motif B, motif C and the Zn-finger.]
2.2.4. Specific properties of PrimPols
Unlike conventional primases, PrimPols use dNTPs to build the primers they make, with the exception of the initiating nucleotide, which is likely an NTP (Lipps et al., 2003).
Moreover, PrimPols can also use dNTPs to extend pre-existing DNA or RNA primers (Martinez-Jimenez et al., 2015), behaving like conventional DNA polymerases. Because of this dual function, this group of enzymes was originally named PrimPols (Lipps et al., 2003).
PrimPols are an archaic solution to the “priming problem”, defined by the fact that DNA polymerases, unlike RNA polymerases, cannot initiate DNA synthesis de novo. Human PrimPol was the first eukaryotic PrimPol described (Garcia-Gomez et al., 2013, Bianchi et al., 2013, Wan et al., 2013). PrimPols, which combine both primase and DNA polymerase activities, have been described in plasmids, bacteria, archaea and eukaryotes (Lipps, 2004, Sanchez-Berrondo et al., 2012, Bocquier et al., 2001, Garcia-Gomez et al., 2013, Keen et al., 2014a, Bianchi et al., 2013), and were originally classified in different clades of the AEP superfamily.
Conventional AEP-like primases, such as human Prim1, have the three conserved motifs (A, B and C) that form the primase active site. The PrimPols characterized so far have the same three conserved motifs, but also a Zn-finger domain required for DNA primase activity (Garcia-Gomez et al., 2013, Mouron et al., 2013, Wan et al., 2013), or even a helicase domain, as in BcMCM PrimPol from Bacillus cereus and pRN1 PrimPol from Sulfolobus islandicus (Sanchez-Berrondo et al., 2012, Lipps, 2004). Strikingly, the putative Thermus thermophilus PrimPol contains the three motifs but lacks both the Zn-finger and the helicase domain (Figure 8) (Picher et al., 2016). Instead, the C-terminal domain of TthPrimPol contains an α-helical PriCT-1 domain, characteristic of some prokaryotic primases and also shared by BcMCM and Si/pRN1 PrimPols (Figure 8) (Iyer et al., 2005, Picher et al., 2016).
Figure 8. Modular organization of various AEP-like enzymes. A conserved AEP domain (green bar) contains the three conserved regions A, B and C forming the primase active site. Nomenclature: small catalytic subunit of the human RNA primase (HsPrim1); human PrimPol (HsPrimPol); PrimPol helicase from Bacillus cereus (BcMCM); plasmid pRN1 ORF904 from Sulfolobus islandicus (Si/pRN1 PrimPol); putative PrimPol from Thermus thermophilus (TthPrimPol). Reproduced from (Picher et al., 2016).
3. Primer synthesis
Primases catalyze the synthesis of oligoribonucleotides in a minimum of five discrete steps: 1- template binding, 2- NTP binding, 3- slow dinucleotide formation, 4- rapid extension to a functional primer, and 5- direct primer transfer into the active site of a DNA polymerase (Figure 9). The molecular mechanism of primer synthesis in conventional primases begins when the primase binds the single-stranded DNA (ssDNA) template (Figure 9A). Eukaryotic primases usually do not show a strict sequence preference, although pyrimidine-rich template regions are favored (Holmes et al., 1985), given that primers generally start with the more abundant purine nucleotides (Yamaguchi et al., 1985). Conversely, some primases, especially those of bacterial origin, start primer synthesis at preferred template initiation sites: DnaG from Escherichia coli starts synthesis at 3´-GTC-5´ (Hiasa et al., 1989), T7 primase initiates at 3´-CTG-5´ (Mendelman and Richardson, 1991, Tabor and Richardson, 1981), and herpes virus primase recognizes 3´-GTCC-5´ (Cavanaugh and Kuchta, 2009, Tenney et al., 1995).
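The preferred initiation sites quoted above are written 3´→5´, whereas sequences are usually stored 5´→3´. The following minimal Python sketch shows how such sites could be located in a template; the function name, the dictionary and the toy template are hypothetical and introduced only for illustration, and the search considers sequence alone, ignoring any additional context a real primase may require.

```python
from typing import List

# Preferred template initiation sites as quoted in the text, written 3'->5'.
PREFERRED_SITES_3TO5 = {
    "E. coli DnaG": "GTC",
    "T7 primase": "CTG",
    "herpes virus primase": "GTCC",
}

def find_initiation_sites(template_5to3: str, site_3to5: str) -> List[int]:
    """Return 0-based start positions, in the 5'->3' template string,
    of a recognition site that is given in 3'->5' orientation."""
    template = template_5to3.upper()
    # Same strand read in the opposite direction: reverse, do not complement.
    site_5to3 = site_3to5.upper()[::-1]
    hits = []
    start = template.find(site_5to3)
    while start != -1:
        hits.append(start)
        start = template.find(site_5to3, start + 1)  # allow overlapping hits
    return hits

# Toy template invented for illustration (hypothetical sequence).
toy_template = "AACTGTTCTGA"
for enzyme, site in PREFERRED_SITES_3TO5.items():
    print(enzyme, find_initiation_sites(toy_template, site))
```

In this toy template only the DnaG site (3´-GTC-5´, that is, 5´-CTG-3´) is present, at two positions; the other two sites are absent.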
Primases can initiate de novo RNA synthesis as they have two NTP binding sites: the 5´-site or initiation site that establishes the NTP at the 5´-end of the primer, and the 3´-site or elongation site that accommodates incoming NTPs at the 3´-end of the growing primer (Frick et al., 1999). Formation of the initial dimer is the rate-limiting step in which the two NTPs, complementary to the template initiation site, are not bound simultaneously to their sites: binding of the 3’NTP is assumed to occur first (Figure 9B). Dimer synthesis takes place only after binding of the 5’NTP, which has 10-fold lower affinity than the 3’NTP, releasing inorganic pyrophosphate (Figure 9C) (Sheaff and Kuchta, 1993).
The nucleotide incorporated at the 5'-end of the primer is normally a purine, ATP or GTP, which maintains its triphosphate after the synthesis of the primer (Yamaguchi et al., 1985). Conventional primases accept modifications at the triphosphate end of the nucleotide incorporated at the 5'-site of the primer. Thus, nucleotide analogues with modifications at the 5'-phosphate group (Frick et al., 1999, Kusakabe et al., 1999) and even nucleotides bound to proteins through their 5'-phosphate can be incorporated into the 5'-end of the primers (Mustaev and Godson, 1995, Sun and Godson, 1998b, Sun and Godson, 1998a). The initiating dimer (pppN-p-N) is then translocated to make room for the next incoming nucleotide, and elongation of the primer continues processively usually up to a determined length (5–12 nt) (Figure 9D).
The length of the synthesized primer varies for each primase, and is likely regulated by the length of the oligoribonucleotide that can be accommodated at the initiation site of the enzyme. This primer size regulation is presumably adjusted to the fact that DNA polymerases require a primer of a defined minimum length.
Many prokaryotic and eukaryotic primases can insert nucleotides with low fidelity, as they lack 3´-5´ exonuclease activity (McMacken and Kornberg, 1978, Cotterill et al., 1987, Johnson et al., 2000). As a side effect, primases can generate both "abortive primers" (smaller than the unit primer size) and multimeric primers (multiples of the unit primer size) (Suzuki et al., 1989). Finally, primases transfer the nascent oligoribonucleotide to a DNA polymerase to continue DNA synthesis (Figure 9E). In prokaryotes, the interaction between the primase and the DNA polymerase is transient: the primase associates with and dissociates from the replication fork while catalyzing multiple initiation rounds (Nakai and Richardson, 1988, Yuzhakov et al., 1999). In eukaryotes, a heterotetrameric complex involving the primase and a specific DNA polymerase (Polα) guarantees both the synthesis of the primer and its safe transfer to the polymerase active site, where it is elongated with dNTPs to produce a hybrid (RNA-DNA) mature primer (Sheaff and Kuchta, 1993).
Figure 9. Schematic representation of primer synthesis steps. (A) DNA binding. (B) Nucleotide binding. (C) Initiation or formation of the dinucleotide. (D) Primer elongation. (E) Transfer of the primer to the replicative polymerase. Reproduced from (Frick and Richardson, 2001).
3.1. A two-metal ion mechanism for the phosphoryl transfer reaction is common to DNA polymerases and primases
The DNA replication machinery is responsible for the maintenance of the integrity and stability of genetic information in DNA organisms. Primases catalyze the synthesis of short RNA sequences that serve as ‘primers’ for DNA polymerases (Frick and Richardson, 2001), but these primers are subsequently removed. Consequently, DNA polymerases are essential both for maintaining the physical integrity of the genome and for the fidelity of its duplication. Most organisms, from bacteria to higher eukaryotes, are endowed with several DNA polymerases that are highly specialized in various biological processes such as DNA replication, DNA repair and damage tolerance (Bebenek et al., 2003, Garcia-Diaz et al., 2007).
All DNA polymerases catalyze the same nucleotidyl-transfer reaction between a dNTP and the 3’ hydroxyl (OH) group of a pre-existent primer hybridized to a DNA template, which directs nucleotide incorporation through Watson-Crick base pairing (Johnson, 2008, Johnson, 2010). Primases and DNA polymerases, despite the high degree of specialization, have common characteristics including a two-metal ion mechanism of catalysis (Steitz et al., 1994, Joyce and Steitz, 1994, Steitz and Steitz, 1993, Stahley and Strobel, 2005).
The chemistry of a phosphoryl transfer reaction begins with deprotonation and activation of a nucleophile and finishes with protonation of a leaving group via a water molecule. Catalysis occurs through an SN2-type reaction in which the α phosphate goes through a transition state that involves a pentacovalent phosphate intermediate and inversion of the stereo configuration at phosphorus (Figure 10) (Brody and Frey, 1981, Burgers and Eckstein, 1979, Mizuuchi et al., 1999). Metal ion A interacts with the OH group at the 3'-end of the nascent DNA chain and activates it. This interaction facilitates the attack of the hydroxyl group on the α-phosphate of the incoming dNTP (Steitz and Steitz, 1993). Both metal ions A and B help to stabilize the intermediate that occurs during phosphodiester bond formation. In the final step of the reaction, metal ion B facilitates the release of pyrophosphate and, by chelating the β and γ phosphates, stabilizes the negative charge that accumulates on the leaving oxygen (Figure 10) (Yang et al., 2006).
The catalytically essential residues are carboxylates, which act as metal ligands at the active site. The two metal ions that stabilize the transition state are usually coordinated by two conserved aspartates present at the active site, a carbonyl oxygen, and the scissile α phosphate of an incoming nucleotide (Steitz and Steitz, 1993) (Figure 10). A key element of the two-metal-ion catalysis reaction is the proper alignment of the two metal ions with regard to the conserved carboxylates and nucleic acid substrates (Yang et al., 2006). Asp seems to be preferred for coordination of both metal ions, probably because it has fewer rotamer conformations than Glu and thus it is more rigid.
Figure 10. Diagram of two metal ion-dependent phosphoryl transfer reaction. (A) Substrates. The scissile phosphate can belong to a nucleic acid or nucleotide. A water molecule or sugar hydroxyl group needs to be deprotonated (blue) and activated to become a nucleophile. (B) Pentacovalent intermediate. The two metal ions are always coordinated by a non-bridging oxygen of the scissile phosphate and a conserved Asp, which may be substituted by a phosphate (phos) in ribozymes. (C) Products. A new phosphoryl bond is formed between the nucleophile and scissile phosphate with the phosphate configuration inverted, and the 3’OH leaving group is reprotonated. Reproduced from (Yang et al., 2006).
Recent work on Polη revealed a possible intervention of a third metal ion in catalysis, probably stabilizing the transition state and facilitating the release of the product during the nucleotidyl-transfer reaction (Nakamura et al., 2013). Polη is a member of the Y family of DNA polymerases, involved in translesion synthesis through cyclobutane pyrimidine dimers (CPDs) generated by ultraviolet light. The Y family is formed by polymerases specialized in translesion synthesis (TLS), which are generally damage-specific and operate by incorporating nucleotides with high efficiency either opposite the damaged site or beyond it (Lehmann et al., 2007, Friedberg et al., 2005).
4. Limited similarities between primases and DNA polymerases
Eukaryotic primases share significant functional analogy with structurally characterized X-family DNA polymerases, which contribute to nonhomologous end-joining (Kirk and Kuchta, 1999a). The DNA polymerases of the X family are Polβ, Polλ and Polµ, involved in repair, and TdT, involved in generating variability during V(D)J recombination.
In many prokaryotes, a specific DNA primase/polymerase (PolDom) is required for NHEJ repair of DNA double-strand breaks (DSBs). PolDom is a member of the AEP superfamily that has the ability to generate template distortions and primer realignment (Pitcher et al., 2007b, Yakovleva and Shuman, 2006). Comparison of the Mycobacterium tuberculosis PolDom (Mt-PolDom) structures with that of the ternary Polλ–gapped DNA complex emphasizes that PolDom-DNA interactions are reminiscent of the contacts observed in the structure of the evolutionarily unrelated Polλ (Brissett et al., 2007).
Human cells have two DNA primases: Prim1, which operates on nuclear DNA and is a heterodimer (p49 + p58) in which p49 is the catalytic subunit, and PrimPol, the second human primase, involved in nuclear and mitochondrial replication. 3D-structure analysis of p49 reveals that eukaryotic primases maintain the conserved catalytic RRM-fold domain, but with a unique small helical subdomain not found in the archaeal and bacterial primases.
Moreover, despite significant structural differences outside the active site, human primase p49 shows functional homology to the Polλ catalytic domain (Vaithiyalingam et al., 2014).
Superposition of the catalytic centers of p49 and Polλ revealed a close structural position of the metal ions, the conserved catalytic residues and the 3′ NTP supporting the idea that the 3′ NTP represents a catalytically active conformation (Vaithiyalingam et al., 2014). In spite of these limited structural similarities, eukaryotic primases utilize a similar mechanism to DNA polymerases, but requiring a catalytically competent conformation capable of initiating dinucleotide synthesis.
Increasing evidence indicates that the catalytic activity of X- and Y-family DNA polymerases, including Polι, Polβ, Polµ and Polλ, is stimulated by physiological concentrations of Mn2+ ions (Wang et al., 1977, Blanca et al., 2003, Frank and Woodgate, 2007, Martin et al., 2013).
Notwithstanding the higher level of free Mg2+ compared to free Mn2+, the inability of Mg2+ to stimulate nucleotide binding to human primase (p49) contrasts directly with the metal-dependent nucleotide binding of bacterial primases (Csernoch et al., 1998, Schramm and Brandt, 1986, Rymer et al., 2012). Even though Mn2+ has been shown to decrease the fidelity of human primase p49, it also stimulates its activity even in the presence of Mg2+, primarily by reducing the Km for nucleotide binding, which increases the rate of both initiation and elongation in a template-dependent manner (Kirk and Kuchta, 1999b, Kirk and Kuchta, 1999a, Vaithiyalingam et al., 2014).
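Why a lower Km translates into a faster reaction at a fixed nucleotide concentration can be illustrated with a simple Michaelis-Menten rate law. This is only a sketch: treating primase as a single-substrate Michaelis-Menten enzyme is a simplifying assumption made here, and the symbols and the 10-fold change in Km are hypothetical illustration values, not data from the cited studies.

```latex
% Assumed single-substrate Michaelis-Menten rate law (simplification).
\begin{align}
  v  &= \frac{V_{\max}\,[\mathrm{NTP}]}{K_m + [\mathrm{NTP}]} \\
  % Hypothetical case: nucleotide concentration equal to the original K_m
  [\mathrm{NTP}] = K_m \;&\Rightarrow\; v = \tfrac{1}{2}\,V_{\max} \\
  % Hypothetical 10-fold reduction of K_m at the same nucleotide concentration
  K_m' = 0.1\,K_m \;&\Rightarrow\;
  v' = \frac{V_{\max}\,[\mathrm{NTP}]}{0.1\,[\mathrm{NTP}] + [\mathrm{NTP}]}
     = \frac{V_{\max}}{1.1} \approx 0.91\,V_{\max}
\end{align}
```

Under these assumptions the rate nearly doubles without any change in Vmax, which is the sense in which a reduced Km for nucleotide binding can raise the rate of both initiation and elongation at sub-saturating nucleotide concentrations.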
5. Human PrimPol
Human PrimPol is a monomeric 560-amino acid protein that has an N-terminal AEP-like catalytic domain which contains motifs A, B and C but also an extra C-terminal Zn finger-containing domain (ZnFD). The C-terminal domain of human PrimPol includes additional elements like the replication protein A (RPA)-interacting domain (Guilliam et al., 2015, Martinez-Jimenez et al., 2017).
PrimPol is the first DNA primase characterized in human cells with the ability to start DNA chains with dNTPs, and both primase and polymerase activities were shown to rely on the same active site (Garcia-Gomez et al., 2013, Bianchi et al., 2013, Wan et al., 2013). PrimPol preferentially incorporates dNTPs during its different reactions, although it is not a proficient sugar discriminator, as it can also use ribonucleotides (Garcia-Gomez et al., 2013, Martinez-Jimenez et al., 2015). PrimPol is a relatively error-prone polymerase that lacks proofreading activity and polymerizes in a distributive manner.
PrimPol can catalyze DNA synthesis using either Mg2+ or Mn2+, although it shows a clear preference for the latter, which stimulates its overall efficiency (Garcia-Gomez et al., 2013, Martinez-Jimenez et al., 2015). Different studies demonstrated that, in contrast to Mn2+, Mg2+ favors error-free bypass of 8oxoG by PrimPol, although it decreases PrimPol efficiency (Zafar et al., 2014, Garcia-Gomez et al., 2013, Martinez-Jimenez et al., 2015). Interestingly, even though Mg2+ and Mn2+ are known to be simultaneously present in the cell, the effect of combining both metal cofactors on PrimPol-mediated reactions had not been considered until now.
5.1. Human PrimPol active site and Zinc-finger domain
Some archaeal and most eukaryotic PrimPols, including human PrimPol, have a slightly different metal binding motif since the second carboxylate is not an aspartate, but a glutamate (DxE in motif A) (Iyer et al., 2005). In addition to the DxE motif (or DxD in general) and the conserved third aspartic acid, there is a specifically conserved histidine in all the AEPs, located in a β-sheet within the sxH motif (motif B), in a central position of the RRM (RNA recognition motif) (Iyer et al., 2005). In human PrimPol this residue is His169, which is conserved in all the orthologs of this new PrimPol family. The possible catalytic residues of human PrimPol, involved in metal coordination and in the interaction with the incoming nucleotide, are: Asp114 and Glu116 in motif A, His169 in motif B and Asp280 in motif C. All three motifs are located within ModC (Figure 11A, B).
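The acidic metal-binding motif described above (DxD in general, DxE in PrimPols) is simple enough to be located with a pattern search. The Python sketch below is only an illustration: the function name and the toy sequence are hypothetical, and the toy sequence is not the human PrimPol sequence. Positions are reported 1-based, following the residue numbering convention used in the text (e.g. Asp114).

```python
import re
from typing import List, Tuple

# Acidic motif as described in the text: Asp, any residue, then Asp or Glu.
# A lookahead is used so that overlapping matches are also reported.
ACIDIC_MOTIF = re.compile(r"(?=(D.[DE]))")

def find_acidic_motifs(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based position of the first Asp, matched tripeptide) pairs."""
    return [(m.start() + 1, m.group(1))
            for m in ACIDIC_MOTIF.finditer(protein.upper())]

# Toy sequence invented for illustration; NOT a real PrimPol sequence.
toy_protein = "MKDVEAADLDG"
print(find_acidic_motifs(toy_protein))  # [(3, 'DVE'), (8, 'DLD')]
```

A three-residue pattern of this kind will also match by chance elsewhere in a real protein, so hits would still need to be checked against the alignment (motif A) before being interpreted as metal ligands.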
The human PrimPol catalytically-dead double mutant AxA (D114A/E116A) proved the importance of residues Asp114 and Glu116 of the DxE motif in both primase and polymerase activities, supporting the existence of a common active site (Garcia-Gomez et al., 2013, Wan et al., 2013). Site-directed mutagenesis of some previously characterized AEPs has demonstrated that these conserved residues are essential for catalysis (Lipps, 2004, Lao-Sirieix and Bell, 2004).
Recently, the crystal structure of the catalytic core of human PrimPol has been solved in complex with a template/primer, one metal ion (Ca2+) that occupies the metal B position, and dATP as the incoming 3’-site nucleotide. Significantly, this structure lacks the second metal ion (A) at the active site, as well as important regions such as the ZnFD (Figure 11A) (Rechkoblit et al., 2016).
Figure 11. Crystal structure of human PrimPol ternary complex. (A) Ribbon representation of the overall structure of human PrimPol ternary complex with template-primer DNA and incoming dATP. The N-helix and modules ModN and ModC (dark blue, yellow, and cyan, respectively). The DNA is shown (gray sticks), the Ca2+ ion (light blue sphere), the templating base T and the incoming dATP (red) and the catalytic active-site residues Asp114, Glu116, and Asp280 (red). Yellow and cyan dashed lines depict unstructured regions in the ModN and ModC, respectively. (B) Close-up view of the PrimPol active-site region and its interactions with dATP. Adapted from (Rechkoblit et al., 2016).
PrimPol contains a Zinc Finger domain (ZnFD) that has three conserved cysteines and a histidine (Cys-His-Cys-Cys), with the potential to coordinate a Zn2+ ion to form a Zn-finger, very similar to that present in AEP primases from the herpesvirus family (Iyer et al., 2005). Zn-fingers are generally involved in the maintenance of protein integrity and/or protein-protein interactions, but also in recognizing specific template features at the initiation site of primases (Laity et al., 2001, Matthews and Sunde, 2002). Interestingly, herpes virus primase (UL52), which belongs to the same AEP subfamily as human PrimPol (Iyer et al., 2005), has been shown to require its C-terminal ZnFD for primase activity and even DNA binding (Chen et al., 2005). The ZnFD in human PrimPol was also shown to be essential for primase activity both in vivo and in vitro (Mouron et al., 2013, Keen et al., 2014b). A double point mutation in two of the Zn2+ ligands (C419G, H426Y) and a deletion mutant lacking the C-terminal domain (∆ZnFD; lacking aa 410–560) were instrumental in demonstrating that this domain is specifically involved in DNA priming by PrimPol (Mouron et al., 2013).