DEPARTAMENTO DE BIOLOGÍA MOLECULAR FACULTAD DE CIENCIAS
UNIVERSIDAD AUTÓNOMA DE MADRID
Functional characterization of CAD, an antitumoral target controlling the de novo pyrimidine biosynthesis
Francisco del Caño Ochoa
Licenciado en Biología
Thesis Director:
Dr. Santiago Ramón Maiques
Structure and Function of Macromolecular Complexes
Department of Genome Dynamics and Function
Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM)
ACKNOWLEDGMENTS
Being grateful is not only pleasing to the people the thanks are addressed to; it is also good for me, because it helps you realize what great people surround you. That is why I had to include this section in my thesis.
First of all, and it could not be anyone else, many thanks to Santiago. Thank you for counting on me from the start without even knowing me; I still remember how, on an ordinary Tuesday, you called me at about 8 in the evening to tell me that you had turned your chair to choose me as your predoc. And so it was! Thank you for helping me all these years, for working side by side with me every week and for motivating me every day to keep up my enthusiasm for my scientific career. I think you are quite a character (in the good sense), with a thousand stories, each more brilliant than the last. You have literally been my scientific “father” during these thesis years and, as with any father, I hold you in great affection and admiration.
Continuing with the family metaphor, next come my “sisters”. Many thanks to all the girls in our group, who have looked after me so well during this time; I do not know whether these words can thank you for everything you have done for me over these years. I have watched you become great mothers of little ones who bring out the best in you every day.
Many thanks to the big sister, who could be none other than Ara: always the most responsible, making sure that we are all well and that we lack nothing. Always ready to do anything, no matter what she has on her hands. Thank you for being the way you are, and for having helped me so much with the experiments and with so many other things.
The role of middle sister goes to María, since she always steps forward when someone is in trouble. When you joined the group (shortly after me) you created a great atmosphere such as I had never seen anywhere else, getting on brilliantly with everyone. I think you are the best postdoc in the whole wide world, with great ideas, super creative and ingenious, yet at the same time the most humble. I will miss you a lot now that you are leaving us, but I know that alongside Ara and Alba, and everyone at the CNIO, you will be very well looked after.
The little sister is, as could not be otherwise, Albitaaaa!!! As the representative of little sisters, you have proved that you are the most mischievous but that you can also be the most focused, the one who always sees the bright side of everything and shows us that one can go through life always with a smile. In these two years we have missed you a lot. Thanks to you, and to your insistence that he take on a boy, Santiago chose me, and so I owe you special thanks. Outside work we have done many things together; you have been my companion for the cinema, for noodles or for not-so-healthy food, among other things. Thank you for all your help, both in the lab and outside it. I hope we can see each other more often now that you are back in Madrid ;)
I could not forget my other little sister, Marija. Although we overlapped for barely a year, I think you are the most serious yet fun girl at the same time, and you always had a moment for a good laugh. I will always remember you, and I hope things go very well for you and your family in Australia!
Outside the family there are always the friends who are like brothers or sisters, and those have clearly been Chevi and Marta. Many thanks to you both, because every day in the laboratory was more fun with you, with our three or four coffee breaks (some days even more), the “sexy-science” experiments and the philosophical discussions over breakfast. Outside the lab we had great fun on those beer Thursdays, on our trips to the synchrotrons in Barcelona and Grenoble, and, not to be forgotten, at the big party of the San Fermines. I think of you often, and I know that, although we have gone our separate ways (for now), I will always be able to count on you. I do not want to forget Amaia and Raúl, of whom I am very fond and with whom we have had some great parties; I hope there are more ahead in the coming years.
Many thanks to everyone who has been around me, each of you providing both scientific and personal support on every one of the days of these five years. Many thanks to the Daniels (Daniel himself and Pilar, Johanne, Iván, Debi, Marta and Tania), to the few Guillermos I really got to know (Jaime and Darío), to the Llorcas (Carlos, Ángel, Marina, Andrés, Javi, Carmen, Adrián…), to the Rafas (Rafa, Samu and Ana) and to the Ramones (I have never called you that, but it has just occurred to me: Clara and Belén). Many thanks for all those breakfasts, coffees, dinners, birthdays and assorted celebrations that let me get to know each of you. Nor do I want to forget our future super-expert in big data, Carlos; I hope everything goes great for him. I would not forgive myself if I forgot her, hence this special mention for Jaska, whom I could always visit in the “Jaskueva” and talk to about anything.
I do not want to forget the many other colleagues who have passed through our lab (such as Carlos, Patricia, Garavito, Alessio, Fernando, Ricardo, Leo and María), who have given me so much, scientifically and personally. Nor the many colleagues I have pestered for days on end to help me with experiments, such as Vane, Diego (also Ximo and Manu) and Luque. And, of course, all my football teammates, who call me every day to play and let me score the odd goal. From the CBM, I want to thank the millennials Emilio and Enrique, who make me laugh every day at lunch, in what may be the beginning of a great friendship! (Word has emojis too). Likewise, the girls from Esteban's group, with whom we share laboratory and baking.
Finally, I want to remember my real family. Many thanks to all of them, because since I was little they have raised me with affection and responsibility so that I could get to where I am today. Many thanks to my parents, who, with so much sacrifice and work, have managed to put four children through university (three of whom may soon be doctors), which is easier said than done. To my father, who, now retired, has become a taxi driver, ferrying his children along the Priego-Córdoba-Madrid-Vitoria route. To my mother, who with her constant affection worries that we are well and, above all, that we eat well, filling the freezer with “taperwé” (Tupperware tubs) on every visit to Madrid.
Many thanks also to my siblings, with whom I still share holidays and long weekends, series and films (we have run out of Avengers, Game of Thrones and so many others…), and with whom I want to keep doing so many things together for the rest of my life. Finally, I also want to thank my uncles and aunts, my cousins, and the many other members of my big family who have always supported me.
If I have come this far, it is thanks to you, FAMILY, both the biological and the scientific one, and I want to finish by saying that family is not an important thing; it is EVERYTHING.
THANK YOU!
ABSTRACT
Pyrimidine nucleotides are essential for the synthesis of nucleic acids and for other key cellular processes. Cells obtain pyrimidines through two different metabolic pathways, depending on their developmental stage. In differentiated cells, pyrimidines are obtained mainly by recycling through salvage pathways, and de novo synthesis is low. In contrast, when cells grow and proliferate, de novo synthesis must be activated to fuel replication and the manufacture of other essential macromolecules. In animals, three of the six enzymatic activities of the de novo pathway, carbamoyl phosphate synthetase (CPS), aspartate transcarbamoylase (ATC) and dihydroorotase (DHO), are fused into a single multifunctional protein called CAD. This multienzyme protein initiates and controls the de novo synthesis of pyrimidines and is overexpressed in different types of cancer, which makes it a potential target for the development of antitumoral compounds. In recent years, our group has characterized the DHO and ATC enzymatic domains of human CAD, but beyond knowing the atomic structure and kinetic properties, it is necessary to study CAD in a cellular context to better understand how it works and to move towards designing compounds that regulate its activity and may have therapeutic value. In this thesis, we have studied the subcellular localization of CAD using fluorescent chimeras and by generating, through CRISPR/Cas9 technology, the first human CAD-knockout and GFP-CAD knock-in cell lines. Our results show that CAD is present exclusively in the cytosol and, contradicting results published by other groups, is not transported to the nucleus during the cell cycle.
Until recently, it was thought that, owing to the central role of CAD in pyrimidine synthesis, mutations compromising its activity would be lethal, which would explain why no diseases had been associated with this gene. However, it has been known since 2015 that CAD deficiency is a serious metabolic disease of early childhood, and that affected children die if they are not diagnosed in time. Until now, patients have been diagnosed by exome sequencing, with the associated difficulty of distinguishing possible pathogenic mutations from undescribed variants of the protein. Thanks to the molecular tools developed in this thesis, we set up a simple cell assay that identifies pathogenic mutations, helping in the correct diagnosis and treatment of patients. In addition, we have studied the effect of the pathogenic mutations on the structure and activity of the isolated CAD domains. These clinical mutations have helped us to discover elements that are key to the functioning of the protein. This study of the mechanisms of CAD has led us to characterize in detail a flexible loop in the DHO domain of human CAD and to describe its participation in the catalytic mechanism of the enzyme.
RESUMEN
Pyrimidine nucleotides are essential compounds for the synthesis of nucleic acids and for other key cellular processes. Cells obtain pyrimidines through two distinct metabolic routes, depending on their developmental state. In differentiated cells, pyrimidines are obtained mainly by recycling through salvage routes, and de novo pyrimidine synthesis is low. By contrast, when cells grow and proliferate, activation of de novo synthesis is necessary to feed replication and the manufacture of other essential macromolecules. In animals, three of the six enzymatic activities that make up the de novo pyrimidine synthesis route, carbamoyl phosphate synthetase (CPS), aspartate transcarbamoylase (ATC) and dihydroorotase (DHO), are fused into a single multifunctional protein named CAD. This multienzymatic protein initiates and controls the de novo synthesis route and is overexpressed in various types of cancer, which makes it a potential target for the development of antitumoral compounds. In recent years, our group has characterized the DHO and ATC enzymatic domains of human CAD, but beyond knowing the atomic structure and the kinetic properties, CAD must be studied in a cellular context in order to better understand how it functions and to advance towards the design of compounds that regulate its activity and may have therapeutic value. In the course of this thesis, we have addressed the subcellular localization of CAD using fluorescent chimeras and generating, by means of CRISPR/Cas9 technology, the first human CAD-knockout and GFP-CAD knock-in cell lines. Our results demonstrate that CAD is a protein present exclusively in the cytosol which, contradicting results published by other groups, is not transported to the nucleus during the cell cycle.
Until recently, it was thought that, because of the central role of CAD in pyrimidine synthesis, any mutation compromising its activity would have a lethal effect, which would explain why no diseases associated with this gene had been described. However, it has been known since 2015 that defects in CAD cause a serious metabolic disease in young children, who die if they are not diagnosed in time. Until now, patients have been diagnosed by exome sequencing, with the associated difficulty of distinguishing possible pathogenic mutations from undescribed variants of the protein. Thanks to the molecular tools developed in this thesis, we have set up a simple cell assay that identifies pathogenic mutations and that has helped in the diagnosis and treatment of patients. In addition, we have studied the effect of the pathogenic mutations on the structure and activity of the isolated CAD domains. The clinical mutations have thus served to uncover elements of the protein that are key to its function. This detailed study of the mechanisms of CAD has led us to characterize a flexible loop in the DHO domain of human CAD and to describe its participation in the catalytic mechanism of the enzyme.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
ABSTRACT
RESUMEN
TABLE OF CONTENTS
ABBREVIATION LIST
INTRODUCTION
1. Biochemical functions of pyrimidines
2. De novo biosynthesis of pyrimidine nucleotides
3. Evolution of the enzymes in de novo pyrimidine synthesis
4. Piece-by-piece: deciphering the structure of CAD protein
4.1 GLN and SYN, the “undisclosed” domains of CAD
4.2 The cooperative ATC domain
4.3 A DHO domain in the midst of CAD
4.4 Putting the pieces together for the pyrimidine factory
5. The controversial location of CAD
6. CAD in health and disease
OBJECTIVES
OBJETIVOS
MATERIALS AND METHODS
1. Construction of recombinant plasmids
1.1 Preparation of host plasmids encoding GFP and Cherry tags
1.2 Cloning of human and hamster CAD
1.3 Incorporation of nuclear localization signals
1.4 Introduction of clinical mutations in the GFP-CAD construct
1.5 Mutagenesis of the isolated DHO and ATC domains
2. Cell culture
2.1 Adherent cells
2.2 Suspension cells
3. Transient transfection
3.1 Transfection of adherent cells
3.2 Transfection of suspension cultures
4. Fluorescence microscopy
4.1 Sample preparation for fluorescence microscopy
4.2 Sample preparation for immunofluorescence
4.3 Sample preparation for high-throughput fluorescence microscopy
4.4 Image acquisition
4.5 Quantification of fluorescent signal
5. Cell fractionation and immunoblotting
6. Antibodies
7. CRISPR/Cas9 to knock-out CAD
7.1 Design and cloning of sgRNAs
7.2 Transfection and selection of CAD deficient cells
8. CRISPR/Cas9 to knock-in GFP
8.1 Engineering a donor plasmid for homology-directed repair
8.2 Co-transfection and selection of positive clones
9. Cell proliferation assay
10. Purification of human DHO and ATC mutants
10.1 Purification of DHO mutants
10.2 Expression and purification of ATC mutants
11. Activity assays
11.1 DHO enzymatic assay
11.2 ATC enzymatic assay
12. Crystallization
13. Data collection and structure determination
14. SEC coupled to Multi-Angle Light Scattering (SEC-MALS)
RESULTS
1. Location of CAD in the cell
1.1 Localization of recombinant fluorescent CAD
1.2 Forcing recombinant CAD into the nucleus
1.3 Localization of endogenous CAD
1.4 Generation of a CAD-knockout human cell line using CRISPR/Cas9
1.5 Generation of a fluorescent CAD knock-in cell line
2. Identification of pathogenic mutations in CAD
2.1 A growth complementation assay in CAD-KO cells
2.2 Assessing the damaging potential of CAD clinical mutations
2.3 Characterization of CAD-KO cells
3. Characterization of clinical mutations in the ATC domain
3.1 Production of human ATC clinical mutants
3.2 Characterization of the kinetic parameters of ATC clinical mutants
3.3 Crystallization and structure determination of ATC-E2128K mutant
4. Characterization of clinical mutants in the DHO domain
4.1 Producing human DHO mutants R1475Q and K1482M
4.2 Structural characterization of DHO mutants
4.3 Characterization of the kinetic parameters of DHO mutants
5. Characterization of the catalytic flexible loop in the DHO domain of CAD
5.1 A human DHO chimera with the flexible loop of E. coli DHO is inactive
5.2 A distinctive Phe in the flexible loop of DHO domain is key for catalysis
5.3 Structural characterization of the DHO F1563 mutants
DISCUSSION
1. CAD localizes exclusively in the cytoplasm
2. Editing the CAD locus with the CRISPR/Cas9 system
3. A “rapid” growth complementation assay to assess the disease-causing potential of CAD missense mutations
4. Understanding the molecular mechanism of CAD-pathogenic mutations
4.1 Mutations in CAD’s DHO domain
4.2 Mutations in CAD’s ATC domain
4.3 Pathogenic mutations in CPS-2
5. The importance of the flexible loop in the DHO domain
6. Final remarks
CONCLUSIONS
CONCLUSIONES
REFERENCE LIST
ABBREVIATION LIST
Asp Aspartate
ATC Aspartate transcarbamoylase
ATP Adenosine triphosphate
BHK21 Baby hamster kidney cell line
BSA Bovine serum albumin
CA-asp Carbamoyl aspartate
CAD Carbamoyl-phosphate synthetase II, Aspartate transcarbamoylase and Dihydroorotase
CAD-KO CAD-knockout cell line
CDP Cytidine diphosphate
CHO Chinese hamster ovary cell line
CPS Glutamine-dependent carbamoyl-phosphate synthetase
CRISPR Clustered regularly interspaced short palindromic repeats
DBS Double strand breaks
DHO Dihydroorotase
DHODH Dihydroorotate dehydrogenase
DMEM Dulbecco’s Modified Eagle’s medium
DMSO Dimethyl sulfoxide
EDTA Ethylenediaminetetraacetic acid
EdU 5-ethynyl-2′-deoxyuridine
EFL E. coli flexible loop chimera
FACS Fluorescence-activated cell sorting
FBM Fetal bovine macroserum
FBS Fetal bovine serum
FDA US Food and Drug Administration
FOA Fluoroorotic acid
G9C CHO-derived cell line deficient in CAD protein
GFP Green fluorescent protein
GFP-CAD KI CAD fluorescent GFP knock-in cell line
GLN Glutaminase activity of CPS
GST Glutathione S-transferase
H2Ax H2A histone family member X
HEK293 Human embryonic kidney cell line
HeLa Human cervix adenocarcinoma cell line
IMP Inosine 5′-monophosphate
kcat Turnover rate of the reaction
KD Dissociation constant
Ki Inhibitory constant
KM Michaelis-Menten constant
LB Luria Bertani growth medium
MAPK Mitogen-activated protein kinase
MBP Maltose binding protein
MD Molecular dynamics
mTORC Mechanistic target of rapamycin complex
MW Molecular weight
NAG N-acetyl-D-glucosamine
NES Nuclear export signal
NLS Nuclear localization signal
OMP Orotidine 5′-monophosphate
OMPDC OMP decarboxylase
OPRT Orotate phosphoribosyltransferase
PAGE Polyacrylamide gel electrophoresis
PALA N-phosphonacetyl-L-aspartate
PAM Protospacer adjacent motif
PBS Phosphate-buffered saline
PCR Polymerase chain reaction
PDB Protein Data Bank
PEI Polyethylenimine
Pi Inorganic phosphate
PKA cAMP-dependent protein kinase A
PRPP 5-phosphoribosyl-1-pyrophosphate
PVDF Polyvinylidene difluoride
RMSD Root-mean-square deviation
RPM Revolutions per minute
SDS Sodium dodecyl sulphate
SEC Size exclusion chromatography
SEC-MALS SEC coupled to multi-angle light scattering
sgRNA Small-guide RNA
SYN Carbamoyl-phosphate synthetase activity of CPS
TCEP Tris(2-carboxyethyl)phosphine
U2OS Human bone osteosarcoma cell line
UDP Uridine diphosphate
UMP Uridine 5′-monophosphate
UMPS UMP synthetase
UTP Uridine triphosphate
Vmax Maximum rate of reaction
WT Wild type
INTRODUCTION
1. Biochemical functions of pyrimidines
Pyrimidines are heterocyclic aromatic compounds with two nitrogen atoms at positions 1 and 3 of the six-membered ring (Figure 1). The name “pyrimidine” immediately brings to mind the three major bases, uracil, thymine and cytosine, which together with the purine bases make up the building blocks of RNA and DNA molecules. However, pyrimidines also occur as free nucleotides (e.g. uridine 5′-monophosphate, UMP) and nucleosides (e.g. uridine) in all living tissues, where they are not only nucleic acid precursors but also function in their own right as allosteric regulators and signaling molecules (Anderson and Parkinson, 1997; Brown, 1998). In addition, uridine diphosphate (UDP) and cytidine diphosphate (CDP) play prominent roles as activators and carriers of intermediate metabolites (Figure 1). The nucleotide sugar UDP-glucose is an activated form of glucose used by glucosyltransferases for the interconversion of glucose and galactose, for the biosynthesis of polysaccharides (e.g. glycogen and cellulose), and for the glycosylation of proteins (Bulter and Elling, 1999). Other pyrimidine-activated sugars include UDP-uronic acids (e.g. UDP-glucuronic acid) and UDP-aminosugars (e.g. UDP-N-acetylglucosamine), which are used for making proteoglycans, glycolipids and mucopolysaccharides. Just as UDP serves as the sole carrier of sugars, CDP appears to have been selected by nature for transporting alcohols during the metabolism of phospholipids. Thus, the activated compounds CDP-choline and CDP-ethanolamine are used for the transformation of diacylglycerol into phosphatidylcholine and phosphatidylethanolamine, whereas CDP-diacylglycerol is an intermediate in the synthesis of phosphatidylinositol and of cardiolipin (an important component of the inner mitochondrial and bacterial membranes) (Carrasco and Merida, 2007).
Figure 1. Biological importance of pyrimidines. Schematic representation of pyrimidine-containing molecules, with their chemical structure and possible biological functions.
Pyrimidines are also part of the structure of a number of natural compounds with diverse biological activity, ranging from vitamins (B1, thiamine; B2, riboflavin; B9, folic acid) to plant toxins (e.g. vicine) and antivirals (e.g. 2-thiouracil) (Figure 1). In addition, a variety of synthetic pyrimidine derivatives are important for the treatment of different diseases (Jain et al., 2016). Most of these compounds are related to the endogenous substrates that they antagonize, and are used as sedatives and anticonvulsants (e.g. barbituric acid), wide-spectrum antibiotics (e.g. bacimethrin), antifungal agents (e.g. flucytosine), antiviral agents (e.g. lamivudine or Retrovir for AIDS treatment), analgesics (e.g. epirizole) or antitumoral drugs (e.g. 5-fluorouracil, 6-azauridine) (Figure 1). In agriculture, several pyrimidines have also found wide application as highly effective broad-spectrum herbicides with low mammalian toxicity (e.g. terbacil), fungicides (e.g. dimethirimol) or horticultural insecticides (e.g. pirimicarb).
2. De novo biosynthesis of pyrimidine nucleotides
Pyrimidine nucleotides are among the most crucial components for cell growth, development and metabolism. Cells require a constant supply of pyrimidines, which can be obtained by two alternative pathways (Jones, 1980) (Figure 2). Pyrimidines can be built de novo (from scratch) from ammonia, bicarbonate, aspartate, ATP and 5-phosphoribosyl-1-pyrophosphate (PRPP). Alternatively, the cell employs salvage pathways to reuse nucleosides (e.g. uridine) or pyrimidine bases (e.g. uracil) released by the intracellular breakdown of nucleic acids (pyrimidine catabolism) or acquired from the diet. The relative contribution of the de novo and salvage pathways depends on cell type and developmental stage (Evans and Guy, 2004). In general, it is assumed that non-growing cells maintain their pool of nucleotides by salvage pathways and that de novo synthesis is low, whereas in proliferating cells the large demand for pyrimidines is met by de novo synthesis. Indeed, the de novo biosynthesis of pyrimidines is invariably up-regulated in tumors and neoplastic cells (Aoki and Weber, 1981; Swyryd et al., 1974).
Figure 2. Schematic outline of the de novo and salvage pathways for the biosynthesis of pyrimidine nucleotides in animals. The six sequential steps for de novo synthesis of UMP are framed on a grey background. In eukaryotes, the fourth reaction occurs inside the mitochondria. Molecules are depicted in different colors to indicate the origin of the atoms in the UMP nucleotide. Alternatively, UMP can be recycled from uracil or uridine through salvage pathways, shown on a blue background. The numerous reactions that transform UMP into the different pyrimidine nucleotides are represented on a brownish background, with the final fates of the different nucleotides shown in pink.
The de novo pyrimidine biosynthesis pathway is usually defined as six sequential enzymatic reactions leading to the formation of UMP. Figure 2 depicts this process as an assembly line, where the product of one enzyme acts as the substrate of the next, and the semi-finished intermediates move somehow between active centers. This metabolic pipeline starts with the synthesis of carbamoyl phosphate (CP), a labile, high-energy phosphate metabolite that is also used for the biosynthesis of arginine (Jones, 1980; Shi et al., 2018). CP is made from bicarbonate, ammonia (obtained from glutamine hydrolysis) and two molecules of ATP, in a four-step reaction catalyzed by the enzyme CP synthetase (CPS; EC 6.3.5.5). CPS is a complex machine that combines two enzymatic activities, a glutamine-dependent amidotransferase (GLN) and a synthetase (SYN)¹ (Figure 2). In the second step, CP and aspartate (Asp) are used by the enzyme aspartate transcarbamoylase (ATC; EC 2.1.3.2) to produce carbamoyl aspartate (CA-asp) and inorganic phosphate (Pi). Next, the enzyme dihydroorotase (DHO; EC 3.5.2.3) catalyzes the condensation of CA-asp to dihydroorotate, delivering the first cyclic precursor of the nucleotide ring. Dihydroorotate is then oxidized to orotate by dihydroorotate dehydrogenase (DHODH; EC 1.3.5.2), an enzyme that in eukaryotes is tethered to the outer face of the inner mitochondrial membrane and couples pyrimidine synthesis to the respiratory chain (Figure 2) (Jones, 1980; Loffler et al., 1997). Orotate is a complete pyrimidine base and is attached to the activated ribose PRPP by the enzyme orotate phosphoribosyltransferase (OPRT; EC 2.4.2.10), producing the nucleotide orotidine 5′-monophosphate (OMP). Lastly, the enzyme OMP decarboxylase (OMPDC; EC 4.1.1.23) converts OMP to UMP, which is further phosphorylated to UDP and UTP by different kinase activities. It is from these nucleotides that the other common pyrimidine derivatives arise through a number of phosphorylation and conversion reactions (Figure 2).
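The assembly-line picture of the six reactions can be written down as a small data structure. The Python sketch below is only an illustration: the `Step` tuple and the helpers `check_chain` and `cad_steps` are hypothetical names introduced here, and the substrate lists are simplified to the metabolites named in the text (cofactors, such as the electron acceptor of DHODH, are omitted).

```python
from typing import NamedTuple


class Step(NamedTuple):
    """One reaction of the pathway (hypothetical helper type)."""
    enzyme: str
    ec: str
    substrates: tuple
    product: str


# The six sequential reactions of de novo UMP synthesis, in the order
# given in the text. Substrates are simplified (assumption: cofactors
# and released by-products are left out).
PATHWAY = (
    Step("CPS", "6.3.5.5", ("bicarbonate", "glutamine", "ATP"), "carbamoyl phosphate"),
    Step("ATC", "2.1.3.2", ("carbamoyl phosphate", "aspartate"), "carbamoyl aspartate"),
    Step("DHO", "3.5.2.3", ("carbamoyl aspartate",), "dihydroorotate"),
    Step("DHODH", "1.3.5.2", ("dihydroorotate",), "orotate"),
    Step("OPRT", "2.4.2.10", ("orotate", "PRPP"), "OMP"),
    Step("OMPDC", "4.1.1.23", ("OMP",), "UMP"),
)


def check_chain(pathway):
    """Return True if the product of every step is a substrate of the next."""
    return all(a.product in b.substrates for a, b in zip(pathway, pathway[1:]))


def cad_steps(pathway):
    """Enzymes of the first three steps, the activities fused in animal CAD."""
    return [step.enzyme for step in pathway[:3]]
```

Calling `check_chain(PATHWAY)` confirms that the list really forms an unbroken chain from bicarbonate to UMP, and `cad_steps(PATHWAY)` picks out the three activities (CPS, ATC and DHO) that give CAD its name.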
3. Evolution of the enzymes in de novo pyrimidine synthesis
Although the six enzymatic steps for de novo biosynthesis of UMP are evolutionarily conserved in all living organisms, the organization, regulation and subcellular localization of the enzymatic activities involved in the pathway vary greatly among prokaryotes, plants, fungi and animals (Davidson et al., 1993; Evans and Guy, 2004; Jones, 1980) (Figure 3).
In most prokaryotes and plants, the first three reactions of the pathway are catalyzed by three distinct enzymes that work independently or form more or less transient complexes (Jones, 1980) (Figure 3). In these organisms, a single CPS, formed by the non-covalent association of two protein subunits with GLN and SYN activities, makes CP for the biosynthesis of both pyrimidines and arginine. Therefore, ATC becomes the first committed enzyme for de novo synthesis of pyrimidines, and in some bacteria, like Escherichia coli, ATC is under strict allosteric control, being feedback inhibited by the pyrimidine nucleotides UTP and CTP and activated by ATP (Allewell, 1989; Jacobson and Stark, 1973). E. coli ATC is among the best characterized allosteric enzymes, and has become a textbook example and a paradigm of allosteric and cooperative mechanisms in proteins (Lipscomb, 1994). The enzyme is formed by two trimers of catalytic subunits linked by three dimers of regulatory subunits, where nucleotide effectors bind, inducing large conformational changes in the holoenzyme. Other, non-regulated bacterial ATCs are less well understood and consist of catalytic trimers without regulatory subunits that function independently or form non-covalent complexes with DHO (Ahuja et al., 2004; Zhang et al., 2009). In plants, on the other hand, ATC is a trimeric enzyme localized in the chloroplast, with no regulatory subunits, yet feedback regulated by direct binding of UMP to an uncharacterized pyrimidine-binding site (Cole and Yon, 1984). Thus, in bacteria seven genes encode the six enzymatic activities for de novo pyrimidine biosynthesis (GLN and SYN are considered a single CPS activity), whereas in plants there are only six genes, since the OPRT and OMPDC activities are fused into a bi-functional protein named UMP synthetase (UMPS) (Figure 3).
¹ To avoid confusion, we will refer to the CPS synthetase activity as “SYN” (Davidson et al., 1993), although in the literature the synthetase subunit/domain of CPS is often also named “CPS”.
The scenario is more puzzling in other organisms, where the enzymatic activities appear highly organized. Early studies indicated that in simple eukaryotes, such as Neurospora or Saccharomyces, two CPSs provide distinct intracellular CP pools for pyrimidine and arginine synthesis (Figure 3). The arginine-specific CPS, formed as in bacteria by the non-covalent association of two proteins with GLN and SYN activities, is located in the mitochondria (Lacroute et al., 1965), whereas the pyrimidine-specific CPS and ATC activities are present in a single bi-functional protein encoded by the gene pyr-3 in Neurospora (Williams et al., 1970; Williams and Davis, 1970) or ura2 in S. cerevisiae (Lue and Kaplan, 1969). This bi-functional protein also contains an inactive DHO-like domain, and these organisms have a monofunctional DHO enzyme encoded by an independent gene (ura4 in S. cerevisiae) (Denis-Duphil, 1989; Souciet et al., 1989) (Figure 3). Thus, in fungi, only five genes code for the six enzymatic activities for UMP synthesis.
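The gene counts quoted for bacteria, plants and fungi follow from grouping the activities by the polypeptide that carries them. The Python sketch below illustrates that bookkeeping under stated assumptions: the `ORGANIZATION` layout and the helpers `gene_count` and `activity_count` are hypothetical names introduced here, the groupings are taken from the text, and accessory chains (such as the regulatory subunit of E. coli ATC) and the arginine-specific CPS of fungi are left out, as in the text's own tally.

```python
# Each inner tuple stands for one gene product; its members are the
# activities that product carries. Groupings follow the text; the data
# layout itself is an illustrative assumption.
ORGANIZATION = {
    # GLN and SYN are separate subunits; every other activity has its own gene.
    "bacteria": (("GLN",), ("SYN",), ("ATC",), ("DHO",),
                 ("DHODH",), ("OPRT",), ("OMPDC",)),
    # As in bacteria, except that OPRT and OMPDC are fused into UMPS.
    "plants": (("GLN",), ("SYN",), ("ATC",), ("DHO",),
               ("DHODH",), ("OPRT", "OMPDC")),
    # Pyrimidine-specific CPS (GLN + SYN) and ATC share one polypeptide.
    "fungi": (("GLN", "SYN", "ATC"), ("DHO",), ("DHODH",),
              ("OPRT",), ("OMPDC",)),
}


def gene_count(organism):
    """Number of genes needed for UMP synthesis in the given group."""
    return len(ORGANIZATION[organism])


def activity_count(organism):
    """Number of enzymatic activities, counting GLN + SYN as a single CPS."""
    activities = {a for product in ORGANIZATION[organism] for a in product}
    if {"GLN", "SYN"} <= activities:
        activities = (activities - {"GLN", "SYN"}) | {"CPS"}
    return len(activities)
```

With this layout, `gene_count` gives seven genes for bacteria, six for plants and five for fungi, while `activity_count` stays at six in every case, which is the point the text makes: the chemistry is conserved and only its packaging into genes changes.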
Similarly, in animals, distinct mitochondrial and cytosolic CPSs make CP pools for arginine and pyrimidine synthesis, respectively (Jones, 1980) (Figure 3). In most terrestrial vertebrates, the arginine-specific CPS (CPS-1; EC 6.3.4.16), formed by the fusion of an inactive GLN domain and a SYN domain, is expressed only in the mitochondria of hepatocytes, requires N-acetyl-L-glutamate (NAG) as a cofactor (Jones, 1980; Rubio et al., 1981) and is used to detoxify free ammonia directly through the urea cycle (Stewart and Walser, 1980). On the other hand, early studies proved the existence of a pyrimidine-specific CPS (CPS-2) that co-purified with the ATC activity in some sort of cytosolic complex (Hoogenraad et al., 1971). Unlike in fungi, this complex was shown to also contain DHO activity (Shoaf and Jones, 1971), leading M. E. Jones to postulate that the first three enzymes for the de novo biosynthesis of pyrimidines in mammals could form a stable complex or be part of a single multienzymatic protein (Jones, 1980). This association was named CAD after the first letters of the distinct enzymatic activities (CPS, ATC and DHO), and was subsequently found in every animal species investigated, from Dictyostelium to human (Evans, 1986; Jones, 1980).
Figure 3. Organization of the enzymes involved in de novo pyrimidine synthesis. Schematic representation of the six enzymatic activities needed for de novo pyrimidine biosynthesis of UMP in eubacteria, plants, fungi and animals. The rectangles represent individual proteins or domains within multienzymatic proteins, such as CAD, which fuses the CPS-2, ATC and DHO activities. A second arginine-specific CPS, present in the mitochondria of fungi and animals, is also shown. The allosteric regulation of CPS and ATC activities by activators (in blue) and inhibitors (in red) is indicated. In some bacterial ATCs, an additional regulatory subunit, depicted as a circle, is responsible for the binding of the allosteric effectors. In mammals, CAD is also controlled by phosphorylation at different sites, depicted as red circles.
The nature of the CAD multienzymatic association was not unveiled until it was purified to homogeneity by the group of G. Stark from hamster cells treated with the drug PALA (N-phosphonacetyl-L-aspartate). PALA was synthesized as a potent inhibitor of E. coli ATC that combines features of both enzyme substrates (CP and Asp) and was postulated to mimic the transition state of the reaction (Collins and Stark, 1971). PALA was also effective in inhibiting the growth of mouse tumors, proving for the first time that de novo pyrimidine synthesis is required for cell proliferation (Swyryd et al., 1974; Yoshida et al., 1974).
However, some of the cells developed resistance to the inhibitory effect of PALA by overproducing more than 100 times the original activity of the three-enzyme complex (Kempe et al., 1976). When the complex was isolated from PALA-resistant Chinese hamster ovary (CHO) cells, the three catalytic activities were found integrated within a single polypeptide of ~250 kDa that self-associated in a mixture of oligomeric forms, mostly trimers and hexamers (Coleman et al., 1977; Kempe et al., 1976). This physical association between the three enzymes was confirmed by combined genetic, biochemical and immunological approaches using CAD defective mutants in CHO cells (Davidson and Patterson, 1979).
The attempts to purify CAD revealed its extraordinary susceptibility to proteolytic cleavage, which allowed the isolation of different protein fragments retaining specific enzymatic activities (Davidson et al., 1981; Evans, 1986; Kim et al., 1992; Mally et al., 1981).
The analysis of the proteolytic fragments and of mutants in the CAD locus in Drosophila (rudimentary gene) provided strong evidence of a simple domain structure, where enzymatic activities were present in CAD as different functional domains connected by more or less unstructured linker regions. The correct order of the domains, GLN-SYN-DHO-ATC (Figure 3), was not confirmed until the genes from different organisms were fully sequenced [Drosophila (Freund and Jarry, 1987); hamster (Shigesada et al., 1985); human (Davidson et al., 1990); Dictyostelium (Faure et al., 1989); and S. cerevisiae (Denis-Duphil, 1989;
Souciet et al., 1989)]. In addition, a number of gene dissection experiments proved that fragments of CAD could complement E. coli deficient in CPS, DHO or ATC activities, indicating that the isolated enzymatic CAD domains were functional and topologically independent (Davidson et al., 1993).
As in plants, the OPRT and OMPDC activities in animals are also fused into a single bi-functional UMPS protein (Figure 3) (Jones, 1980; Traut and Jones, 1979). Thus, in animals, two cytosolic multienzymatic proteins, CAD and UMPS, are responsible for catalyzing five out of six steps for UMP synthesis, while as already mentioned, the reaction catalyzed by the dehydrogenase occurs inside the mitochondria (Jones, 1980; Shoaf and Jones, 1971) (Figure 2).
The evolutionary diversity shown in Figure 3 is difficult to reconcile with the view of a metabolic assembly line formed by enzymes functioning as autonomous catalytic units (Figure 2). Enzymes work more efficiently when assembled into complexes that favor the communication, coordination and interdependence of the different activities and have the potential to express unique catalytic and regulatory properties (Gaertner, 1978). Likely, the most effective of these associations is the covalent linkage of the different enzymes into a
multifunctional conjugate such as CAD, which ensures an equimolar production of the enzymatic activities acting sequentially in the pathway, their proximity and co-localization in the cell (Davidson et al., 1993). In the past years, the main research objective of our group has been to investigate the architecture and functioning of CAD and to explain how this association helps to make pyrimidines in a more efficient manner.
4. Piece-by-piece: deciphering the structure of CAD protein
Although bacterial CPS, ATC and DHO activities have been well characterized both biochemically and structurally, the large size of CAD and its high sensitivity to proteases have so far hampered all attempts to decipher its structure. Until recently, it was only known that CAD self-assembles into hexameric particles of ~1.5 MDa (nearly half the size of a ribosome) (Coleman et al., 1977; Lee et al., 1985), but there was no detailed information about CAD or about any of its functional domains that could shed light on the architecture and functioning of the multienzymatic protein. However, during the course of this Thesis, our group succeeded in determining the crystal structures of the isolated DHO and ATC domains of human CAD (Grande-Garcia et al., 2014; Lallous et al., 2012; Ruiz-Ramos et al., 2013; Ruiz-Ramos et al., 2016). Although not included in the results of this Thesis, I contributed to the characterization of these domains and to understanding how they organize into larger assemblies (Moreno-Morcillo et al., 2017). In addition, during this period, the group of Vicente Rubio (Instituto de Biomedicina de Valencia, IBV-CSIC) also determined the crystal structure of human CPS-1 (de Cima et al., 2015), demonstrating the evolutionary conservation with E. coli CPS, the only other enzyme of this kind for which the crystal structure had been determined (Thoden et al., 1997; Thoden et al., 1999b). Based on the structures of human CPS-1 and E. coli CPS and on sequence similarities, we started to envision how the structurally uncharacterized CPS-2 activity of CAD could function. In the next sections, I summarize some relevant data that bring us closer to understanding the architecture and functioning of CAD.
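As a quick consistency check on the figures quoted so far (a polypeptide of ~250 kDa and hexameric particles of ~1.5 MDa), the short Python sketch below multiplies the protomer mass by the number of subunits. It assumes identical subunits and ignores bound ligands or modifications; the function name is only illustrative.

```python
# Back-of-the-envelope check of the oligomer masses quoted in the text.
# Assumption: every protomer weighs ~250 kDa and nothing else adds mass.
PROTOMER_KDA = 250  # approximate mass of one CAD polypeptide (from the text)


def oligomer_mass_kda(n_subunits: int, protomer_kda: float = PROTOMER_KDA) -> float:
    """Return the expected mass (kDa) of an oligomer of identical subunits."""
    if n_subunits < 1:
        raise ValueError("an oligomer needs at least one subunit")
    return n_subunits * protomer_kda


if __name__ == "__main__":
    for name, n in (("monomer", 1), ("trimer", 3), ("hexamer", 6)):
        mass = oligomer_mass_kda(n)
        print(f"{name:8s} {mass:6.0f} kDa = {mass / 1000:.2f} MDa")
```

Under these assumptions, six protomers of ~250 kDa account for the ~1.5 MDa reported for the hexamer.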
4.1 GLN and SYN, the “undisclosed” domains of CAD
CPSs are truly remarkable and large protein machineries dedicated to the synthesis of the small and highly unstable CP molecule. These enzymes share a common reaction mechanism (Figure 4A), involving three discrete chemical steps with channeling of highly reactive and unstable intermediates: carboxyphosphate, ammonia and carbamate (Anderson and Meister, 1965; Meister, 1989). In a first reaction, a molecule of bicarbonate is phosphorylated at the expense of one ATP molecule to form carboxyphosphate and ADP (Rubio and Grisolia, 1977). Next, a molecule of ammonia released by hydrolysis of glutamine or obtained directly (urea cycle CPS-1) reacts with carboxyphosphate to form carbamate and
inorganic phosphate. In a final step, carbamate is phosphorylated by consuming a second ATP molecule to form CP.
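Written out as a scheme, the steps just described add up as follows for a glutamine-dependent CPS. This is only a summary sketch: species are named rather than charge-balanced, the `align*` environment assumes the amsmath package, and the net line is simply the sum of the four lines above it.

```latex
\begin{align*}
\text{HCO}_3^- + \text{ATP} &\rightarrow \text{carboxyphosphate} + \text{ADP} \\
\text{Gln} + \text{H}_2\text{O} &\rightarrow \text{Glu} + \text{NH}_3 \\
\text{carboxyphosphate} + \text{NH}_3 &\rightarrow \text{carbamate} + \text{P}_i \\
\text{carbamate} + \text{ATP} &\rightarrow \text{CP} + \text{ADP} \\[4pt]
\text{net:}\quad 2\,\text{ATP} + \text{HCO}_3^- + \text{Gln} + \text{H}_2\text{O}
  &\rightarrow \text{CP} + 2\,\text{ADP} + \text{P}_i + \text{Glu}
\end{align*}
```

For the ammonia-dependent CPS-1, the second line is skipped and free ammonia enters directly at the third step.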
All known CPSs present substantial sequence homology. In particular, the CPS-2 domain of human CAD shares 40% and 51% sequence identity with E. coli CPS and human CPS-1, respectively (Figure 4B,C) (de Cima et al., 2015; Holden et al., 1998; Meister, 1989;
Raushel et al., 1998; Thoden et al., 2004; Thoden et al., 1999a, 2002; Thoden et al., 1998;
Thoden et al., 1999b; Thoden et al., 1999c; Thoden et al., 1999d). This sequence similarity and the structural resemblance between the evolutionarily distant E. coli CPS and human CPS-1 proteins strongly suggest that all CPSs, including CAD's CPS-2, will share a similar multidomain architecture.
As mentioned earlier, CPS is composed of two parts, a ~40 kDa GLN moiety that delivers ammonia and a ~120 kDa SYN moiety that catalyzes the three-step reaction for CP synthesis. The GLN and SYN activities exist either as different subunits, such as the small (GLN) and large (SYN) subunits of E. coli CPS, or are connected by a short linker within a single polypeptide as in the ammonia-dependent CPS-1 or as in CAD.
GLN exhibits a globular shape divided into an N-terminal ~16 kDa lobe (named S1) and a C-terminal ~25 kDa lobe (S2) characteristic of the class I family of amidotransferases (Nyunoya and Lusty, 1984) (Figure 4A-C). These glutaminases share a common reaction mechanism that involves the formation of a glutamyl-thioester intermediate between glutamine and a catalytic Cys (Thoden et al., 1999a). A conserved His is also important to activate the Cys for nucleophilic attack. In E. coli CPS, the catalytic residues, C269 and H353, locate at the S2 lobe, and the active site is formed at the interface with the S1 lobe (Thoden et al., 1999a). In contrast, in human CPS-1, the catalytic Cys is replaced by S294, explaining why this domain is inactive (Simmer et al., 1990; Thoden et al., 1999b). In CAD, on the other hand, the GLN catalytic Cys and His residues are conserved (C252 and H336 in human CAD) and thus, it is expected to share a common catalytic mechanism with E. coli CPS. Indeed, the isolated GLN domain of hamster CAD, produced recombinantly in bacteria, has been shown to be active when mixed in stoichiometric amounts with the large subunit of E. coli CPS (Guy and Evans, 1994). It is expected that in CAD, similarly to E. coli CPS and human CPS-1, both the S1 and S2 lobes participate in an extensive interaction with the SYN domain. This interaction has been shown to increase the stability of both domains (Cervera et al., 1996; Cervera et al., 1993) and somehow ensures the synchronization and enhancement of their activities (Hewagama et al., 1999), likely to avoid wasteful hydrolysis of glutamine or ATP if the other substrates of the reaction are not available.
Figure 4. Structural conservation between CPSs from E. coli and human CPS-1. (A) Schematic representation of the partial reactions catalyzed by each domain within the protein. (B,C) Cartoon representation of one subunit of the tetramer of E. coli CPS (B) or one subunit of the dimer of human CPS-1 (C), with the different domains colored as in (A). Substrates and allosteric effectors are represented as spheres in different colors. The internal tunnel connecting the three active sites is represented as cyan mesh. (D) Superposition of L1 and L3 domains bound to AMPPNP, indicating the three lobes of the ATP-grasp fold. (E) Detailed view of the superposition of L3 domain. ADP and inorganic phosphate (Pi) are shown as green sticks; potassium and divalent metal ions (M2+) are shown as blue or pink spheres.
The SYN moiety, on the other hand, is divided into four structural units labelled L1–4 (Figure 4A). L1 and L3 correspond to two equivalent phosphorylation domains (Britton et al., 1979), which share 40% sequence identity and probably arose by a gene duplication and fusion event (Nyunoya and Lusty, 1983). The crystal structures proved that these two synthetic components form a pseudo-homodimer with nearly exact twofold rotational symmetry and are topologically equivalent although not identical, as expected based on their different substrates and interactions with the rest of the protein (Figure 4C,D) (Thoden et al., 1997). Both synthetic components are structured in an “ATP-grasp” fold, with three lobes surrounding (“grasping”) the nucleotide (Fawaz et al., 2011) (Figure 4D). The active site, located between the B- and C-lobes, is virtually identical in E. coli CPS and human CPS-1, and the residues interacting with the nucleotide and metal ions are also predicted to be conserved in CAD (Figure 4E).
The synthetic L1 and L3 domains are linked by a ~20 kDa L2 domain that has received different names. In E. coli CPS, L2 participates in the formation of tetramers with similar catalytic properties as the functional heterodimers (Figure 4B). Although the interaction between the L2 domains is relatively small, as expected from the ready interconversion of monomers and tetramers (Anderson, 1986), this L2 region was named the “oligomerization domain” (Thoden et al., 1997). Human CPS-1, on the other hand, exists in a monomer-dimer equilibrium in solution but there is no evidence of formation of tetramers. Thus, L2 received the alternative name of the “integrating domain”, for its role in embracing the L1 domain and for connecting the L1 and L3 domains with the GLN moiety (Figure 4A) (de Cima et al., 2015).
The most C-terminal region in the SYN domain, L4, holds the binding site for the allosteric regulators (Figure 4A) (Cervera et al., 1996; Czerwinski et al., 1995; de Cima et al., 2015; Rodriguez-Aparicio et al., 1989; Rubio et al., 1991; Thoden et al., 1999c). Despite having very different effector molecules (Figure 3), the crystal structures of E. coli CPS and human CPS-1 have shown that the allosteric domains exhibit a similar fold and bind UMP, IMP and NAG in an equivalent surface pocket. It has been proposed that the 20 kDa C-terminal region of the CPS-2 domain of CAD also holds the allosteric domain. This conclusion was based on the analogy with the other CPS enzymes and on the fact that the UTP and PRPP allosteric effect on CAD is lost when this region is deleted (Liu et al., 1994; Sahay et al., 1998).
Last but not least, the E. coli CPS structure revealed an exceptional feature: the GLN active site was 45 Å away from the bicarbonate phosphorylation site, and this one was 35 Å apart from the phosphorylation site for carbamate (Thoden et al., 1997), meaning that the unstable reaction intermediates, ammonia, carboxyphosphate and carbamate, had to be shuttled between the distant active centers without exposure to the bulk solvent. The
structure of E. coli CPS revealed the existence of a narrow tunnel (>90 Å long) running through the interior of the protein and connecting all three active sites (Figure 4B). As expected, this reaction tunnel was also found in the structure of human CPS-1. Moreover, the structures of human CPS-1 free or bound to NAG showed that the binding of the co-factor favors the correct formation of the tunnel, thus explaining, at least in part, the mechanism of allosteric activation (de Cima et al., 2015). In human CPS-1, many of the residues that in E. coli CPS line the ammonia channel are conserved, but the Gly at the exit of the channel at the interface between the GLN and SYN domains is replaced by a Gln (Q318) that blocks the path (de Cima et al., 2015). Thus, in CPS-1, external ammonia enters the enzyme through a different pathway from that described in E. coli CPS (Figure 4C). It has been proposed that this alternative entry could also exist in other CPSs, including CPS-2 of CAD, since they are active with external ammonia, although only at high concentration (Kim et al., 2017).
4.2 The cooperative ATC domain
The arrangement of the domains along the CAD polypeptide does not follow the order of the reactions in the de novo pathway. ATC catalyzes the next step after CPS, yet it is found at the C-terminal end of the multifunctional protein (Figures 2 and 3). Our group determined the crystal structure of the ATC domain of human CAD, the first eukaryotic ATC to be structurally characterized (Ruiz-Ramos et al., 2013; Ruiz-Ramos et al., 2016). As predicted (Scully and Evans, 1991), the domain shows high similarity with the catalytic subunits of bacterial ATCs. The enzyme is a homotrimer with equilateral triangular appearance, and with three active sites located in between the subunits (Figure 5A,B). Each subunit is divided into an N-terminal domain and a C-terminal domain of similar size, both structured by a central β-sheet of five parallel strands flanked by α-helices (Figure 5C). The active site is located at the cleft between the N- and C-domains with participation of a loop (CP-loop) from the adjacent subunit. The N-terminal domain provides most of the contacts with the other subunits and holds the binding site for CP, whereas the C-domain occupies an external position in the trimer and provides the binding site for Asp. There are two mobile loops, the CP-loop at the N-domain and the Asp-loop at the C-domain, that are flexibly disordered and get rearranged upon substrate binding (Figure 5C).
ATC catalyzes an ordered reaction, with CP binding before Asp, and CA-asp leaving before phosphate (Collins and Stark, 1969; Porter et al., 1969). CP binding at the N-domain induces the positioning of the CP-loop from the adjacent subunit at the active site, and promotes a partial hinge-closure of the C-domain. These conformational changes favor the binding of Asp to a contiguous pocket in the active site. Binding of both substrates induces a further approximation between the N- and C-domains and a rigid body rotation of the Asp-
loop that closes the active site (Figure 5C). This conformation is also coupled to a global change in the relative position of the subunits within the trimer, adopting a more compact conformation. Overall, the movements of the CP and Asp-loops are proposed to facilitate the binding of the substrates in the correct orientation to favor the reaction and the approximation between the N- and C-domains imposes the strain needed to favor the reaction (Collins and Stark, 1969; Ruiz-Ramos et al., 2016).
Figure 5. ATC domain of human CAD. (A) Cartoon representation of the trimer formed by the isolated ATC domain of human CAD in two perpendicular orientations. Each subunit is represented in a different color and the space filling representation of the trimer is shown in transparency. (B) Space filling representation of the trimer rotated 90°. (C) Cartoon representation of human ATC subunit, with the N- and C-domains represented in blue and green, respectively, and the Asp-loop depicted in purple. The arrows indicate the closure movement of the subunit upon substrates binding. CP and Asp are shown in yellow and red spheres, respectively. The CP-loop from the adjacent subunit is depicted in red. (D) Detail of the interactions between PALA and residues at the active site.
The active site of human ATC is indistinguishable from that of bacterial homologues, confirming a common reaction mechanism for both enzymes (Collins and Stark, 1969; Gouaux et al., 1987; Ruiz-Ramos et al., 2016). Figure 5D shows how every polar atom of PALA interacts with the protein, explaining the nanomolar affinity for the inhibitor (Newell et al., 1989; Ruiz-Ramos et al., 2016). The dissociation of PALA from this highly stable complex is impossible without reversing all the conformational changes, and this is a slow process that explains why the molecule acts as a nearly irreversible inhibitor (Ruiz-Ramos et al., 2016). Unexpectedly, we found that PALA binds to human ATC with negative cooperativity: the binding of PALA to one subunit decreases the affinity for the inhibitor in the other active sites (Ruiz-Ramos et al., 2016). Indeed, only two subunits of human ATC show high affinity for PALA, while affinity for the third site is 100-fold lower. This difference with the E. coli ATC catalytic trimer is likely due to a communication of conformational changes between the subunits in the human enzyme. Since PALA is proposed to resemble the transition state of the reaction, the negative cooperativity effect also suggests that in CAD, the ATC trimer might work more efficiently with only two active sites catalyzing the reaction at a time. Indeed, at high substrate concentrations, the activity of ATC is partially inhibited, suggesting that a trimer with the three subunits forced to work simultaneously might present additional intersubunit interactions that slow down the conformational movements required for catalysis (LiCata and Allewell, 1997; Ruiz-Ramos et al., 2016).
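The practical consequence of having two tight sites and one site with 100-fold lower affinity can be pictured with a small Python sketch. It is a deliberate simplification, not the binding model used in the cited work: the three sites are treated as independent classes with simple hyperbolic binding, and `KD_HIGH_NM` is a hypothetical placeholder chosen only to match the "nanomolar" order of magnitude mentioned in the text.

```python
# Illustrative occupancy curve for a trimer with two high-affinity sites and
# one site whose affinity is 100-fold lower (the ratio stated in the text).
# Assumptions: sites are treated as independent classes, and KD_HIGH_NM is a
# hypothetical placeholder, not a measured value.
KD_HIGH_NM = 1.0        # hypothetical Kd of the two tight sites (nM)
AFFINITY_RATIO = 100.0  # third site binds 100-fold more weakly (from the text)


def site_occupancy(ligand_nm: float, kd_nm: float) -> float:
    """Fractional occupancy of a single site (simple hyperbolic binding)."""
    return ligand_nm / (kd_nm + ligand_nm)


def bound_per_trimer(ligand_nm: float, kd_high_nm: float = KD_HIGH_NM) -> float:
    """Average number of PALA molecules bound per trimer (between 0 and 3)."""
    kd_low_nm = kd_high_nm * AFFINITY_RATIO
    tight = 2 * site_occupancy(ligand_nm, kd_high_nm)
    weak = site_occupancy(ligand_nm, kd_low_nm)
    return tight + weak


if __name__ == "__main__":
    for ligand in (0.1, 1.0, 10.0, 100.0, 1000.0):
        print(f"[PALA] = {ligand:7.1f} nM -> {bound_per_trimer(ligand):.2f} bound per trimer")
```

With these placeholder values the curve approaches two bound molecules per trimer well before the third site starts to fill, which is the qualitative behavior the text describes.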
4.3 A DHO domain in the midst of CAD
Although the crystal structures of different bacterial DHOs were known (Thoden et al., 2001; Zhang et al., 2009), there was no structural information on any eukaryotic counterpart. The crystal structure of the DHO domain of human CAD reported by our group was the first structural characterization of a eukaryotic DHO (Grande-Garcia et al., 2014; Lallous et al., 2012). In agreement with studies reporting that the proteolytic fragment of CAD retaining DHO activity formed dimers (Davidson et al., 1981), we demonstrated that the isolated human DHO domain is indeed a homodimer in solution (Figure 6A,B) (Lallous et al., 2012). The overall structure of the human DHO subunit is similar to bacterial homologues, and is also shared with a large number of enzymes (most of which catalyze the hydrolysis of substrates at amide or ester groups) belonging to the amidohydrolase superfamily of proteins (Holm and Sander, 1997). Each subunit is structured in a “TIM” (for being first described in triosephosphate isomerase) barrel motif, with eight twisted parallel β-strands arranged in a closed barrel, eight antiparallel α-helices on the outside and a smaller adjacent β-stranded subdomain composed of the N- and C-terminal regions of the protein (Figure 6C). This adjacent subdomain is continued by a C-terminal extension that stretches through the bottom of the barrel, and places the N- and C-ends on opposite sides of the globular domain, a convenient arrangement for the intercalation of DHO in the middle of CAD.
Figure 6. DHO domain of human CAD. (A) Space filling representation of the dimer formed by the isolated DHO domain of human CAD. (B) Cartoon view of the DHO dimer with dihydroorotate bound (represented in purple spheres) and indicating the N- and C-terminus of the protein. (C) Representation of the DHO subunit with Zn2+ ions (cyan spheres) and the different components of the protein. (D) Detailed view of the active site of DHO. CA-asp is shown semi-transparent, with its β-COOH occupying the position of the bridging water (red sphere) between the Zn2+ ions. Residues of the central β-barrel or loops above the central barrel are colored in red and grey, respectively. The flexible loop is depicted in yellow. (E) Sequence alignment of the flexible loop region in the DHOs from E. coli, human and Aquifex, with the secondary structure indicated above the alignment. Highly conserved residues (>90%) in each group are in bold. Distinctive signatures are highlighted on brown background and the residues involved in human DHO dimerization are shown on a grey background while the hinge residues are indicated by a purple box.
As in other members of the amidohydrolase superfamily, the active center sits on the C-terminal edge of the β-barrel, a cavity shaped by the loops connecting the β-strands with the outer α-helices (Figure 6C). There are two Zn2+ ions (Zn-α and Zn-β) coordinated by four His and one Asp at conserved positions in the active site (Figure 6D). The metals are bridged by the side chain of a carboxylated Lys (KCX1556) and by a water molecule that is activated by the metal ions for nucleophilic attack (Porter et al., 2004). In addition, human DHO has a third Zn2+ ion (Zn-γ) approximately at the center of the β-barrel, which is not found in bacterial DHOs or in other members of the amidohydrolase superfamily (Figure 6D). This Zn-γ interacts with Zn-α through the side chain of a rare histidine (H1471) with negative charge (called histidinate anion). Although Zn-γ appears to be too far to participate directly in catalysis, mutations that impede the binding of this metal have been shown to reduce the activity of the protein to half (Grande-Garcia et al., 2014). Interestingly, the introduction by site-directed mutagenesis of the third Zn2+ into a bacterial DHO increased the activity and stability of the protein (Huang and Huang, 2015). These results suggest that Zn-γ could play a role in the stabilization of the DHO domain of CAD, and perhaps influence the electrostatic environment at the active site (Grande-Garcia et al., 2014).
Despite having a low sequence identity (15%), human and E. coli DHOs show virtually identical active sites, indicating that both enzymes share a common catalytic mechanism (Grande-Garcia et al., 2014; Porter et al., 2004). The conversion of CA-asp to dihydroorotate is reversible and pH-dependent, with the forward and reverse reactions reaching equilibrium at approximately neutral pH (Christopherson and Jones, 1980). In the synthesis of dihydroorotate, favored at low pH, the water molecule bridging Zn-α and Zn-β is displaced by the binding of the side chain of CA-asp (Figure 6D). The metals neutralize the negative charge of the carboxylate group, increasing its susceptibility to a nucleophilic attack by the amino group of CA-asp. To favor the reaction, the amino group is deprotonated by an Asp (D1686) acting as the general base. The reaction proceeds by formation of a tetrahedral intermediate that is stabilized by the metal ions. Then, the OH leaving group is protonated, collapsing the transition state and releasing dihydroorotate and a water molecule that is retained between the two metal ions. In turn, the hydrolysis of dihydroorotate is favored above pH 8 and involves the nucleophilic attack of the bridging water (or hydroxide ion) on the amide bond of dihydroorotate (Figure 6D). This reaction, rather than the forward synthesis of dihydroorotate, is favored under physiological conditions (Christopherson and Jones, 1980b). It is proposed that the equilibrium of the reactions could be displaced toward the synthesis of dihydroorotate by the location of CAD near the mitochondria, which may facilitate the efficient capture of dihydroorotate by DHODH, the enzyme catalyzing the next step in the de novo pathway (Evans and Guy, 2004) (Figure 2).
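The pH dependence described above can be pictured with a minimal Python sketch. It rests on two assumptions that are not taken from this text: that the cyclization consumes one proton per CA-asp, so the equilibrium ratio changes tenfold per pH unit, and that the balance point `PH_EQUAL` sits at 7.0, a hypothetical placeholder standing in for "approximately neutral pH". The function names are illustrative only.

```python
# Sketch of a pH-dependent equilibrium between CA-asp and dihydroorotate.
# Assumptions (not from the thesis): one proton is consumed per CA-asp
# cyclized, and PH_EQUAL is a hypothetical placeholder for the pH at which
# both species are equally populated.
PH_EQUAL = 7.0


def dho_to_caasp_ratio(ph: float, ph_equal: float = PH_EQUAL) -> float:
    """Equilibrium [dihydroorotate]/[CA-asp] ratio at a given pH."""
    return 10.0 ** (ph_equal - ph)


def fraction_dihydroorotate(ph: float, ph_equal: float = PH_EQUAL) -> float:
    """Fraction of the total pool present as dihydroorotate at equilibrium."""
    ratio = dho_to_caasp_ratio(ph, ph_equal)
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    for ph in (6.0, 7.0, 8.0):
        print(f"pH {ph:.1f}: fraction as dihydroorotate = {fraction_dihydroorotate(ph):.2f}")
```

Under these assumptions, lowering the pH shifts the pool toward dihydroorotate and raising it shifts the pool toward CA-asp, in line with the trend stated in the text; the actual balance point would have to be taken from the experimental literature.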
The DHO subunit is rigidly built by tight hydrophobic packing of the central β-barrel and the α-helical palisade. Indeed, different crystal structures of human and E. coli DHOs show no significant conformational changes upon binding of substrates or inhibitors at the active site. There is only one exception: a loop connecting strand β4 and helix α4 that adopts an open, solvent-exposed position when dihydroorotate is bound and a closed conformation when CA-asp is bound to the active site (Grande-Garcia et al., 2014; Lee et al., 2005) (Figure 6C,E). This flexible loop reaches in toward the active site with CA-asp bound and is
proposed to aid in catalysis by orienting and increasing the electrophilicity of the substrate, excluding water molecules, and stabilizing the transition-state (Lee et al., 2007b).
Interestingly, some bacterial DHOs (type I DHOs; e.g. Aquifex aeolicus) lack the flexible loop and require the interaction with ATC to complete the active site and achieve maximal activity (Prange et al., 2019; Zhang et al., 2009). Additionally, the flexible loop exhibits a two-amino-acid signature that is characteristic of each DHO type (Grande-Garcia et al., 2014; Lee et al., 2007a; Ruiz-Ramos et al., 2015). In all cases, the first residue is a Thr (T109 in E. coli DHO or T1562 in human DHO), which directly interacts with the β-COOH of CA-asp. In E. coli and other bacterial type II DHOs, the second specific residue is also a Thr (T110) that occupies the tip of the loop and binds through its side chain to the α-COOH group of CA-asp. Mutating either of the two Thr inactivates E. coli DHO, proving the importance of the flexible loop in the reaction (Lee et al., 2007b). In CAD's DHO domain, on the other hand, the flexible loop is three residues shorter than in E. coli DHO and replaces the second Thr by a conserved Phe (F1563 in human DHO) (Figure 6E) (Grande-Garcia et al., 2014).
We previously demonstrated that mutations T1562A or F1563A impair the activity of human DHO (Grande-Garcia et al., 2014), proving that, as in E. coli DHO, the flexible loop also plays a key functional role in CAD.
4.4 Putting the pieces together for the pyrimidine factory
Based on the expected similarity of CPS-2 with E. coli CPS and human CPS-1 and on the crystal structures of the human DHO and ATC domains, we can build a hypothetical model of the full-length CAD protein with the four domains arranged as beads on a string (Figure 7A). However, this model is utterly useless unless we define how this protein self-assembles into larger particles that explain the communication and coordination between the different enzymatic activities.
As already mentioned, early studies reported that CAD self-assembles into a mixture of oligomers (Coleman et al., 1977), mostly hexamers of ~1.5 MDa in size (Lee et al., 1985).
Since the proteolytic fragments retaining DHO and ATC activities were found to form dimers and trimers, respectively (Davidson et al., 1981; Hemmens and Carrey, 1994; Kelly et al., 1986), an idea prevailed, that the CAD hexamers could result from the association of three proteins through their respective ATC domains, and that two of these trimers could further dimerize through their DHO domains (Carrey, 1995b; Evans, 1986). A similar organization was proposed for the architecture of the CAD-like protein in Neurospora (Figure 3) (Makoff et al., 1978), although in this model the dimerization of the trimers was proposed to be mediated by the CPS region. The key role of ATC in the molecular organization of CAD was demonstrated by Qiu and Davidson, who proved that mutations in the predicted ATC trimer interface caused the dissociation of CAD hexamers (Qiu and Davidson, 1998; Qiu and
Davidson, 2000a). However, this study ruled out the possibility that the DHO or CPS domains participated directly in the oligomerization of the particle.
To challenge the idea proposed by Carrey that CAD could assemble as a “dimer of trimers” (Carrey, 1995b), our group made a construct spanning the DHO and ATC domains of human CAD, including the long linker in between. This bi-functional construct was shown to form stable homo-hexamers in solution, and point mutations in the ATC or DHO oligomerization interfaces resulted in the formation of dimers and trimers, respectively (Moreno-Morcillo et al., 2017). These results indicated that CAD indeed assembles as a
“dimer of trimers”. Taking into consideration previous models (Carrey, 1995b; Evans, 1986; Makoff et al., 1978) and gathering all the structural information, we proposed a plausible blueprint for the central scaffold of CAD, with two ATC trimers oriented face-to-face and occupying apical positions of the particle, and three interposed DHO dimers oriented with their long axes in parallel to the ATC three-fold axes (Figure 7B). This results in a closed hexameric structure of 190 x 100 Å, with D3 symmetry (a threefold axis with three perpendicular twofold axes) and all the active sites oriented towards a delimited inner space of ~120 x 50 Å (Moreno-Morcillo et al., 2017; Moreno-Morcillo and Ramon-Maiques, 2017).
Figure 7. Model of the architecture of CAD. (A) Model of CAD full-length protein with the GLN, SYN, DHO and ATC domains as beads on a string. Dashed lines indicate linker regions of different lengths. (B) Model of the bi-functional DHO-ATC construct forming the central core of the hexamer (or “dimer of trimers”). (C, D) Hypothetical model of CAD hexameric particle in two perpendicular orientations, including the dimer of trimers of DHO and ATC and three dimers of CPS-2 to form the CAD hexamer.
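To make the D3 arrangement concrete, the Python sketch below generates the six symmetry-related copies of an arbitrary point by combining a threefold axis (placed along z) with a perpendicular twofold axis (placed along x). The axis placement, the function names and the example coordinates are illustrative assumptions; they are not taken from the structural model.

```python
import math

Point = tuple[float, float, float]


def rotate_z(p: Point, angle_deg: float) -> Point:
    """Rotate a point about the z axis (taken here as the threefold axis)."""
    a = math.radians(angle_deg)
    x, y, z = p
    return (x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a), z)


def flip_x(p: Point) -> Point:
    """Rotate a point by 180 degrees about the x axis (one twofold axis)."""
    x, y, z = p
    return (x, -y, -z)


def d3_copies(p: Point) -> list[Point]:
    """Return the six copies of a point related by D3 symmetry."""
    copies = []
    for start in (p, flip_x(p)):
        for angle in (0.0, 120.0, 240.0):
            copies.append(rotate_z(start, angle))
    return copies


if __name__ == "__main__":
    # Hypothetical position (in angstroms) of one active site in one protomer.
    for copy in d3_copies((30.0, 10.0, 40.0)):
        print(tuple(round(c, 1) for c in copy))
```

Three copies end up in the upper half of the particle and three in the lower half, all at the same distance from the center, which is the geometry of a dimer of trimers.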
To complete the model, we proposed that three CPS-2 dimers, similar to the dimers observed in the crystal structures of E. coli CPS and human CPS-1 (Figure 4B,C), could surround the central DHO-ATC assembly, with the twofold axes in the equatorial plane of this globular complex of ~200 x 200 Å (Figure 7C,D).
Despite its symmetrical beauty, the model we propose could be wrong and needs more structural data to be validated. A detailed characterization of the particle, probably by high-resolution cryo-electron microscopy, will shed light on the evolutionary advantages of a single multienzymatic protein versus the mono-functional homologs in prokaryotes. Until then, we can say that the proposed model accounts for the channeling of reaction intermediates in the multienzymatic complex (Christopherson and Jones, 1980b; Mally et al., 1980; Otsuki et al., 1982; Penverne et al., 1994). Channeling starts at the GLN domain, following the ~90 Å tunnel running through the interior of the SYN domain, which allows the transport of ammonia, carboxyphosphate and carbamate. The CP made in the SYN domain could be delivered directly into the central cavity of the particle, where it would have to diffuse a short distance to reach the ATC active site. Within the cavity, the local high concentration of CA-asp could then favor the displacement of the DHO reaction equilibrium towards the formation of the final product, dihydroorotate.
This arrangement might also explain the mutual influence or “reciprocal allostery”
between the CPS-2 and ATC activities of CAD (Irvine et al., 1997). Binding of allosteric effectors to the L4 domain of CPS-2 could induce changes in the relative orientation of the CPS-2 dimers, and these conformational changes could be transmitted to the rest of the protein by inducing the rotation and translation of the DHO dimers and ATC trimers along their respective symmetry axes. Conversely, conformational changes in the ATC subunits due to substrate or PALA binding (Figure 5C) could be communicated to the outer CPS-2 dimers. This conformational cross-talk between domains could modulate the affinity for the substrates, the rate and the coupling of the reactions, the channeling of intermediates, as well as the flux of substrates and products in and out of the particle.