III. MATERIALS AND METHODS
3.4. STUDY POPULATION AND SAMPLE
3.4.1. Population
Legend: Scheme of pre-mRNA splicing. Exons of the pre-mRNA are depicted as boxes, the introns as a line, the lariat as a sling, and snRNPs as circles (U1 = blue; U2 = red; U4 = green; U5 = yellow; U6 = cyan). d indicates the donor and a the acceptor splice site. The requirement of ATP in energy-consuming steps is indicated. For the sake of simplification, auxiliary factors and ATPases, which are involved in the assembly and disassembly of the spliceosome, are excluded from the figure. The splicing reaction is explained in the text (adapted from J.P. Stanley and C. Guthrie, 1998).
The spliceosome is assembled by the stepwise recruitment of a number of snRNPs and non-snRNP proteins in a highly coordinated and carefully regulated fashion. The earliest complex, referred to as the commitment complex (CC), is formed by the binding of U1 snRNP to the 5' splice site through base-pairing interactions (Wu and Manley, 1989; Zhuang and Weiner, 1986). The branch point sequence is recognised by the branch point binding protein (BBP) (Berglund et al., 1997), which is subsequently replaced by U2 snRNP, which base pairs with the branch site through RNA-RNA interaction (Black et al., 1985). This results in the formation of a very stable complex (Konarska and Sharp, 1986), which appears to be modulated via a switch between two competing conformations of U2 snRNP (Zavanelli et al., 1994).
The interaction with pre-mRNA alone does not appear to be sufficient for the specificity of splice site selection, even though these factors are clearly involved in the earliest steps of spliceosome formation (Jamison et al., 1992; Michaud and Reed, 1991; Michaud and Reed, 1993). Binding of U2 snRNA is critically dependent on the presence of an auxiliary factor designated U2AF, which binds to sequences of the polypyrimidine tract near the 3' splice site (Bennett et al., 1992; Ruskin et al., 1988; Zamore and Green, 1989) and utilises both its RNA recognition motifs (RRM) and its arginine-serine-rich (RS) domains to mediate this interaction (Valcarcel et al., 1996). Another ATPase, UAP56, has only recently been shown to be required for the association of U2 snRNP with the branch point sequence (Fleckner et al., 1997), and it was suggested that this protein may interact with SF1, the mammalian orthologue of BBP (Berglund et al., 1997), thereby promoting its displacement from the pre-mRNA.
The early aggregate, called the A complex, is subsequently transformed into the B1 spliceosome when the U4-U6-U5 tri-snRNP particle is recruited to the A complex (Chabot et al., 1985), where the exons are brought together. Prp24, a protein containing three RRMs, facilitates annealing of U4 and U6 snRNPs (Ghetti et al., 1995). This activity is enhanced in the presence of other snRNPs (Raghunathan and Guthrie, 1998).
The B2 complex is defined by the dissociation of U4 snRNA from U6 snRNA and the destabilisation of U1 snRNA at the 5' splice site. U1 snRNP is no longer required in the following reactions (Yean and Lin, 1996) and is replaced by U6 snRNP, which now interacts with the 5' splice site (Nilsen, 1998). Once released from U4 snRNP, U6 snRNP base pairs with U2 snRNA, resulting in the formation of a U2/U6 helix, which can switch between two conformational isomers. During this step the 5' splice site is moved into close proximity to the branch point.
After completion of these remodelling steps, the spliceosome, now characterised as the C1 complex, is competent to execute the first splicing reaction. In the first transesterification reaction the 5' exon is cleaved off at the 5' splice junction and the intron subsequently forms a lariat intermediate, which is accompanied by an interaction of its 5' end with an adenosine at the branch point. The free 3'-hydroxyl end of the 5' exon was recently shown to interact with U5 snRNP and therefore appears to be positioned by U5 in close proximity to the 3' splice site (Newman, 1997), but it is not clear whether the 5' splice site or the 3' splice site remains fixed relative to the spliceosome during this rearrangement (Steitz and Steitz, 1993). The first scenario presumes a single catalytic core, similar to the mechanism of group I self-splicing RNAs (Saldanha et al., 1993), whereas the latter assumes two catalytic cores, as utilised by group II RNAs (Michel and Ferat, 1995).
Further conformational changes must take place before the second step of splicing, the joining of the two exons, which is carried out by the C2 complex. A provocative hypothesis for this was suggested by Fabrizio et al. (1997), who recently identified a new U5 snRNP-associated component, which was named U5-116 kDa. This protein component shows intriguing similarity to the ribosomal GTPase EF-2, which is required for the translocation of the ribosome during tRNA displacement. This raises the speculation that it may facilitate the rearrangement of the catalytic core after the first splicing step, thereby aligning the 5' exon with the 3' splice site, where the second transesterification step occurs (Fabrizio et al., 1997).
Finally, after exon ligation, the C2 complex dissociates into complex I and the mRNA is liberated from the snRNP-lariat complex. The snRNPs subsequently detach from the lariat RNA, which is degraded by RNases, whereas the components of the spliceosome can be recycled for further rounds of splicing. The order of assembly and disassembly is conserved from yeast to mammals, as RNA interactions show (Nilsen, 1998), and even applies to a spliceosome that was recently discovered to be involved in the splicing of a novel class of introns (Tarn and Steitz, 1996).
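The order of complexes described above can be written down as a small ordered data structure. The following is only an illustrative sketch in Python: the names `Stage`, `PATHWAY`, `next_stage` and `stages_until` are hypothetical and introduced here for illustration, and the event descriptions are condensed from the text rather than taken from any established database or software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    """One spliceosomal complex and the key event that defines it."""
    name: str
    event: str


# Ordered as described in the text; the wording of each event is a summary.
PATHWAY = (
    Stage("CC", "U1 snRNP base pairs with the 5' splice site; BBP binds the branch point"),
    Stage("A", "U2 snRNP replaces BBP at the branch site (U2AF dependent)"),
    Stage("B1", "U4-U6-U5 tri-snRNP is recruited to the A complex"),
    Stage("B2", "U4 dissociates from U6; U1 is destabilised; U6 pairs with U2"),
    Stage("C1", "first transesterification: lariat intermediate is formed"),
    Stage("C2", "second transesterification: the two exons are joined"),
    Stage("I", "mRNA is released; snRNPs leave the lariat and are recycled"),
)


def next_stage(name: str) -> Optional[str]:
    """Return the complex that follows `name`, or None after the last one."""
    names = [s.name for s in PATHWAY]
    i = names.index(name)  # raises ValueError for an unknown complex name
    return names[i + 1] if i + 1 < len(names) else None


def stages_until(name: str) -> List[str]:
    """Return all complexes formed up to and including `name`."""
    names = [s.name for s in PATHWAY]
    return names[: names.index(name) + 1]


if __name__ == "__main__":
    for stage in PATHWAY:
        print(f"{stage.name:>2}: {stage.event}")
```

A tuple is used so that the order of assembly is fixed, which reflects the statement that the order of assembly and disassembly is conserved from yeast to mammals.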
Recently, a number of yeast spliceosomal orthologues of mammalian ATPases, namely Prp2, Prp5, Prp9, Prp16, Prp22, Prp28 and Brr2, which were identified in cold-sensitive mutants with splicing defects, have been implicated in the assembly and disassembly of snRNPs. This suggests an essential role for this family of DExD/H box proteins in the formation and recycling of spliceosomal components (Arenas and Abelson, 1997; Company et al., 1991; Kim and Lin, 1996; O'Day et al., 1996; Raghunathan and Guthrie, 1998; Teigelkamp et al., 1997; Wiest et al., 1996) as well as a function in proofreading mechanisms (Burgess and Guthrie, 1993; Fabrizio et al., 1997). A major long-term goal is therefore to determine the precise role of each of these unwinding and annealing factors in order to fully understand one of the key processes in mammalian cells.
Cis-sequences required for constitutive splice-site recognition
The sequence elements necessary for the recognition of vertebrate splice sites are located in the immediate vicinity of the splice sites, whereas most of the intron sequences appear to be dispensable (Treisman et al., 1983; Wieringa et al., 1984; Wieringa et al., 1983). Initially, it was widely believed that splice sites are recognised by a mechanism that scans the complete mRNA to identify potential binding sites (Lang and Spritz, 1983; Lewin, 1980; Sharp, 1987). This view cannot be reconciled with results indicating that mutation of the 5' splice site (at the 3' end of the internal exon) inhibits the recognition of the exon (at the 3' splice site of the preceding intron) despite the presence of valid splicing signals.
The interactions between splicing factors therefore occur across the exon in the first step of splicing (Talerico and Berget, 1990). Another study, using antisense oligonucleotides directed to a variety of aberrant sequences within the pre-mRNA of thalassemic human β-globin, revealed that sequences upstream of the branch point adenosine and a large part of the exon sequences were required for splicing (Dominski and Kole, 1994). In particular, improvement of the 5' splice site of the internal exon appeared to affect interactions at the upstream 3' splice site.
This evidence, together with earlier data in which the elongation of exons to more than 300 nucleotides generally inhibited their splicing, led to the formulation of the exon definition model of splicing, which proposes that exons, rather than splice sites, are the recognition units for assembly of the spliceosome (Berget, 1995). This model provides a further explanation for the fact that mutations in the 5' splice site usually result in either a) skipping of the upstream exon or b) utilisation of cryptic sites in close proximity to the mutation, but never far downstream within the following intron (Beyer and Osheim, 1988; Mitchell et al., 1986; Treisman et al., 1983; Wieringa et al., 1983).
Similarly, naturally occurring mutations creating new 5' splice sites within the second intron of human β-globin pre-mRNA result in the activation of normally silent 3' sites upstream of these mutations rather than the production of an aberrantly large exon. Mutation of this cryptic splice site restores the normal splicing pattern (Dobkin and Bank, 1985). It is therefore now generally accepted that the individual exons in mammalian mRNA transcripts, rather than the introns, are recognised as splicing units. Once interaction across the exon sequences is established, exons are brought together by bridging across the individual units.
Alternative splicing
Alternative splicing of primary transcripts plays a central role in the control of gene expression and diversification in higher eukaryotes, leading to the production of distinct protein isoforms from a single primary transcript (Leff et al., 1986; Breitbart et al., 1987; McKeown, 1992; Rio, 1992; Smith et al., 1989; Green et al., 1991). Products of alternative splicing can be determined by pre-mRNA structures as a consequence of alternative promoter usage (Nabeshima et al., 1984) or alternative poly(A) site usage (Early et al., 1980) and have been observed in different sexes (Nagoshi et al., 1988), in different tissues (Laski et al., 1986; Leff et al., 1987) or in different developmental stages (Breitbart and Nadal Ginard, 1987). The control of tissue-specific alternative splicing has been studied in numerous mammalian genes (Maniatis, 1991; Rio, 1992). One example is U1 snRNP binding to a 5' splice site, which was shown to affect splicing of an upstream intron in the preprotachykinin gene (Kuo et al., 1991), implicating an interaction across an exon in the process of its exclusion. Skipping of a small exon of src mRNA in neuronal cells occurs by a mechanism called steric interference (Black, 1991). The differentiation of neurons was shown to be dependent on the alternative splicing pattern of the N-CAM gene, which is important for cell-cell interactions in the nervous system (Tacke and Goridis, 1991). The integrity of the 5' splice site plays an important role in the regulation of this process.
Another model system for the investigation of alternative splicing was provided by the chicken and rat β-tropomyosin genes, which follow two different pathways in either skeletal muscle or fibroblasts. Specific secondary structures as well as RNA-protein interactions were described to be important for the differential control of this splicing event (Clouet d'Orval et al., 1991; Guo et al., 1991).
Probably one of the best studied systems of alternative splicing, however, is the control of sexual differentiation in Drosophila (Baker, 1989), which contains several examples of negative and positive control of splice site choice. The Sex-lethal (Sxl) protein acts in an autoregulatory mode by repressing the use of a male-specific 3' splice site in the Sxl transcript itself and in another pre-mRNA, that of the tra gene (Baker, 1989; Bell et al., 1991). This repression is mediated by the binding of the protein, which contains two consensus RNA binding domains (RBD-CS, described later) (Bandziulis et al., 1989), to the polypyrimidine tracts of the introns (Inoue et al., 1990). The second example is the control of alternative splicing of the P transposable element in somatic cells (Rio, 1991), resulting in the retention of the third intron (Chain et al., 1991; Siebel et al., 1992; Siebel and Rio, 1990). Another example of negative control involves the Drosophila SR protein su(w^a), which again autoregulates itself by blocking removal of one of its introns (Bingham et al., 1988).
Positive control of pre-mRNA splicing has been described for the female-specific site of the doublesex (dsx) gene, which is activated by the tra and tra-2 gene products (Baker, 1989). These bind to a regulatory sequence downstream of the 3' exon (Hedley and Maniatis, 1991; Hoshijima et al., 1991; Ryner and Baker, 1991), thereby directing female development. Interestingly, both tra and tra-2 transcripts were shown to be alternatively spliced themselves in a tissue-specific manner, and the proteins contain the common RBD-CS binding domain as well as an arginine-serine-rich (RS) domain (Amrein et al., 1988; Mattox and Baker, 1991). The mechanism of action involves stabilisation of U2AF binding to the weak polypyrimidine tract of the dsx female-specific intron (Zamore and Green, 1991).
Cis-acting elements affecting splice site selection
Cis-elements affecting alternative splicing were identified some time ago (Cooper et al., 1988; Kakizuka et al., 1988; Mardon et al., 1987; Reed and Maniatis, 1986; Somasekhar and Mertz, 1985), but the regulation of alternative splicing has only been thoroughly investigated more recently in a number of genes (Guo et al., 1991; Kornblihtt et al., 1996; Libri et al., 1992; Delsert et al., 1989; Streuli and Saito, 1989). The majority of these cis-sequences were found within alternatively spliced exons, which often contain suboptimal splice sites, where they act to increase the utilisation of these sites.
One of the best described cis-elements is the purine-rich exon-recognition element (ERE), which was shown to be important for alternative splice site selection (Cooper, 1992; Graham et al., 1992; Sun et al., 1993; Watakabe et al., 1993; Xu et al., 1993). Several distinct purine-rich sequences have been demonstrated unequivocally in vivo and in vitro to promote the inclusion of the normally weak resident exon (Humphrey et al., 1995; Ramchatesingh et al., 1995; Tanaka et al., 1994; Yeakley et al., 1993). Normally the splicing of the upstream intron is activated, but one exception to this rule has been reported for the chicken cardiac troponin T pre-mRNA, in which a 30-nucleotide purine-rich ERE caused effective usage of the exon-terminal 5' splice site (Elrick et al., 1998).
However, the purine-rich region on its own does not seem to be sufficient for full enhancer activity, as previous studies using mutagenesis as well as synthetic polypurine constructs indicate (Staknis and Reed, 1994; Tanaka et al., 1994). The mechanism by which these sequences enhance splicing still needs further investigation. There is nevertheless strong evidence that EREs interact with protein splicing factors, predominantly with members of the SR protein family (Lavigueur et al., 1993; Staknis and Reed, 1994; Sun et al., 1993; Tacke and Manley, 1995), and this process appears to be mediated in cooperation with auxiliary factors (Yeakley et al., 1996).
Apart from important sequences within exons, a number of intronic sequences have been reported to influence splice site selection. In particular, the polypyrimidine stretch located between the branch site and the 3' splice site junction was shown to interact with components of the splicing machinery, including U2AF65, which is involved in the recruitment of U2 snRNP to the branch site (Ruskin et al., 1988; Zamore and Green, 1989). Intronic elements other than those specifying splice site signals have been described in the neuron-specific n-src gene, where they determine tissue-specific splicing in vitro and in vivo (Black, 1992; Black, 1991). In addition, variations in the overall intron length had profound effects on splice site choice, since shorter introns generally lead to the inclusion of the following exon (Peterson and Perry, 1986; Bell et al., submitted). Recently, progress has been made in the identification of splicing factors that specifically interact with these sequences, and the underlying mechanisms are beginning to be untangled.
Members of the SR protein family of pre-mRNA splicing factors
A central question in the study of alternative splicing mechanisms is how the splice sites are chosen for the subsequent splicing reaction. Considerable advances were made when several splicing factors that play a critical role in this process were discovered by biochemical studies of mammalian splicing. These proteins share a characteristic C-terminal serine-arginine-rich domain and are known collectively as the SR protein family. This family is remarkably conserved, in particular within the RNA recognition motifs (RRMs) and the most common feature, the C-terminal RS domain of variable length. Similar domains have also been described for other splicing factors, including the U1-70K polypeptide, both the small and the large subunit of U2AF, and the Drosophila splicing factors Tra, Tra2 and SWAP (Birney et al., 1993). Several members of this family have been cloned by different groups.
The first SR protein, ASF, was purified from splicing extracts and was required to alter splice site selection in an SV40 early pre-mRNA (Ge and Manley, 1990). SF2, in turn, was shown to alter the alternative splicing pattern of a β-globin pre-mRNA in vitro (Krainer and Maniatis, 1985) and was subsequently isolated as a factor that is both essential for constitutive splicing and active in alternative splicing (Krainer et al., 1990a; Krainer et al., 1990b). Later, identical cDNA sequences were isolated for ASF and SF2, so that this factor is now referred to as ASF/SF2 (Ge et al., 1991; Krainer et al., 1991). Sequence analysis of ASF/SF2 revealed that two important features are shared with the Drosophila sex regulatory factors Tra (Boggs et al., 1987), Tra2 (Amrein et al., 1988; Goralski et al., 1989) and su(w^a) (Chou et al., 1987): a common RNA recognition motif (RRM) and a serine-arginine-rich domain (RS domain). This suggests that the protein may be a member of an evolutionarily conserved family of splicing factors.
Since the discovery of SF2/ASF a large number of SR proteins have been identified by immunoprecipitation with monoclonal antibodies, homology cloning and PCR-related methods. SC35, another protein required for spliceosome assembly, was identified in size-fractionated mammalian spliceosomes by immunoprecipitation studies (Fu and Maniatis, 1990) and was found to be localised in speckled regions of the nucleoplasm (Spector et al., 1991; Fu and Maniatis, 1992). Its cDNA sequence was later determined by cloning a gene encoded by the opposite strand of the trans-spliced c-myb exon (Vellard et al., 1992). Further members include SRp20, the human homologue of X16 in the mouse and RBP1 in Drosophila (Ayane et al., 1991; Kim et al., 1992; Zahler et al., 1992); SRp75 (Zahler et al., 1992); 9G8 (Cavaloc et al., 1994); SRp40, the human homologue of rat HRS (Birney et al., 1993; Diamond et al., 1993; Screaton et al., 1995); SRp55, the human homologue of Drosophila SRp55/B52; and SRp30c, a human SR protein resembling ASF/SF2, though with an uncommonly short RS domain (Screaton et al., 1995).
Most of the SR proteins that were cloned recently were isolated by RT-PCR from various human cell lines using degenerate primers for common sequence motifs. The isolation of cDNAs that encode the most abundant SR proteins has revealed identity with most previously identified SR proteins. The size of the expressed proteins provides a basis for a common nomenclature of SR proteins, which will be used from now on: SRp20 (X16), SRp30a (ASF/SF2), SRp30b (SC35), SRp30c, SRp30d (9G8), SRp40, SRp55, SRp75 (table 1.1). It soon became obvious that the family of SR proteins can be divided into two groups: one containing only a canonical RRM, of which SRp30b is the prototype, and a second containing two RRMs, a canonical one and an additional central atypical one, separated by a glycine-rich hinge region, of which the prototype is SRp30a (Birney et al., 1993). The human SR proteins