4.8.1 NUCLEIC ACID ANALYSIS
The entire sequence of the DMDL (UTRN) cDNA is shown in Appendix II. The 5.2Kb of sequence determined during the course of this study is underlined. Computer comparisons (using UWGCG 'FastA') show that the 3' end of the human DMDL (UTRN) sequence is highly homologous to dystrophin: Torpedo californica dystrophin (64%), chicken and mouse dystrophin (61%) and human dystrophin (60%). The alignment is most significant between nucleotide numbers 7231 and 9420, which correspond to positions 7620 to 9806 in the human DMD cDNA (Appendix III). If the sequence 5' to this region (DMDL (UTRN) positions 4191 to 7241) is compared to dystrophin separately, 59% homology is found to a region considerably further upstream, corresponding to DMD base pair positions 1534 to 2785. This suggests that there are parts of the DMD rod sequence which do not have a corresponding sequence in DMDL (UTRN). This is not unexpected, since the DMDL (UTRN) transcript is 1Kb shorter than the DMD transcript. However, the interval between this region of homology and the next (DMD positions 7621 to 9799 and DMDL (UTRN) positions 7231 to 9393) appears to be too long to account directly for the 1Kb difference in length between the DMD and DMDL (UTRN) transcripts. Detection of this level of homology further upstream may also reflect the somewhat repetitive nature of the rod region in dystrophin.
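The interval argument above can be followed with simple coordinate arithmetic. The sketch below uses only the positions quoted in the preceding paragraph and treats them as inclusive; the variable and function names are illustrative and are not part of the original analysis.

```python
# Coordinate arithmetic for the two regions of homology quoted in the text.
# Positions are treated as inclusive; all names here are illustrative only.

def span(start, end):
    """Length of an inclusive interval."""
    return end - start + 1

# Upstream match: DMDL (UTRN) 4191-7241 against DMD 1534-2785
utrn_up, dmd_up = (4191, 7241), (1534, 2785)
# Downstream match: DMDL (UTRN) 7231-9393 against DMD 7621-9799
utrn_down, dmd_down = (7231, 9393), (7621, 9799)

# Bases lying strictly between the two matched regions in each sequence.
# A negative value means the two regions overlap by that many bases.
dmd_gap = dmd_down[0] - dmd_up[1] - 1
utrn_gap = utrn_down[0] - utrn_up[1] - 1

print("DMD interval between matches:", dmd_gap, "bp")
print("DMDL (UTRN) interval between matches:", utrn_gap, "bp")
print("Excess in DMD:", dmd_gap - max(utrn_gap, 0), "bp, against a ~1000 bp transcript difference")
```

With these coordinates the two DMDL (UTRN) regions abut (they overlap slightly), whereas the corresponding DMD regions are separated by several kilobases, which is the discrepancy the paragraph describes.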
In a search for related genes using nucleic acid sequence homology, DMD appears to be the only gene in the 'genEMBL' database which is related to the region of DMDL (UTRN) cloned and sequenced in this study (positions 4502 to 9767 in DMD).
The DMDL (UTRN) cDNA sequence obtained from the placental libraries described in this thesis differed by only one basepair from the sequence obtained in this region by the Oxford laboratory. This change of guanine to adenine does not create a restriction site difference for any commercially available enzyme, and it is not yet clear whether it represents a useful polymorphism.
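Whether a single-base change alters a restriction site can be checked by scanning a short window around the substituted position. The following is a minimal sketch under stated assumptions: the enzyme panel is a small set of common six-base cutters, and the toy sequence and position are invented for illustration, since the sequence flanking the actual G to A difference is not given in this section.

```python
# Check whether a single-base substitution creates or destroys a recognition
# site for any enzyme in a small panel. The sequence and position used below
# are invented for illustration; they are NOT the DMDL (UTRN) sequence.

ENZYMES = {  # well-known palindromic six-base recognition sites
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "PstI": "CTGCAG",
}

def sites_present(seq):
    """Enzymes whose site occurs in seq. Scanning one strand is enough
    because every site in this panel is palindromic."""
    return {name for name, site in ENZYMES.items() if site in seq}

def site_changes(seq, pos, new_base, flank=5):
    """Compare sites in a window around pos (0-based) before and after
    substituting new_base. A flank of 5 covers every six-base site that
    includes pos. Returns (gained, lost)."""
    lo, hi = max(0, pos - flank), pos + flank + 1
    before = seq[lo:hi]
    after = seq[lo:pos] + new_base + seq[pos + 1:hi]
    gained = sites_present(after) - sites_present(before)
    lost = sites_present(before) - sites_present(after)
    return gained, lost

toy = "TTACGGGATTCAAGG"  # hypothetical sequence with a G at index 6
gained, lost = site_changes(toy, 6, "A")
print("gained:", gained, "lost:", lost)
```

An empty result for both sets, over a panel covering the commercially available enzymes, would correspond to the situation described in the text.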
4.8.1.1 ANALYSIS OF THE UNSTABLE REGION OF DMDL (UTRN)
As discussed in section 4.1, early attempts to clone DMDL (UTRN) cDNAs indicated that part of the sequence was unstable/unclonable. The Bpl λgt11 recombinants from library I described in section 4.3 were unstable and the insert cDNA could not be subcloned. When Bpl2 was digested with HindIII, it was possible to subclone four of the five fragments generated (378bp, 333bp, 402bp and 665bp in size) but not the 3' 1.3Kb HindIII fragment.
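A digest of this kind can be mimicked in silico. The sketch below cuts a linear sequence at HindIII sites (A^AGCTT) and then tallies the fragment sizes quoted above; the toy sequence is invented, and the function name is illustrative.

```python
# In-silico HindIII digest of a linear sequence, plus a tally of the
# fragment sizes quoted in the text. The toy sequence is invented.

HINDIII_SITE = "AAGCTT"
CUT_OFFSET = 1  # HindIII cuts A^AGCTT, i.e. after the first base of the site

def digest(seq, site=HINDIII_SITE, offset=CUT_OFFSET):
    """Return fragment lengths, 5' to 3', for the top strand of a linear molecule."""
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i + offset)
        i = seq.find(site, i + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

toy = "CCCC" + "AAGCTT" + "GGGGGGGG" + "AAGCTT" + "TT"
print(digest(toy))

# Sizes of the four fragments that could be subcloned, as quoted in the text
subcloned = [378, 333, 402, 665]
print("subcloned total:", sum(subcloned), "bp")
print("with the ~1.3Kb 3' fragment: about", sum(subcloned) + 1300, "bp")
```

The last figure is approximate because the size of the 3' fragment is only given to the nearest 0.1Kb.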
In order to suggest possible reasons for the instability of the 3' end of DMDL (UTRN) in cloning vectors, the sequence was examined for internal repeats. This was done by comparing the partial nucleic acid sequence to itself using UWGCG 'repeat'. No long or unusual repeat regions were identified in the DMDL (UTRN) sequence. It is also possible to look for inverted repeats using UWGCG 'StemLoop'. The results of this search are plotted in Fig. 4.18 and indicate a cluster of possible loop forming sites around the region considered to be unstable. The computer identified 21 non-overlapping sequences with which stem-loop formation was possible. There appears to be a cluster of such sequences around the 3' end of Bpl2, near the region which could not be subcloned either as a 3.2Kb fragment or as a 1.3Kb HindIII fragment in M13 or pGEM3ZF+ with a variety of host cells. There are nine consecutive sequences capable of forming stem loops in the region between basepair positions 7887 and 8690; this may lead to some difficulty in maintaining the Bpl2 sequence in a vector.

Fig. 4.18 The position of possible stem-loop forming sequences in the regions encompassed by clones Bpl2 and Bpl32. [Map spanning 4.2Kb to 9.2Kb showing the HindIII sites and the extents of Bpl2 and Bpl32.] Bpl2 was unstable in cloning vectors and under-represented in amplified cDNA libraries. Unusual secondary structure may be the cause of this. Loops were identified using the UWGCG program 'StemLoop' and are marked on the map. The shaded box represents a region between positions 7887 and 8690 where there are 9 non-overlapping possible stem-loop forming regions.
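The kind of inverted-repeat search described above can be sketched as follows. This is a naive illustration and not the UWGCG 'StemLoop' algorithm: it requires a perfectly paired stem, and the stem length, loop range, toy sequence and function names are all assumptions chosen for the example.

```python
# Naive search for potential stem-loops (inverted repeats): a stem of `stem`
# bases, a loop of min_loop..max_loop bases, then the reverse complement of
# the stem. Parameter values are arbitrary assumptions; this is not the
# UWGCG 'StemLoop' algorithm.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(s):
    """Reverse complement of a DNA string."""
    return s.translate(COMPLEMENT)[::-1]

def find_stemloops(seq, stem=6, min_loop=3, max_loop=20):
    """Yield (start, end, loop_length) for perfect-stem candidates.
    Coordinates are 0-based and end is exclusive."""
    n = len(seq)
    for i in range(n - 2 * stem - min_loop + 1):
        target = revcomp(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > n:
                break
            if seq[j:j + stem] == target:
                yield (i, j + stem, loop)

def non_overlapping(hits):
    """Greedy selection by earliest end, which keeps the largest possible
    number of mutually non-overlapping hits."""
    chosen, last_end = [], 0
    for start, end, loop in sorted(hits, key=lambda h: h[1]):
        if start >= last_end:
            chosen.append((start, end, loop))
            last_end = end
    return chosen

def count_in_window(hits, lo, hi):
    """Number of hits lying entirely within [lo, hi)."""
    return sum(1 for start, end, _ in hits if start >= lo and end <= hi)

toy = "CC" + "GACCTG" + "TTTT" + "CAGGTC" + "CC"  # invented example
hits = non_overlapping(find_stemloops(toy))
print(hits, count_in_window(hits, 0, len(toy)))
```

Applied to a real sequence, `count_in_window` would give the number of candidate stem-loops in a region such as the one between positions 7887 and 8690, although the counts would depend on the parameters chosen and need not match those reported by 'StemLoop'.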
4.8.2 AMINO ACID ANALYSIS
The 5229bp portion of the DMDL (UTRN) cDNA cloned during the course of this study, which lies entirely within the 13Kb coding region of DMDL (UTRN) (see Appendix IV), shows a single continuous open reading frame comprising 1742 amino acids. At the amino-acid level, the highest levels of homology to dystrophin detected using UWGCG 'FastA' correspond to the 5' end of the cysteine rich region of dystrophin. From utrophin amino acid number 2838 to 2958 the level of identity is 77% and the level of similarity (i.e. with conservative substitutions) is 87% (Fig. 4.19 (A)). Homology declines towards the 5' end of the protein, mirroring the pattern of the nucleic acid homology. The 3' 742 amino-acids of utrophin show 52% identity and 71% similarity with dystrophin (Appendix VA), whereas the entire 1742 amino-acid peptide sequence shows 46% identity and 67% similarity with dystrophin (Appendix VB).
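The open reading frame statement can be illustrated with a short sketch that translates the three forward frames of a partial cDNA and reports the longest stretch without a stop codon. It uses the standard genetic code; the toy sequence and function names are invented, and nothing here is taken from the programs actually used in this study.

```python
# Find the longest stop-free stretch in the three forward reading frames of
# a partial cDNA. Standard genetic code; the toy sequence is invented.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(seq, frame):
    """Translate the complete codons of seq starting at offset frame (0, 1 or 2).
    Codons containing anything other than A, C, G, T become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(frame, len(seq) - 2, 3))

def longest_open_stretch(seq):
    """Return (frame, length_in_codons) of the longest run without a stop."""
    best = (0, 0)
    for frame in range(3):
        longest = max(len(part) for part in translate(seq, frame).split("*"))
        if longest > best[1]:
            best = (frame, longest)
    return best

toy = "ATGGCCTGAAAA"
print(translate(toy, 0), longest_open_stretch(toy))

# Complete codons available in frames 0, 1 and 2 of a 5229bp fragment
codons_per_frame = [(5229 - frame) // 3 for frame in range(3)]
print(codons_per_frame)
```

A 5229bp fragment holds at most 1743 complete codons in one frame and 1742 in the other two, so an uninterrupted reading frame of 1742 amino acids spans essentially the whole fragment.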
A comparison of human, chicken and Torpedo californica dystrophin with the partial human utrophin peptide sequence identifies a highly conserved region which covers 399 amino-acid residues, extending to the end of the peptide sequence obtained (Fig. 4.20). This region corresponds to the cysteine rich region of dystrophin and represents an amino acid sequence conserved across species and across genes.
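Conservation across aligned sequences can be quantified column by column. The sketch below takes one 50-residue block of the Human, Chick and DMDL rows as transcribed in the alignment figure in this section (the partial T. californica row is left out) and counts pairwise identities and fully conserved columns. The function names are illustrative, and a single block is not expected to reproduce the 'FastA' percentages quoted above, which cover longer regions.

```python
# Pairwise identity and fully conserved columns for one 50-residue block of
# the alignment, using the Human, Chick and DMDL rows as transcribed in this
# section. Function names are illustrative.

ROWS = {
    "Human": "FKWSLLRKKS LNIRSHLEAS SDQWKRLHLS LQELLVWLQL KDDELSRQAP",
    "Chick": "FRWSELRKKS LNIRSHLEAS TDQWKRLHLS LQELLAWLQL KEDELKQQAP",
    "DMDL":  "QRWNDLKAKS ASIRAHLEAS AEKWNRLLMS LEELIKWLNM KDEELKKQMP",
}
# Remove the spacing between groups of ten residues
ROWS = {name: seq.replace(" ", "") for name, seq in ROWS.items()}

def identity(a, b):
    """Count identical positions between two equal-length aligned rows."""
    if len(a) != len(b):
        raise ValueError("rows must be the same length")
    return sum(x == y for x, y in zip(a, b))

def conserved_columns(rows):
    """Count columns in which every row carries the same residue."""
    return sum(len(set(col)) == 1 for col in zip(*rows))

n = len(ROWS["Human"])
human_dmdl = identity(ROWS["Human"], ROWS["DMDL"])
all_three = conserved_columns(list(ROWS.values()))
print(f"Human vs DMDL: {human_dmdl}/{n} identical")
print(f"Identical in all three rows: {all_three}/{n}")
```

Similarity, as opposed to identity, would additionally require a table of conservative substitutions, which is a scoring choice not specified in this section.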
The cysteine rich region in the human dystrophin protein comprises 5.3% cysteine which, compared to other proteins, is
        1994                                              2025
Human   FKWSLLRKKS LNIRSHLEAS SDQWKRLHLS LQELLVWLQL KDDELSRQAP
T.cal   .......... .........S GEQWKRLQIS LQDFLTWMNL KNDELRRQMP
Chick   FRWSELRKKS LNIRSHLEAS TDQWKRLHLS LQELLAWLQL KEDELKQQAP
DMDL    QRWNDLKAKS ASIRAHLEAS AEKWNRLLMS LEELIKWLNM KDEELKKQMP
        2538                                              2587
2026 2076
Human IGGDFPAVQK QNDVHRAFKR ELKTKEPVIM STLETVRIFL TEQPLEGLE.