The alignment of 13 G3 SCR sequences (Figures 5.1 and 5.6) shows that the G3 SCR has a well-defined consensus length of 62 residues with no gaps or insertions. Residue conservation among the 13 sequences is high: 30/62 residues are conserved or conservatively replaced in more than 90% of the sequences, and 21/62 residues are 100% conserved. Four of the latter are cysteines (Cys5, Cys34, Cys48 and Cys61), six are hydrophobic (Val3, Tyr24, Tyr32, Ile46, Trp54 and Ile59), three are prolines (Pro8, Pro9 and Pro57), four are glycines (Gly1, Gly6, Gly18 and Gly37), and two are arginines (Arg31 and Arg47). The previous analysis of an alignment of 101 SCR sequences (Perkins et al., 1988) showed that 22/61 residue positions were well-conserved, of which 5/61 were conserved in more than 90% of the sequences. Figure 5.6 shows that 15 of these conserved residues are well matched to the equivalent conserved residues in G3. All 15 matches are hydrophobic, and indicate that the G3 SCR has a well-conserved hydrophobic core in common with other members of the SCR superfamily.
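The conservation counts quoted above can be illustrated with a minimal sketch that scores each alignment column by the fraction of sequences sharing its most common residue. The function names are hypothetical and are not taken from the original analysis. The input is only the first nine residues of four rows of Figure 5.6 (PGCA_HUMAN, A54423, S28764 and PGCS_HUMAN), and the sketch counts strict identity only, so it does not reproduce the "conservatively replaced" criterion used in the text.

```python
from collections import Counter


def column_conservation(aligned):
    """For each alignment column, return (most common residue, its fraction)."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    result = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        result.append((residue, count / len(column)))
    return result


def conserved_positions(aligned, threshold=1.0):
    """1-based positions whose most common residue reaches the threshold."""
    scores = column_conservation(aligned)
    return [pos for pos, (_, frac) in enumerate(scores, start=1)
            if frac >= threshold]


# Illustrative input: residues 1-9 of four G3 SCR rows in Figure 5.6
# (an assumption made for brevity; the full analysis uses all 13 rows).
fragments = [
    "GTVACGEPP",  # PGCA_HUMAN (Aggrecan: human)
    "GLVSCGPPP",  # A54423 (Brevican: bovine)
    "GTVLCGPPP",  # S28764 (Neurocan: rat)
    "GTVACGQPP",  # PGCS_HUMAN (Versican: human)
]

print(conserved_positions(fragments))        # identical in every sequence
print(conserved_positions(fragments, 0.75))  # identical in at least 75%
```

For these fragments the fully conserved columns are positions 1, 3, 5, 6, 8 and 9, which correspond to Gly1, Val3, Cys5, Gly6, Pro8 and Pro9 in the list above.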
The G3 SCR sequences differ from a typical SCR in having five highly conserved basic residues: Arg/Lys21, Arg/Lys23, Arg31, Arg/His41 and Arg47 (Figure 5.6). The C-terminal sequences following the G3 SCR are also strongly basic. These sequences in Figure 5.6 contain 457 residues, of which 116 (25%) are Lys or Arg. By comparison, 17% of the residues in the G3 SCR sequences are Lys or Arg, while a typical protein would contain 5.7% Lys and 5.7% Arg residues (11.4% combined) (Creighton, 1993).
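The basic-residue percentages above are simple composition ratios. As a minimal sketch, the hypothetical helper below computes the Lys + Arg fraction of a single sequence and checks the pooled figure quoted in the text. The example tail is the segment following residue 62 in the PGCA_HUMAN row of Figure 5.6, used here only as an illustration; the 25% value in the text is pooled over all 13 C-terminal sequences.

```python
def basic_fraction(sequence, basic="KR"):
    """Fraction of residues that are Lys (K) or Arg (R); gap symbols are ignored."""
    residues = [r for r in sequence.upper() if r.isalpha()]
    if not residues:
        raise ValueError("sequence contains no residues")
    return sum(r in basic for r in residues) / len(residues)


# C-terminal tail of the PGCA_HUMAN row in Figure 5.6 (illustrative input).
tail = "DATTYKRRLQKRSSRHPRRSRPSTAH"
print(f"Lys + Arg in this tail: {100 * basic_fraction(tail):.1f}%")

# Pooled value from the text: 116 Lys/Arg out of 457 C-terminal residues.
print(f"Pooled C-terminal value: {100 * 116 / 457:.1f}%")

# Reference value for a typical protein: 5.7% Lys + 5.7% Arg.
print(f"Typical protein: {5.7 + 5.7:.1f}%")
```

A single tail can differ noticeably from the pooled 25%, so comparisons with the typical-protein value are best made on the pooled counts.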
Figure 5.6: Comparison of the G3 SCR sequences with those of SCR-15 and SCR-16 in human factor H. The G3 sequences are identified by their SWISSPROT or PIR accession names or numbers. Residues conserved in over 90% of the G3 SCR sequences are indicated below the alignment. These are compared with the 22 best-conserved residues in an alignment of 101 SCR sequences (Table 1 from Perkins et al., 1988). The SCR structures are indicated by their Brookhaven codes 1hfi, 1hcc and 1hfh. Secondary structure analyses by DSSP follow the abbreviations given in Figure 5.3. The six β-strands from the averaged structure predictions (Perkins et al., 1988) and the averaged DSSP analyses are indicated by B1 to B6. Side-chain accessibility analyses by COMPARER follow those in Figure 5.4, and are compared with the averaged hydropathy predictions of Perkins et al. (1988).
G3 SCR SEQUENCES
Residue numbering                      10        20          30        40        50        60  <--- C-terminus --->
PGCA_HUMAN (Aggrecan: human)   GTVACGEPPVVEHARTFGQKKDR..YEINSLVRYQCTEGFVQRHMPTIRCQPSGHWEEPRITCTDATTYKRRLQKRSSRHPRRSRPSTAH
PGCA_RAT (Aggrecan: rat)       GTVACGEPPAVEHARTLGQKKDR..YEISSLVRYQCTEGFVQRHVPTIRCQPSADWEEPRITCTDPNTYKHRLQKRTMRPTRRSRPSMAH
PGCA_BOVIN (Aggrecan: bovine)  GTVACGEPPVVEHARIFGQKKDR..YEINALVRYQCTEGFIQGHVPTIRCQPSGHWEEPRITCTDPATYKRRLQKRSSRPLRRSHPSTAH
A55182 (Aggrecan: mouse)       GTVACGDPPVVEHARTLGQKKDR..YEISSLVRYQCTEGFVQRHVPTIRCQPSGHWEEPRITCTDPNTYKHRLQKRTMRPTRRSRPSMAH
S39796 (Aggrecan: chick)       GTVACGDPPVVENARTFGRKKDR..YEINSLVRYQCDHGYIQRHVPTIRCQPNGHWEEPRISCTNPSSYQRRLYKRSPRSRLRPGWHRPTH
A54423 (Brevican: bovine)      GLVSCGPPPELPLAEVFGRPRLR..YEVDTVLRYRCREGLTQRNLPLIRCQENGRWGLPQISCVPRRPARALRPVEAQEGRPWRLVGHWKARLNPSPNPAPGP
S49126 (Brevican: rat)         GLVSCGPPPQLPLAQIFGRPRLR..YAVDTVLRYRCRDGLAQRNLPLIRCQENGLWEAPQISCVPRRPARALRSMTAPEGPRGQLPRQRKALLTPPSSL
886890 (Brevican: mouse)       GLVSCGPPPQLPLAQIFGRPRLR..YAVDTVLRYRCRDGLAQRNLPLIRCQENGLWEAPQISCVPRRPGRALRSMDAPEGPRGQLSRHRKAPLTPPSSL
S28764 (Neurocan: rat)         GTVLCGPPPAVENASLVGVRKVK..YNVHATVRYQCDEGFSQHHVATIRCRSNGKWDRPQIVCTKPRRSHRMRRHHHHPHRHHKPRKEHRKHKRHPAEDWEKDEGDFC
S52781 (Neurocan: mouse)       GTVLCGPPPAVENASLVGVRKIK..YNVHATVRYQCDEGFSQHRVATIRCRNNGKWDRPQIMCIKPRRSHRMRRHHHHPHRHHKPRKEHRKHKRHPAEDWEKDEGDFC
PGCS_HUMAN (Versican: human)   GTVACGQPPVVENAKTFGKMKPR..YEINSLIRYHCKDGFIQRHLPTIRCLGNGRWAIPKITCMNPSAYQRTYSMKY FKNSSSAKDNSINTSKHDHRWSRRWQESRR
862461 (Versican: mouse)       GTVACGQPPVVENAKTFGKMKPR..YEINSLIRYHCKDGFIQRHLPTIRCLGNGRWAMPKITCMNPSAYQRTYSKKY LKNSSSAKDNSINTSKHEHRWSRR QETRR
A47171 (Versican: chick)       GTVACGQPPVVENAKTFGKMKPR..YEINSLIRYHCKDGFIQRHIPTIRCQGNGRWDMPKITCMNPSTYQRTYSKKYYYKHSSSGKGTSLNSSKHYHRWIRTWQDSRR

FACTOR H SCR SEQUENCES
1hfi sequence (Factor H SCR-15)    EKIPCSQPPQIEHGTINSSRSSQESYAHGTKLSYTCEGGFRISEENETTCYM.GKWSSP.PQCE
1hfh-1 sequence (Factor H SCR-15)  EKIPCSQPPQIEHGTINSSRSSQESYAHGTKLSYTCEGGFRISEENETTCYM.GKWSSP.PQCE
1hcc sequence (Factor H SCR-16)    EGLPCKSPPEISHGVVAHMSDS...YQYGEEVTYKCFEGFGIDGPAIAKCLG.EKWSHP.PSCI
1hfh-2 sequence (Factor H SCR-16)   GLPCKSPPEISHGVVAHMSDS...YQYGEEVTYKCFEGFGIDGPAIAKCLG.EKWSHP.PSCI

CONSERVED RESIDUES (listed in sequence order)
>90% CONSERVED (13 G3 SEQ):    G V C G P P I A G R R Y I V R Y C E G Q R V I R C G W P I C
>40% CONSERVED (101 SCR SEQ):  C P P I N G I Y G E V Y C G Y G I C G W P C
Matches (15/21)

PREDICTED SEC STRUCTURE (1988): β-strands B1 to B6. Matches (41/62)
OBSERVED SEC STRUCTURE: [DSSP tracks for 1hfi (Factor H SCR-15), 1hfh-1 (Factor H SCR-15), 1hcc (Factor H SCR-16) and 1hfh-2 (Factor H SCR-16), with the averaged β-strands B1 to B6]

PREDICTED HYDROPATHY (1988): Matches (48/62)
OBSERVED ACCESSIBILITY: [COMPARER side-chain accessibility tracks for 1hfi (Factor H SCR-15), 1hfh-1 (Factor H SCR-15), 1hcc (Factor H SCR-16) and 1hfh-2 (Factor H SCR-16)]