To assess the extent of SCP1 presence outside the chordates, SCP1 protein sequences were collected as described in section 4.2 for a wide range of organisms across the Metazoa. The list was built with the aim both of being exhaustive and of sampling phyla that were underrepresented in previous analyses (Fraune et al., 2012). The chimaera (Callorhinchus milii) was added to the vertebrates as a basal fish lineage, together with additional echinoderm species, including a second echinoid (Lytechinus variegatus) and a member of the less divergent asteroids (Asterias amurensis). A single hemichordate SCP1 sequence, from Saccoglossus kowalevskii, was also identified, giving examples of SCP1 from all three main deuterostome phyla. In the Protostomia, an additional lophotrochozoan sequence was obtained from the Mollusca (Pomacea canaliculata), though no additional ecdysozoan members were obtained (beyond the highly divergent and short Petrolisthes cinctipes sequence). No additional members of the Cnidaria were obtained beyond Hydra vulgaris and the short Nematostella vectensis SCP1 EST. Finally, the poriferan Amphimedon queenslandica and the ctenophore Pleurobrachia pileus represent the sole examples of SCP1 so far identified in these phyla. Full species names, groups and accession numbers are given in Appendix 7.6.

All identified SCP1 sequences were aligned using CLUSTAL Omega (Sievers et al., 2011) and visualised in Jalview (Waterhouse et al., 2009). The full SCP1 alignment can be found in Appendix 7.7 (figure 7.3). An 83 aa motif (CM1), which lies within the N-terminal coiled-coil domain, has previously been observed to be conserved between hydra and vertebrates, including rat (Fraune et al., 2012), and it was highly conserved across all species examined in this study (figure 4.6). However, a few species stand out from this analysis as being divergent. One of these is Petrolisthes, which provides the only example of SCP1 genes within the Ecdysozoa. Of the two ESTs identified in Fraune et al. (2012), one was not included in this study due to its highly divergent CM1 motif. The remaining Petrolisthes sequence is still divergent, even compared with the cnidarian, poriferan and ctenophoran sequences. It is possible that these are not in fact SCP1 genes, or that they represent contamination, as other ecdysozoans have evolved non-homologous genes to fulfil the role of SCP1. Capitella SCP1 is another divergent sequence, in this case having a deletion of the C-terminal end of the CM1 motif, although alignable sequence resumes shortly after the deletion. This divergence may be an artefact of protein prediction from the genomic shotgun sequence. Finally, Nematostella is represented by a very short EST read and lacks the C-terminal end of the CM1 motif as well as any further C-terminal peptide sequence. Despite these anomalies, the CM1 motif shows 70.5% similarity and 44.3% identity between rat and Pleurobrachia, showing that it has maintained high conservation across metazoan evolution. An alignment of this CM1 motif is visualised in figure 4.6.
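As an illustration of how pairwise identity and similarity percentages of this kind can be derived from two aligned sequences, the following minimal Python sketch may help. It is not the procedure used in this study: the two sequences are invented toy strings rather than SCP1 data, the similarity groups are a hypothetical simplification (alignment tools normally take similarity from a substitution matrix), and the choice to use only ungapped columns as the denominator is an assumption.

```python
# Toy illustration only: these sequences are invented, not real SCP1 data.
# Hypothetical similarity groups; real tools usually use a substitution matrix.
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"),
                  set("DE"), set("ST"), set("NQ")]


def is_similar(a, b):
    """Return True if two residues are identical or share a similarity group."""
    if a == b:
        return True
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(seq_a, seq_b, gap="-"):
    """Percent identity and similarity over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in pairs)
    similar = sum(is_similar(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)


seq_a = "MKLEDIRS-A"  # toy aligned sequence with one gap
seq_b = "MRLDEVKSTG"  # toy aligned sequence
identity, similarity = identity_and_similarity(seq_a, seq_b)
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Because identical residues also count as similar, similarity is always at least as high as identity, which is consistent with the two figures reported for rat and Pleurobrachia.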

To test the phylogenetic relationships of SCP1 proteins, both neighbour joining (NJ) and maximum likelihood (ML) trees were built using the conserved 83 aa CM1 motif. The coiled-coil domain-containing CCDC39 proteins from Drosophila, Strongylocentrotus and human were used as an outgroup, aligned with the coiled-coil region covered by the 83 aa CM1 motif. Phylogenetic models were tested using ProtTest (Abascal et al., 2005). The neighbour joining tree was built using the Poisson model with 1000 bootstraps, and the maximum likelihood tree was built using the LG+G model with 1000 bootstraps. Capitella was removed from the analysis because a large part of the C-terminal end of its CM1 motif is missing.
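For readers unfamiliar with the two ingredients of the NJ analysis, the sketch below shows the Poisson-corrected distance in its usual formulation, d = -ln(1 - p), where p is the proportion of differing sites, and the way a bootstrap pseudo-replicate is drawn by resampling alignment columns with replacement. It is a minimal illustration under stated assumptions, not a reproduction of the software used here: the taxon names and sequences are invented, gaps are handled by pairwise deletion as an assumption, and tree building itself is omitted.

```python
import math
import random


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing residues over columns without gaps (pairwise deletion)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


def bootstrap_columns(alignment, rng):
    """Draw one pseudo-replicate by sampling alignment columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


# Invented toy alignment; taxon names are hypothetical placeholders.
toy = {
    "taxonA": "MKLEDIRSAA",
    "taxonB": "MKLEDIRSAG",
    "taxonC": "MRLDEVKSTG",
}

d_ab = poisson_distance(toy["taxonA"], toy["taxonB"])
replicate = bootstrap_columns(toy, random.Random(0))
print(f"Poisson distance A-B: {d_ab:.4f}")
print(replicate)
```

In a full analysis, a distance matrix would be computed and a tree built for each of the 1000 pseudo-replicates, and the bootstrap value of a branch would be the percentage of replicate trees that recover it.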

Although bootstrap support values are low on many branches, the NJ (figure 4.7 A) and ML (figure 4.7 B) analyses produce similar topologies, and these broadly match the known relationships of the taxa. The vertebrates group together with significant support, as do the different vertebrate groups such as mammals, fish and lizards/birds. Amphioxus SCP1 consistently groups with the hemichordate Saccoglossus rather than with the vertebrate chordates, with the asteroid Asterias branching further down. This is an interesting grouping with regard to the evolutionary relationships of the echinoderms, hemichordates and chordates, though it has no support. The tunicates appear to group with the echinoids, though this is on a very long branch and has no support. This grouping could perhaps reflect the divergent evolution of both echinoids and tunicates. Strangely, the cnidarian Nematostella groups with the lophotrochozoan clade, again with no support. This could be due to the short Nematostella transcript and a resulting loss of phylogenetic signal. Both Pleurobrachia and Amphimedon branch basal to the other Metazoa as expected, though with long branch lengths and very low bootstrap support. Petrolisthes consistently groups basal to all lineages other than Pleurobrachia and Amphimedon, including the Cnidaria. This may represent one of several scenarios: an extremely divergent SCP1 in Petrolisthes; a short sequence that is not ecdysozoan and in fact represents contamination; or a Petrolisthes sequence that is not actually an SCP1 homologue. It should be pointed out that conclusions about phylogenetic grouping can only really be made for groups with significant support (>70%). In addition, taxa with long branch lengths or divergent sequences are problematic to place reliably.

Figure 4.6. The SCP1 protein CM1 motif is conserved across the Metazoa.

A CLUSTAL Omega protein multiple alignment of the CM1 domains of SCP1 shows a high level of conservation over an 83 aa motif in the metazoan species examined. Conservation is visualised with false colour using the ClustalX colour table for amino acids. Conservation is given below the alignment, in yellow-brown, as a score out of 10 across all aligned sequences. The same is given for the quality of alignment, represented by the sequence similarity. Finally, a consensus sequence made up of the most abundant amino acid at each position is given in black. Names of the species used are given to the left of the alignment. Numbers in parentheses indicate the position of the CM1 motif amino acids within the native peptide sequence.

Figure 4.7. Phylogeny of metazoan SCP1 proteins

(A) Neighbour joining tree built using the 83 aa CM1 domain of SCP1 proteins, using the Poisson model and 1000 bootstraps. (B) Maximum likelihood tree built using the 83 aa CM1 domain of SCP1 proteins, using the LG+G model with 1000 bootstraps. CCDC39 proteins were used as an outgroup to SCP1. Bootstrap values over 50% are given. Longer branches indicate a greater evolutionary distance between nodes. Trees were built using MEGA6.

4.4. Discussion