

The clustering analysis using OrthoMCL assigned 113,129 genes from 10 genomes and transcriptomes into 21,700 orthologous groups, with an additional 39,631 sequences excluded from the analysis. The results are summarised in Figure 4.3.1. For the full table of results see S4.3.1.

It is assumed that the ‘unclustered’ sequences were excluded from the analysis because they have no orthologues or paralogues in any of the genomes included. Therefore, they have been labelled single copy species-specific sequences. OrthoMCL assigned 6,782 sequences from the P. lacertae genome, 9,273 from the C. roenbergensis transcriptome and 1,084, 894 and 1,039 from Blastocystis sp. ST1, ST4 and ST7 respectively to the unclustered group (Figure 4.3.1).
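The bookkeeping behind the ‘unclustered’ counts can be illustrated with a minimal sketch. It assumes the clusters are available as text lines of the form `GROUP_ID: taxon|sequence taxon|sequence ...`; the taxon labels (`plac`, `croe`, `bst7`), the sequence names and the helper functions are hypothetical and are not taken from the analysis itself.

```python
from collections import Counter


def parse_groups(lines):
    """Parse lines such as 'MCL0001: plac|g1 bst7|g1' into {group_id: [members]}."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split(":", 1)
        groups[group_id.strip()] = members.split()
    return groups


def unclustered_per_taxon(all_sequences, groups):
    """Count, per taxon, the input sequences that appear in no group."""
    clustered = {seq for members in groups.values() for seq in members}
    return Counter(
        seq.split("|", 1)[0] for seq in all_sequences if seq not in clustered
    )


# Toy data (hypothetical labels, not real results).
example_lines = [
    "MCL0001: plac|g1 plac|g2 bst7|g1",
    "MCL0002: croe|g1 bst7|g2",
]
example_sequences = [
    "plac|g1", "plac|g2", "plac|g3",
    "bst7|g1", "bst7|g2",
    "croe|g1", "croe|g2",
]

groups = parse_groups(example_lines)
unclustered = unclustered_per_taxon(example_sequences, groups)
print(dict(unclustered))
```

In this toy input, `plac|g3` and `croe|g2` belong to no group, so each of those two taxa contributes one unclustered sequence.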

The ‘conserved’ sequences that feature in the non-overlapping sections of Figure 4.3.1 denote sequences from orthogroups that contain a representative sequence from one of the P. lacertae, C. roenbergensis or Blastocystis sp. genomes, and another Stramenopile. For example, orthogroup MCL1012 contains 187 sequences: 51 from P. lacertae, 66 from P. sojae, 42 from P. ultimum, 23 from S. diclina, 4 from E. siliculosus and 1 from T. pseudonana, but none from any Blastocystis sp. STs or C. roenbergensis. Thus, these sequences are conserved throughout the Stramenopiles but do not overlap with the closest relatives of P. lacertae (Figure 4.3.1).

OrthoMCL also assigned a number of multi-copy species-specific groups containing 13,642 sequences for P. lacertae, 6,291 for C. roenbergensis and 1,899, 1,647 and 1,549 for Blastocystis sp. ST1, ST4 and ST7 respectively. These groups are labelled ‘specific’ in Figure 4.3.1 to differentiate them from the ‘conserved’ orthogroups (above). P. lacertae had the largest repertoire of species-specific genes of any genome included in the analysis. The largest of these groups contained over 800 proteins, though because the groups are by definition specific to P. lacertae, many of their sequences share little homology with sequences in the database.

There is a core group of 892 conserved orthogroups that are found in at least one Blastocystis genome, the C. roenbergensis transcriptome and the P. lacertae genome. They may also be present in one or more of the other Stramenopile genomes. Of these, 527 contain representatives from all the genomes included in the analysis, the largest of which (MCL1021) contains 131 sequences overall: 8 from P. lacertae, 3 from C. roenbergensis, 8 from Blastocystis sp. ST1, 5 from Blastocystis sp. ST7 and 2 from Blastocystis sp. ST4. Of these orthogroups, 134 are single-copy, containing only a single sequence from each genome.
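The three nested criteria described here (present in the focal lineages, present in every genome, single-copy in every genome) can be sketched as simple tests on per-taxon sequence counts. The taxon labels below are hypothetical abbreviations for the species named in the text, and the assumption that exactly these ten labels make up the analysis is made for illustration only.

```python
from collections import Counter

# Hypothetical taxon labels; the composition of this set is an assumption.
BLASTOCYSTIS = {"bst1", "bst4", "bst7"}
ALL_TAXA = BLASTOCYSTIS | {"plac", "croe", "psoj", "pult", "sdic", "esil", "tpse"}


def taxon_counts(members):
    """Count sequences per taxon for members written as 'taxon|sequence'."""
    return Counter(m.split("|", 1)[0] for m in members)


def in_focal_core(counts):
    """At least one Blastocystis subtype, plus C. roenbergensis and P. lacertae."""
    has_blasto = any(counts[t] > 0 for t in BLASTOCYSTIS)
    return has_blasto and counts["croe"] > 0 and counts["plac"] > 0


def in_all_genomes(counts):
    """At least one sequence from every genome in the analysis."""
    return all(counts[t] > 0 for t in ALL_TAXA)


def is_single_copy(counts):
    """Exactly one sequence from every genome in the analysis."""
    return all(counts[t] == 1 for t in ALL_TAXA)


# Toy orthogroups (not real data).
single = taxon_counts([f"{t}|g1" for t in sorted(ALL_TAXA)])
multi = taxon_counts([f"{t}|g1" for t in sorted(ALL_TAXA)] + ["plac|g2"])
partial = taxon_counts(["plac|g1", "croe|g1", "bst4|g1"])

print(in_focal_core(partial), in_all_genomes(partial))
print(in_all_genomes(multi), is_single_copy(multi))
print(is_single_copy(single))
```

Each test is strictly narrower than the one before it, which matches the way the counts in the text shrink from the focal core to the single-copy set.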

Figure 4.3.1 Venn diagram showing numbers of shared orthologs between Blastocystis sp. ST7, P. lacertae and C. roenbergensis. Sequences clustered by OrthoMCL. ‘Specific’ refers to sequences from groups containing only one species, ‘conserved’ sequences are those with no orthologs in the other genomes shown here but which do have orthologs in other genomes included in the analysis and ‘unclustered’ sequences are assumed to represent single-copy species-specific proteins.

Blastocystis sp. ST7 appears to share fewer genes with C. roenbergensis than P. lacertae does. This is an interesting result, as the phylogeny (Figure 2.3.6) shows that P. lacertae and Blastocystis are equidistant from C. roenbergensis, so they might be expected to share roughly equal numbers of genes with it. This also shows that, while the Blastocystis genomes are much smaller than the P. lacertae genome, the majority of this size difference is accounted for by the large species-specific repertoire of P. lacertae.

Interestingly, there were also substantial differences between the subtypes of Blastocystis, not just in terms of copy number within orthogroups but also in representation between orthogroups. The differences between the species-specific repertoires of the Blastocystis sp. STs are summarised in Figure 4.3.2. Only ~1,000 Blastocystis-specific genes are shared between all subtypes, and each genome appears to have its own repertoire of subtype-specific genes that is greater than the number of genes shared between all three.

Figure 4.3.2 Venn diagram showing differences in orthologs between Blastocystis-specific sequences. Sequences were clustered by OrthoMCL and the values represent the number of sequences from each subtype in Blastocystis-specific orthologous groups.

As the Blastocystis sp. STs have smaller genomes than other Stramenopiles, the origin of those differences was investigated. The largest differences came from species-specific orthogroups in P. lacertae; however, examination of conserved orthogroups also showed differences in the number of sequences contributed to conserved groups by the Blastocystis sp. STs. To understand these differences, the orthogroups were further broken down into conserved groups with differing conservation profiles (Figure 4.3.3).

‘Lineage specific’ orthogroups were defined as any orthogroup that contained representatives from both P. lacertae and Blastocystis sp. STs only, such as MCL2961 which contains a single sequence from both Blastocystis sp. ST4 and

need be represented. For C. roenbergensis, this was defined as any orthogroup containing a C. roenbergensis sequence and a sequence from at least one other Stramenopile, but none from P. lacertae or Blastocystis. ‘Core Stramenopile’ orthogroups must contain at least one sequence from each genome, such as MCL1021 (above). ‘Lost from Stramenopiles’ orthogroups contain sequences from any two of P. lacertae, Blastocystis sp. STs and C. roenbergensis but no sequence from another Stramenopile, such as MCL5023, which contains a single sequence each from C. roenbergensis, P. lacertae and Blastocystis sp. ST4 and two sequences each from Blastocystis sp. ST1 and ST7.

The final group, ‘lost from sister species’, comprises orthogroups with representative sequences from C. roenbergensis or another Stramenopile and from either P. lacertae or Blastocystis sp., but not both. While P. lacertae contributes more orthologues than any of the Blastocystis sp. STs in each of these conserved categories, the difference is most striking in the ‘lost from sister species’ group (Figure 4.3.3).

This group is indicative of cases where roughly equal numbers of orthologues would be expected in P. lacertae and Blastocystis sp., yet all Blastocystis subtypes included here appear to have fewer.
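The category definitions given above can be expressed as a small rule-based classifier over the set of taxa present in an orthogroup. This is a minimal sketch only: the taxon labels are hypothetical, the order in which the rules are tested is an assumption (several definitions overlap as written, for example a group containing only C. roenbergensis and P. lacertae), and profiles that match none of the definitions stated in full in the text are returned as ‘unclassified’ rather than guessed.

```python
# Hypothetical taxon labels for the species named in the text.
BLASTOCYSTIS = {"bst1", "bst4", "bst7"}
OTHER_STRAMENOPILES = {"psoj", "pult", "sdic", "esil", "tpse"}
ALL_TAXA = BLASTOCYSTIS | OTHER_STRAMENOPILES | {"plac", "croe"}


def classify(taxa):
    """Assign an orthogroup to a category from the set of taxa it contains.

    The precedence of the rules below is an assumption, not the author's.
    """
    taxa = set(taxa)
    has_plac = "plac" in taxa
    has_blasto = bool(taxa & BLASTOCYSTIS)
    has_croe = "croe" in taxa
    has_other = bool(taxa & OTHER_STRAMENOPILES)

    # At least one sequence from every genome.
    if taxa >= ALL_TAXA:
        return "core Stramenopile"
    # P. lacertae and Blastocystis only.
    if has_plac and has_blasto and not has_croe and not has_other:
        return "lineage specific"
    # Two or more of the three focal lineages, no other Stramenopile.
    if not has_other and (has_plac + has_blasto + has_croe) >= 2:
        return "lost from Stramenopiles"
    # C. roenbergensis or another Stramenopile, plus exactly one of
    # P. lacertae / Blastocystis.
    if (has_croe or has_other) and (has_plac != has_blasto):
        return "lost from sister species"
    return "unclassified"


print(classify({"plac", "bst4"}))
print(classify(ALL_TAXA))
print(classify({"croe", "plac", "bst1", "bst4", "bst7"}))
print(classify({"psoj", "plac"}))
```

The third call uses the taxon profile the text gives for MCL5023 and falls into the ‘lost from Stramenopiles’ category under these rules.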

Conserved orthogroups were examined and found to routinely contain fewer representative sequences from Blastocystis. These are sequences that are otherwise conserved in the Stramenopiles but are present in lower copy number in all Blastocystis subtypes despite the differences between them.

Figure 4.3.3 Number of sequences contained in different cluster categories assigned by OrthoMCL for multiple Stramenopile genomes. Clusters have been defined as ‘lineage specific’ (only found in a given species), ‘core Stramenopile’ (found in all Stramenopile

Figure 4.3.3 shows that the number of conserved sequences in P. lacertae is consistently higher than in any Blastocystis genome, while the number of conserved sequences lost from Blastocystis is substantially smaller than the corresponding number in P. lacertae. This suggests that the genomes of the Blastocystis sp. STs are missing sequences from highly conserved gene families that are common across the large phylogenetic distances within the Stramenopiles. Comparison with P. lacertae demonstrates that the absence of these widely conserved genes is unique to Blastocystis and must have occurred after the separation of Blastocystis and Proteromonas from their common ancestor.