3.5.1 Co-evolution Module Construction
Modules of co-evolving grapevine genes were constructed separately under three models of evolution: evolution by gene duplication (gene family correlation modules), evolution by gene expression regulation (gene co-expression modules) and evolution by point mutations (Evolutionary Rate Covariation modules). A summary of the workflow used to construct these modules is shown in Figure 3.1.
3.5.1.1 Gene Family Correlation Modules
Translated genomes were obtained for 26 plant species: 25 (including grapevine) were downloaded from Plaza (version 2.5) [51], and the potato genome was downloaded from the Solanaceae Genomics Resource [http://solanaceae.plantbiology.msu.edu/]. Gene families were constructed across these 26 plant genomes using our newly developed Parallel-OrthoMCL (described elsewhere), a parallel version of the OrthoMCL software package [21] that allows gene families to be identified across much larger sets of genomes. A species-family matrix (SF-matrix) was constructed in which the columns represented plant species and the rows represented gene families, such that entry ij was the number of genes in gene family i present in species j. Gene families that were correlated across species were then determined by calculating the Pearson correlation coefficient between all pairs of rows of the SF-matrix using mcxarray [52] and applying an absolute threshold of 0.8. Modules of correlated gene families were then constructed by clustering the resulting thresholded network using MCL [52]. These modules were subsequently pruned to remove all non-grapevine genes, resulting in modules of correlated grapevine gene families.
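The correlation and thresholding step on the SF-matrix can be illustrated with a minimal Python sketch. It is not the mcxarray implementation used here: the function names, the toy matrix, the choice of `>=` for the threshold comparison and the handling of constant rows are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0  # assumption: a constant row is treated as uncorrelated
    return sum(a * b for a, b in zip(dx, dy)) / denom

def correlated_family_edges(sf_matrix, threshold=0.8):
    """Return (family_i, family_j, r) for all row pairs with |r| >= threshold."""
    edges = []
    for (fi, row_i), (fj, row_j) in combinations(sf_matrix.items(), 2):
        r = pearson(row_i, row_j)
        if abs(r) >= threshold:
            edges.append((fi, fj, r))
    return edges

# Toy SF-matrix: rows are gene families, columns are four hypothetical species.
sf_matrix = {
    "fam_A": [1, 2, 3, 4],
    "fam_B": [2, 4, 6, 8],  # scales with fam_A across species
    "fam_C": [4, 1, 3, 2],
}
edges = correlated_family_edges(sf_matrix)
```

The resulting edge list is the thresholded network that would then be passed to a clustering step such as MCL.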
3.5.1.2 Gene Co-expression Modules
472 microarray experiments using the Grapevine Affymetrix GeneChip were downloaded from Gene Expression Omnibus and processed using RMA [53]. Co-expression was then calculated as the Pearson correlation coefficient between the expression profiles of the probes [52]. An absolute threshold of 0.8 was applied and the resulting thresholded network was clustered using MCL [52] in order to create co-expression modules. Probes were mapped to their corresponding genes in order to produce modules of co-expressed grapevine genes.
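The final probe-to-gene mapping can be sketched as follows. The probe and gene identifiers are hypothetical. Dropping probes without a mapped gene, and collapsing several probes of one gene into a single member, are assumptions rather than details stated above.

```python
def probes_to_gene_modules(probe_modules, probe_to_gene):
    """Convert modules of probes into modules of genes.

    Probes with no entry in probe_to_gene are skipped (assumption), and
    modules left empty after mapping are discarded.
    """
    gene_modules = []
    for module in probe_modules:
        genes = {probe_to_gene[p] for p in module if p in probe_to_gene}
        if genes:
            gene_modules.append(genes)
    return gene_modules

# Hypothetical identifiers, for illustration only.
probe_modules = [{"p1", "p2"}, {"p3"}, {"p4"}]
probe_to_gene = {"p1": "g1", "p2": "g1", "p3": "g2"}
gene_modules = probes_to_gene_modules(probe_modules, probe_to_gene)
```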
3.5.1.3 Evolutionary Rate Covariation Modules
Modules of genes with similar evolutionary rates were constructed using the Mirror-Tree method adapted with the projection operator [12]. A `minimal gene family' was constructed around each grapevine gene by selecting the best ortholog or co-ortholog of that gene in each of the 26 plant species used in Parallel-OrthoMCL. A minimum family size threshold of 5 was applied. The amino acid sequences of the resulting gene families were aligned using MUSCLE [54]. The evolutionary distances between genes within each family were calculated using the ProtDist program from the PHYLIP package [55], resulting in a distance matrix for each grapevine gene. The distance matrices
were then unfolded into phylogenetic vectors $v_i$. All phylogenetic vectors were then normalised by their standard deviation, and the average phylogenetic vector was calculated as:
\[ v_{av} = \frac{1}{m} \sum_{i=1}^{m} \frac{v_i^s}{\lVert v_i^s \rVert} \tag{3.5.1} \]
where $m$ is the number of phylogenetic vectors, $v_i^s$ is a phylogenetic vector normalised by standard deviation and $\lVert \cdot \rVert$ is the Euclidean norm [12]. The projection operator in Equation 3.5.2 was applied to each of the original phylogenetic vectors.
\[ \varepsilon_i = v_i - v_{av} \langle v_i, v_{av} \rangle \tag{3.5.2} \]
The evolutionary rate covariation between grapevine genes was then calculated as the Pearson correlation coefficient between all pairs of the projected phylogenetic vectors $\varepsilon_i$ using mcxarray [52]. An absolute threshold of 0.9 was applied, after which the thresholded network was clustered using MCL [52]. The resulting clusters represent modules of grapevine genes with similar evolutionary rate covariation signatures.
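The averaging and projection steps of Equations 3.5.1 and 3.5.2 can be sketched in Python as below. This is an illustrative sketch only. It assumes that all phylogenetic vectors have the same length (the same species pairs in the same order) and that the population standard deviation is used. The function names are hypothetical, and the projection is implemented exactly as the equation is written, without rescaling the average vector.

```python
from math import sqrt

def std_normalise(v):
    """Divide a vector by its standard deviation (population form, an assumption)."""
    n = len(v)
    mean = sum(v) / n
    sd = sqrt(sum((x - mean) ** 2 for x in v) / n)
    return [x / sd for x in v]

def unit(v):
    """Scale a vector to unit Euclidean norm."""
    norm = sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def average_vector(vectors):
    """Equation 3.5.1: mean of the sd-normalised vectors, each scaled to unit norm."""
    m = len(vectors)
    units = [unit(std_normalise(v)) for v in vectors]
    return [sum(column) / m for column in zip(*units)]

def project_out(v, v_av):
    """Equation 3.5.2: subtract from v its component along the average vector."""
    dot = sum(a * b for a, b in zip(v, v_av))
    return [a - dot * b for a, b in zip(v, v_av)]

# Toy phylogenetic vectors (unfolded distance matrices), values are made up.
vectors = [[1.0, 2.0, 3.0], [2.0, 1.0, 4.0], [3.0, 5.0, 1.0]]
v_av = average_vector(vectors)
residuals = [project_out(v, v_av) for v in vectors]
```

The `residuals` would then be correlated pairwise and thresholded in the same way as the rows of the SF-matrix.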
3.5.2 Module Overlap and GO Enrichment
For each pair of evolutionary mechanisms, the module overlap was calculated between all pairs of modules using the Jaccard Index. For two sets A and B, the Jaccard Index $J_{AB}$ is defined as the size of the intersection of the two sets divided by the size of the union of the two sets (Equation 3.5.3).
\[ J_{AB} = \frac{|A \cap B|}{|A \cup B|} \tag{3.5.3} \]
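As a small worked example of Equation 3.5.3, the following Python sketch computes the Jaccard Index of two modules represented as sets of gene identifiers. The identifiers are hypothetical, and returning 0 for two empty sets is an assumption, since that case is not defined above.

```python
def jaccard_index(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty modules have zero overlap
    return len(a & b) / len(union)

module_a = {"gene1", "gene2", "gene3"}
module_b = {"gene2", "gene3", "gene4"}
overlap = jaccard_index(module_a, module_b)  # 2 shared genes out of 4 in total
```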
In the case of overlaps between ERC modules and gene family modules, inparalogs were excluded from intersections. This resulted in an overlap matrix for each pair of evolutionary mechanisms, in which the columns represented co-evolutionary modules from one mechanism, the rows represented co-evolutionary modules from another mechanism, and entry ij represented the Jaccard overlap between modules i and j. The right-tailed Fisher exact test was used to identify significant module overlaps. This was performed using a customized Perl program which made use of the Text::NSP::Measures::2D::Fisher Perl module from CPAN (http://www.cpan.org/). When testing the null hypothesis "module i is not enriched in module j", the p-value was calculated as:
\[ p = \frac{\binom{R}{x} \binom{T-R}{C-x}}{\binom{T}{C}} \tag{3.5.4} \]
where x is entry ij, R is the sum of row i, C is the sum of column j and T is the sum of all entries in the matrix. The Holm-Bonferroni method was used for multiple hypothesis correction [56]. An intersection cut-off of 2 was then applied. Networks were then constructed from the significant module overlaps and visualized in Cytoscape [23]. A combined co-evolution network was constructed by merging the three co-evolution networks constructed for pairs of evolutionary mechanisms. GO terms for the grapevine genes were downloaded from Plaza (version 2.5) [51] and mapped onto their corresponding modules. GO enrichment was performed on subsets of genes in local neighbourhoods of the combined co-evolution network using GOEAST [25]. For each subnetwork in question, two sets of genes were extracted using a customized Perl script: the genes present in the nodes and the genes present in the edges. GO term enrichment was performed on each of these sets of genes, called node enrichment and edge enrichment, respectively [25]. MultiGOEAST was used to compare the GOEAST results from the node enrichment and edge enrichment views of a subnetwork.
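The significance test and multiple-testing correction described in this section can be sketched in Python as follows. This is not the Perl implementation used here. The hypergeometric term follows Equation 3.5.4, which gives the probability of a single table. A right-tailed test sums that term over all counts at least as large as the observed x, which is how the tail is formed in this sketch. The significance level `alpha=0.05` and all function names are assumptions.

```python
from math import comb

def hypergeom_term(x, R, C, T):
    """Probability of observing exactly x, as in Equation 3.5.4."""
    return comb(R, x) * comb(T - R, C - x) / comb(T, C)

def fisher_right_tail(x, R, C, T):
    """Right-tailed p-value: probability of a count of x or larger."""
    upper = min(R, C)
    return sum(hypergeom_term(k, R, C, T) for k in range(x, upper + 1))

def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # Step-down: compare the k-th smallest p-value with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break  # all larger p-values are also retained
    return rejected

p_tail = fisher_right_tail(1, 2, 2, 4)
decisions = holm_bonferroni([0.01, 0.04, 0.03])
```

With three tests, the smallest p-value is compared with 0.05/3, the next with 0.05/2, and the procedure stops at the first p-value that fails its comparison.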
3.5.3 Module-GO-Term Network Construction
A GO-term network was constructed to investigate the functions present in the nodes and edges linking subnetworks 1, 2 and 4 (Figure S21). Each module was connected to nodes representing GO terms associated with that module. GO terms were also linked if they were within a distance of 2 from each other in the GO hierarchy.
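The rule for linking GO terms within a distance of 2 can be sketched as a breadth-first search. The sketch assumes that distance means path length along parent-child links, traversed in either direction. That interpretation, the function name and the toy term identifiers are assumptions rather than details given above.

```python
from collections import deque

def terms_within_distance(go_edges, start, max_dist=2):
    """Return {term: distance} for terms at most max_dist links from start.

    go_edges is an iterable of (child, parent) pairs; links are treated as
    undirected (assumption).
    """
    neighbours = {}
    for child, parent in go_edges:
        neighbours.setdefault(child, set()).add(parent)
        neighbours.setdefault(parent, set()).add(child)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        term = queue.popleft()
        if dist[term] == max_dist:
            continue  # do not expand beyond the distance limit
        for nxt in neighbours.get(term, ()):
            if nxt not in dist:
                dist[nxt] = dist[term] + 1
                queue.append(nxt)
    del dist[start]
    return dist

# Toy hierarchy with made-up term names: t4 -> t3 -> t1 -> root <- t2
go_edges = [("t1", "root"), ("t2", "root"), ("t3", "t1"), ("t4", "t3")]
linked = terms_within_distance(go_edges, "t3")
```

Here `t2` is three links away from `t3` and is therefore not linked to it.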
3.5.4 Gene-GO-Term Network Construction
The main functions linking subnetworks 1, 2 and 4 were selected as the central GO term nodes in the Module-GO-Term network (Figure S21), namely response to salt stress, cellulose biosynthetic process, defense response to bacterium, response to wounding, response to jasmonic acid stimulus, response to abscisic acid stimulus, apoptotic process and defense response. Genes that were present in subnetworks 1, 2 and 4 and annotated with at least 2 of these terms were then selected. A network was constructed in which each node represented either a gene (purple nodes) or a GO term (light green nodes) (Figure 3.4). GO terms were connected to genes annotated with that term, and GO terms were also linked if they were within a distance of 2 from each other in the GO hierarchy. Grapevine InterPro annotations as well as the Arabidopsis gene descriptions were downloaded from Plaza (version 2.5) [51]. Grapevine genes were annotated with the descriptions of their Arabidopsis orthologs, as determined by Parallel-OrthoMCL. The genes in this network were then assigned InterPro annotations and gene descriptions where possible.
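The gene selection rule (annotation with at least 2 of the central GO terms) can be sketched as below. The gene identifiers and their annotations are hypothetical. Only the term names are taken from the list above.

```python
def select_genes(gene_to_terms, central_terms, min_terms=2):
    """Return {gene: shared central terms} for genes with >= min_terms of them."""
    central = set(central_terms)
    selected = {}
    for gene, terms in gene_to_terms.items():
        shared = central & set(terms)
        if len(shared) >= min_terms:
            selected[gene] = shared
    return selected

central_terms = [
    "response to salt stress",
    "response to wounding",
    "defense response",
]
# Hypothetical genes and annotations, for illustration only.
gene_to_terms = {
    "geneA": ["response to salt stress", "response to wounding"],
    "geneB": ["defense response", "photosynthesis"],
    "geneC": ["defense response", "response to wounding", "response to salt stress"],
}
selected = select_genes(gene_to_terms, central_terms)
```

Each selected gene and its shared terms would then become gene-to-term edges in the network.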