2.3.7.1 Pathway analysis

Ingenuity Pathway Analysis (IPA, version 9.0; Ingenuity Systems, Inc., Redwood City, California, USA; www.ingenuity.com) was used to perform network and pathway analyses. The full Limma-generated gene lists for each treatment comparison were uploaded into IPA.

Chapter 2: Methods

Canonical pathways analysis identified the pathways from the IPA library of canonical pathways that were most significant to the dataset. Molecules from the dataset that met the absolute fold change cutoff of 1.5 and were associated with a canonical pathway in Ingenuity’s Knowledge Base were considered for the analysis. The significance of the association between the dataset and the canonical pathway was measured in two ways: 1) a ratio of the number of molecules from the dataset that map to the pathway divided by the total number of molecules that map to the canonical pathway, with pathways considered significant when at least 10% of the genes from a particular pathway were differentially expressed in the microarray dataset; and 2) a P value giving the probability that the association between the genes in the dataset and the canonical pathway is explained by chance alone. A probability less than 0.05 was considered significant.
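The two measures above can be illustrated with a minimal sketch. It assumes the P value comes from a right-tailed hypergeometric (Fisher's exact) test, which this section does not state explicitly; the function names and the example counts are hypothetical and are not taken from IPA.

```python
from math import comb


def pathway_ratio(hits_in_pathway: int, pathway_size: int) -> float:
    """Fraction of a pathway's molecules that are present in the dataset."""
    return hits_in_pathway / pathway_size


def right_tailed_p(hits_in_pathway: int, pathway_size: int,
                   dataset_hits: int, reference_size: int) -> float:
    """P(overlap >= hits_in_pathway) under the hypergeometric distribution.

    Assumption: dataset_hits molecules are drawn at random from a reference
    set of reference_size molecules, pathway_size of which are in the pathway.
    """
    max_overlap = min(pathway_size, dataset_hits)
    total = comb(reference_size, dataset_hits)
    tail = sum(
        comb(pathway_size, k) * comb(reference_size - pathway_size, dataset_hits - k)
        for k in range(hits_in_pathway, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 8 of 50 pathway molecules pass the fold change cutoff,
# out of 400 qualifying molecules and a reference set of 20000.
ratio = pathway_ratio(8, 50)
p_value = right_tailed_p(8, 50, 400, 20000)
print(f"ratio >= 10%: {ratio >= 0.10}; P < 0.05: {p_value < 0.05}")
```

The ratio reflects how much of the pathway is affected, whereas the P value reflects how unlikely the overlap is by chance, so the two measures can disagree for very small or very large pathways.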

Upload settings for datasets were “flexible format”, “all identifiers”, fold change (FC) and false discovery rate (FDR) as “observation” and Agilent probe name as “identifier”, and the array platform was set as “whole mouse genome 4 x 44K”. Core analysis settings were: direct and indirect relationships, endogenous chemicals included, 140 molecules per network and 25 networks per analysis; data sources, species, and tissues and cell lines were each set to “all”, and confidence was set as “experimentally observed”. Duplicates were resolved using median fold change, and molecules were coloured using FC. The number of network-eligible molecules was calculated using the following cutoffs: absolute FC greater than 1.5 and FDR less than 0.05.
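As a rough illustration of the duplicate resolution and eligibility cutoffs described above, the following sketch takes the median FC across probes mapping to the same gene and then applies the two thresholds. The gene names, values and function names are hypothetical, and how IPA assigns an FDR to a gene with several probes is not described here, so FDR is passed in separately.

```python
from collections import defaultdict
from statistics import median

FC_CUTOFF = 1.5    # absolute fold change must be greater than this
FDR_CUTOFF = 0.05  # FDR must be less than this


def median_fc_per_gene(probes):
    """Collapse (gene, signed fold change) pairs to one median FC per gene."""
    grouped = defaultdict(list)
    for gene, fc in probes:
        grouped[gene].append(fc)
    return {gene: median(values) for gene, values in grouped.items()}


def is_network_eligible(fc: float, fdr: float) -> bool:
    """Apply the cutoffs stated in the text: |FC| > 1.5 and FDR < 0.05."""
    return abs(fc) > FC_CUTOFF and fdr < FDR_CUTOFF


# Hypothetical probes: GeneA is measured three times, GeneB once.
probes = [("GeneA", 1.8), ("GeneA", 2.2), ("GeneA", 2.0), ("GeneB", -1.2)]
resolved = median_fc_per_gene(probes)
print(resolved["GeneA"], is_network_eligible(resolved["GeneA"], 0.01))
```

Both cutoffs are strict inequalities in this sketch, so a molecule with an FC of exactly 1.5 or an FDR of exactly 0.05 is not counted as eligible.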

Datasets were examined for top functions and disease processes, as well as top canonical pathways. Pathways of interest were overlaid with expression values from each dataset for visual comparison. It was noted during analysis that, when comparing changes between two datasets, the intensity of expression shown in diagrams of canonical pathways in IPA is relative to the set of expression values in that comparison. Hence, while the colour intensity within each set of pathways is meaningful for that one dataset, it may not be directly comparable between datasets. Figure 2.4 provides an example of the type of diagram that can be produced using IPA to examine gene networks.

2.3.7.2 Over-representation analysis

The Expression Analysis Systematic Explorer (EASE) version 2.0 tool was developed to automate the process of biological theme determination for lists of genes [314]. Lists of significantly differentially expressed genes (FDR < 0.05, absolute biological FC > 1.5) for each treatment comparison were uploaded into EASE. The three basic functions of EASE are: theme discovery; customisable linking to online tools; and the creation of descriptive annotation tables. EASE software uses the gene ontologies or the gene associations provided by Gene Ontology (GO) Consortium members to perform a hypergeometric test on a list of genes, to determine whether particular GO terms are significantly over-represented in the list compared to all genes [314].


Figure 2.4 Example of a network diagram generated in IPA. * denotes genes that are detected two or more times on an array. Genes or gene products are represented as nodes, and the biological relationship between two nodes is represented as a line. All relationships are supported by at least one reference from the literature. Red and green coloured nodes indicate the direction of fold change, with red indicating an increase in expression and green indicating a decrease in expression. Colour intensity is correlated with the degree of change in expression, with greater intensity representing a larger change. Nodes are displayed with various shapes that represent the functional class of genes.


EASE calculates over-representation with respect to the total number of genes assayed, and gene identifiers are converted such that a single gene represented by more than one identifier receives only one “vote” for each of the categories it belongs to. Two statistical measures may be used: the one-tailed Fisher exact probability or a variant of this called the EASE score. The EASE score is recommended because it is a conservative adjustment to the Fisher exact probability and therefore favours more robust categories by penalising the significance of categories supported by fewer genes. EASE calculates a wide variety of probability corrections, including Bonferroni-type methods, FDR and bootstrap methods [314]. EASE scores less than 0.05 were considered significant.
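The difference between the two measures can be sketched as follows. This assumes the commonly described jackknife form of the EASE score, in which one supporting gene is removed from the list before the one-tailed Fisher exact probability is recalculated; the exact table construction used by EASE 2.0 may differ, and the function names and counts below are hypothetical.

```python
from math import comb


def fisher_one_tailed(list_hits: int, list_total: int,
                      pop_hits: int, pop_total: int) -> float:
    """P(X >= list_hits) for a gene list drawn from the assayed population."""
    upper = min(list_total, pop_hits)
    denom = comb(pop_total, list_total)
    tail = sum(
        comb(pop_hits, k) * comb(pop_total - pop_hits, list_total - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / denom


def ease_score(list_hits: int, list_total: int,
               pop_hits: int, pop_total: int) -> float:
    """Assumed jackknife variant: drop one category member from the list."""
    if list_hits < 1:
        return 1.0  # a category with no supporting genes cannot be enriched
    return fisher_one_tailed(list_hits - 1, list_total - 1, pop_hits, pop_total)


# Hypothetical category supported by only 3 of 300 listed genes,
# with 40 category members among 30000 assayed genes.
fisher_p = fisher_one_tailed(3, 300, 40, 30000)
ease_p = ease_score(3, 300, 40, 30000)
print(f"Fisher exact: {fisher_p:.4f}; EASE score: {ease_p:.4f}")
```

Removing a single gene matters most when a category is supported by only a few genes, which is why the adjustment penalises thinly supported categories far more than well supported ones.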

2.3.7.3 Gene set enrichment analysis

Note: GSEA was performed by Dr Wayne Young (AgResearch Grasslands).

Gene set enrichment analysis (GSEA), unlike the analysis methods described previously, analyses entire gene sets rather than only genes that have already been determined to be differentially expressed. GSEA examines the probability value ranking of all genes in a dataset (the entire genome in microarray experiments) to determine whether the genes belonging to a biologically meaningful set, such as a KEGG pathway, have probability (P value) ranks that are not randomly distributed. GSEA moves away from a gene-centric view, where genes are considered significantly differentially expressed (or not) based on probability, to a method where genes are still ranked according to probability, but what matters is how a set of genes within a biologically meaningful pathway behaves [315]. GSEA was performed on the normalised data using the Bioconductor R package (R 2.12.1; R Foundation for Statistical Computing, Vienna, Austria). Only pathways with a probability less than 0.05 were considered significant.
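The idea of testing whether a gene set's ranks are non-randomly distributed can be illustrated with a minimal sketch of an unweighted, Kolmogorov-Smirnov-style running sum with a gene-label permutation test. This is an illustrative assumption only and is not necessarily the statistic computed by the Bioconductor package used here; it is written in Python rather than R, and all names and data are hypothetical.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Maximum deviation from zero of a running sum over the ranked list.

    ranked_genes: unique gene identifiers ordered by P value (smallest first).
    The sum steps up at set members and down at non-members.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += 1 / n_hit if gene in members else -1 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Share of random gene orderings scoring at least as extreme as observed."""
    rng = random.Random(seed)
    observed = abs(enrichment_score(ranked_genes, gene_set))
    shuffled = list(ranked_genes)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(enrichment_score(shuffled, gene_set)) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


# Hypothetical ranking of ten genes; the set occupies the three top ranks.
ranked = [f"g{i}" for i in range(1, 11)]
pathway = {"g1", "g2", "g3"}
print(enrichment_score(ranked, pathway), permutation_p(ranked, pathway))
```

Because every gene contributes to the running sum, a pathway whose members all shift modestly in the same direction can score highly even when no single member would pass a gene-level significance cutoff.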
