
This section summarises the improvements in annotation gained by making use of the MIRIAM URIs already present in the original model. When annotating the stripped model, there were a number of species (e.g. Fus3PP) for which no annotation was retrieved because the identifier and name were unsuitable. In these cases, making use of the URIs present in the intact, original model increases annotation coverage. For instance, UniProtKB URIs can unambiguously identify species with uninformative identifiers such as GaGTP and complexC, allowing Saint to suggest appropriate species names, GO URIs and SBO terms.
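As a minimal sketch of why such URIs are unambiguous, the snippet below splits a MIRIAM URN of the form `urn:miriam:<datatype>:<accession>` into its two parts. The names `MiriamRef` and `parse_miriam_urn` are assumptions made for illustration and are not part of Saint; the percent-decoding step assumes that accessions containing a colon are percent-encoded inside the URN.

```python
from typing import NamedTuple
from urllib.parse import unquote


class MiriamRef(NamedTuple):
    """Hypothetical container for a parsed MIRIAM URN (not a Saint class)."""
    datatype: str   # e.g. "uniprot"
    accession: str  # e.g. "P08539"


def parse_miriam_urn(urn: str) -> MiriamRef:
    """Split 'urn:miriam:<datatype>:<accession>' into datatype and accession."""
    prefix = "urn:miriam:"
    if not urn.startswith(prefix):
        raise ValueError(f"not a MIRIAM URN: {urn!r}")
    # Split at the first colon after the prefix; the rest is the accession.
    datatype, sep, accession = urn[len(prefix):].partition(":")
    if not sep or not datatype or not accession:
        raise ValueError(f"malformed MIRIAM URN: {urn!r}")
    # Assumption: colons inside accessions are percent-encoded (e.g. GO%3A...).
    return MiriamRef(datatype, unquote(accession))


ref = parse_miriam_urn("urn:miriam:uniprot:P08539")
print(ref.datatype, ref.accession)
```

Unlike a free-text identifier such as GaGTP, the pair of data type and accession can be passed directly to the matching data source without any name disambiguation.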

Basic information on the annotation of the intact pheromone model by Saint is presented in Table 3.7, together with a summary of how the new annotation compares both with the original model and the re-annotated stripped model previously described (see Section 3.5.3.1). A detailed listing of the modifications made to the model is available in Table 3.8.

Overall, Saint produced more annotation when starting from the intact model versus the stripped model. The total number of Saint-supported URIs increased from 101 to 325, as compared with 174 URIs added to the stripped version of the model. While the original model did not contain SBO terms and the re-annotation of the stripped model added only 15, the annotation of the intact model produced a total of 21 terms. Further, while the re-annotation of the stripped model was able to add 11 more species names, a further two names were added when starting with an intact model.

|                      | Original model | Stripped model annotated by Saint | Original model annotated by Saint |
|----------------------|----------------|-----------------------------------|-----------------------------------|
| Species with names   | 10             | 21                                | 23                                |
| SBO terms            | 0              | 15                                | 21                                |
| Saint-supported URIs | 101            | 174                               | 325                               |

Table 3.7: The amount of annotation added by Saint when annotating the original version of the pheromone model and when annotating its stripped form. More annotation is added by Saint when the original model is annotated as compared to annotating the stripped model.
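The comparisons drawn from Table 3.7 are simple differences between its columns. The sketch below recomputes them from the reported counts; the dictionary layout and the function name `gains` are illustrative choices, not something taken from Saint.

```python
# Counts copied from Table 3.7, in column order:
# (original model, stripped model annotated by Saint, original model annotated by Saint)
table_3_7 = {
    "Species with names": (10, 21, 23),
    "SBO terms": (0, 15, 21),
    "Saint-supported URIs": (101, 174, 325),
}


def gains(counts):
    """Return how much annotation the intact run adds over each baseline."""
    original, stripped_annotated, intact_annotated = counts
    return {
        "over_original": intact_annotated - original,
        "over_stripped": intact_annotated - stripped_annotated,
    }


for label, counts in table_3_7.items():
    print(label, gains(counts))
```

For species names, for example, the difference between the last two columns is the "further two names" gained by starting from the intact model.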

| SPECIES ID | Old name      | New name                                            | SBO term    | URI                       |
|------------|---------------|-----------------------------------------------------|-------------|---------------------------|
| GaGTP      | GαGTP         | Guanine nucleotide-binding protein alpha-1 subunit  | SBO:0000245 | urn:miriam:uniprot:P08539 |
| GaGDP      | GαGDP         | Guanine nucleotide-binding protein alpha-1 subunit  | SBO:0000245 | urn:miriam:uniprot:P08539 |
| Fus3PP     |               | Mitogen-activated protein kinase FUS3               | SBO:0000245 | urn:miriam:uniprot:P16892 |
| Bar1aex    | Bar1activeEx  | Barrierpepsin                                       | SBO:0000245 | urn:miriam:uniprot:P12630 |
| Far1PP     |               | Cyclin-dependent kinase inhibitor FAR1              | SBO:0000245 | urn:miriam:uniprot:P21268 |
| Far1U      | Far1ubiquitin | Cyclin-dependent kinase inhibitor FAR1              | SBO:0000245 | urn:miriam:uniprot:P21268 |

Table 3.8: A detailed breakdown of recovered (in the MATCH column) and new or modified annotation (in the CHANGED columns) by Saint for the pheromone model, starting from an intact model. Saint discovers all annotation found in Table 3.6 plus the annotation shown in this table. Only names, SBO terms and Saint-supported URIs are shown in this table. Additionally, the large number of new GO terms are omitted for brevity. Whether the original or new species name is used is entirely at the discretion of the user, and these particular examples are not necessarily the only choice a user could make.

3.6 Discussion

Saint is a syntactic data integration application for the enhancement of systems biology models. It was developed as an interactive Web tool to annotate models with new MIRIAM resources and reactions, keeping track of data provenance so the modeller can make an informed decision about the quality of the suggested annotation. The system makes it easy for modellers to add explicit biological knowledge to their models, increasing a model’s usefulness both as a reference for other researchers and as an input for further computational analysis.

While Saint allows the user to see which data source provided the prospective annotation, annotation accepted by the user does not itself have any provenance information within the completed model. At the moment, both SBML and CellML are unable to make statements about annotations. The proposed annotation package for SBML Level 3 will address this and other issues [10], while the updated metadata specification for CellML will also allow such provenance statements¹³.

Saint can save time as well as add information that might not otherwise be introduced without the help of specialists like BioModels curators. Saint can be useful either as a first-pass annotation tool or as a way of adding useful cross-references and possible new reactions. Queries in Saint are sent to the data sources asynchronously, letting the user work on each result as it is returned, without having to wait for a complete response. With Saint, the user examines the retrieved annotation, chooses the required annotations, and then moves on to the next task. In general, Saint returns a large amount of prospective annotation for the modeller to examine. Not all of it will be useful; many suggested reactions will be beyond the scope of the uploaded model, and suggested ontology terms might be too general to be useful. However, Saint provides modeller-directed annotation, where some level of filtering is expected.
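The asynchronous querying pattern described above can be illustrated with a short sketch. It is not Saint's implementation: the source names are placeholders, and `query_source` only sleeps for a random interval to stand in for a remote call. The point is that each result is handled as soon as it arrives rather than after all sources have responded.

```python
import asyncio
import random


async def query_source(source: str, species_id: str) -> tuple[str, list[str]]:
    """Stand-in for a remote query; sleeps instead of calling a real service."""
    await asyncio.sleep(random.uniform(0.01, 0.05))
    return source, [f"{source} suggestion for {species_id}"]


async def annotate(species_id: str, sources: list[str]) -> list[str]:
    """Query all sources concurrently; return source names in arrival order."""
    tasks = [asyncio.create_task(query_source(s, species_id)) for s in sources]
    arrival_order = []
    for finished in asyncio.as_completed(tasks):
        source, suggestions = await finished
        # Each result can be presented to the user as soon as it arrives.
        print(f"{source}: {suggestions}")
        arrival_order.append(source)
    return arrival_order


# Hypothetical source names; the arrival order varies from run to run.
order = asyncio.run(annotate("Fus3PP", ["source_a", "source_b", "source_c"]))
```

Because the loop consumes results in completion order, a slow data source delays only its own suggestions and not those of the others.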

By examining how Saint behaves with both stripped and intact models, we have isolated the two main query types used within the application: id / name querying and querying based on pre-existing MIRIAM resources. The use of URIs to retrieve information is an effective method of bypassing the variable quality of identifiers and names.
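A rough sketch of how the two query types might be derived from a species is given below. The `Species` class and `build_queries` function are hypothetical simplifications introduced here, and the choice to issue URI queries alongside id / name queries is an assumption rather than a description of Saint's internals.

```python
from dataclasses import dataclass, field


@dataclass
class Species:
    """Hypothetical, simplified view of a model species (not a Saint class)."""
    id: str
    name: str = ""
    uris: list[str] = field(default_factory=list)


def build_queries(species: Species) -> list[tuple[str, str]]:
    """Return (query_type, term) pairs for one species."""
    # Query type 1: pre-existing MIRIAM resources, when the model has any.
    queries = [("uri", uri) for uri in species.uris]
    # Query type 2: id / name querying; skip an empty or duplicate name.
    terms = [species.id]
    if species.name and species.name != species.id:
        terms.append(species.name)
    queries += [("id_name", term) for term in terms]
    return queries


intact = Species("GaGTP", "GαGTP", ["urn:miriam:uniprot:P08539"])
stripped = Species("GaGTP", "GαGTP")
print(build_queries(intact))
print(build_queries(stripped))
```

In this sketch the stripped species can only be queried by its identifier and name, whereas the intact species also yields a URI query that does not depend on naming conventions.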

It is difficult for automated procedures such as Saint to disambiguate the identifiers and names used to identify species. While a number of checks and automated modifications can be and are performed, many different naming conventions are used. Saint performs well with a minimal amount of data, thus allowing annotation of models that are still at an early stage of development. However, the use of a common naming scheme would improve the coverage of new annotations.