
5.6. GEOSPATIAL WEB SEARCH ENGINE

In a faceted search and browsing system, three types of interaction with the USI have to be considered: search, browsing and navigation. In the context of this work, these concepts are understood as follows:

• Faceted browsing. It is the act of reviewing the collection of resources grouped via a category.

• Faceted search. It refers to the usage of search operators that correspond to facet categories.

• Faceted navigation. It is the act of switching a facet category when navigating.

| Service Specification [Operation] | Brief Description |
|---|---|
| Web Catalogue Service (CSW) | It supports the ability to publish and search collections of descriptive information (metadata) of data, services, and related resources. |
| [DescribeRecord] | It allows a client to discover elements of the supported data model. |
| [GetRecords] | It allows discovering resources, with the possibility of applying spatio–temporal constraints. |
| Web Map Service (WMS) | It dynamically produces maps of spatially referenced data from geographic information. |
| [GetCapabilities] | It enumerates the layers that might be rendered and the supported parameters (e.g. graphic format). |
| [GetMap] | It produces maps. |
| Web Coverage Service (WCS) | It supports electronic interchange of coverages (values or properties of a set of geographic locations) that represent space–varying phenomena. |
| [GetCapabilities] | It enumerates the coverages that might be rendered and the supported parameters. |
| [DescribeCoverage] | It provides a full description of a coverage. |
| [GetCoverage] | It returns a coverage. |
| Web Feature Service (WFS) | It allows direct fine–grained access to geographic information at the feature and feature property level. |
| [GetCapabilities] | It lists the features that might be requested. |
| [DescribeFeatureType] | It returns a schema description of the requested feature. |
| [GetFeature] | It returns a document that contains a selection of features (retrieved from a relatively static data store) that satisfy the query expressions specified in the request. |
| Web Processing Service (WPS) | It allows invoking processing functionality at the feature and feature property level. |
| [GetCapabilities] | It lists the processes that might be executed. |
| [DescribeProcess] | It returns the description of the requested process. |
| [Execute] | It executes the requested process. |

Table 5.1: The main operations of OGC Web Services.
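As an illustration of how the operations in Table 5.1 are typically invoked, the following minimal sketch composes key–value–pair GET requests. It is not taken from the system described here: the endpoint `http://example.org/ows`, the helper name `build_ows_request` and the feature type name `roads` are hypothetical; only the `service`, `request`, `version` and `typeName` parameter names follow the usual OGC key–value–pair convention.

```python
from urllib.parse import urlencode


def build_ows_request(endpoint, service, request, version=None, **extra):
    """Build a key-value-pair GET URL for an OGC Web Service operation."""
    params = {"service": service, "request": request}
    if version is not None:
        params["version"] = version
    params.update(extra)  # operation-specific parameters, e.g. typeName
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urlencode(params)


# Hypothetical endpoint, used only for illustration.
ENDPOINT = "http://example.org/ows"

# Ask a WMS which layers it can render.
capabilities_url = build_ows_request(ENDPOINT, "WMS", "GetCapabilities")

# Ask a WFS for the schema of a (hypothetical) feature type.
describe_url = build_ows_request(
    ENDPOINT, "WFS", "DescribeFeatureType", version="1.1.0", typeName="roads"
)
```

The same helper covers every row of the table, because the operations differ only in the `service` and `request` values and in a few operation-specific parameters.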

A coherent search interface should be designed to help the user explore search results. Therefore, the selection of faceted classifications and search operators, and the way they are used to support facet–based interactions, are critical to achieving appropriate system behaviour.

Facets

The characteristics of the target resources have been analysed to define the faceted classification. According to the design principle of meaningful categories, the following facets have been chosen:

• OWSResource. This facet classifies resources from a list according to their type, i.e. service description, item description and item.

• OWSTaxonomy. This facet classifies resources according to the taxonomy of the service the resource is related to.

• Reliability. This facet reports the reliability status of services. It allows connecting the system to a service monitoring framework. The values are ranges of reliability (i.e. 100%–75%, 75%–50%, 50%–25%, 25%–0%).

• Domain. This facet classifies resources according to their domain (the categories are extracted from the resource URL via patterns).

• Provider. This facet classifies resources according to the providers defined within the Capabilities document of the service the resource is related to (extracted via NER methods).

Figure 5.2: Geospatial service taxonomy proposed in Bai et al. (2009) (source: Bai et al. (2009)).
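The Reliability and Domain facets can be illustrated with a minimal sketch. It rests on two assumptions that the text does not state: a score that falls exactly on a shared boundary (e.g. 75) is assigned to the higher range, and the host name of the resource URL stands in for the pattern–based Domain extraction. The function names are hypothetical.

```python
from urllib.parse import urlsplit

# Facet values of the Reliability facet, ordered from the highest range down.
# Each entry is (lower bound in percent, facet label).
RELIABILITY_RANGES = [
    (75, "100%–75%"),
    (50, "75%–50%"),
    (25, "50%–25%"),
    (0, "25%–0%"),
]


def reliability_facet(percent):
    """Map a reliability score (0-100) to one of the four facet values.

    Assumption: a boundary score such as 75 belongs to the higher range.
    """
    if not 0 <= percent <= 100:
        raise ValueError("reliability must be between 0 and 100")
    for lower_bound, label in RELIABILITY_RANGES:
        if percent >= lower_bound:
            return label


def domain_facet(resource_url):
    """Derive a Domain facet value from a resource URL.

    Assumption: the host name is used as the domain category.
    """
    return urlsplit(resource_url).hostname
```

For instance, `domain_facet("http://www.idee.es/wms?service=WMS")` yields `www.idee.es`, the kind of value that a “domain” restriction would then match.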

Two different taxonomies for geospatial services have been studied and evaluated (Bai et al., 2009; Zhang et al., 2009). The service taxonomy proposed in Bai et al. (2009) has been selected because it is a lightweight service taxonomy that is especially useful for capturing knowledge about service characteristics, so that geospatial services can be classified according to their service category, in particular the standards they follow. This classification scheme is used in the GEOSS Component and Service Registry, one of the main elements of the GEOSS architecture. Figure 5.2 shows the taxonomy proposed in Bai et al. (2009). The service taxonomy has been restricted to “Service Version” (i.e. only the HTTP binding is supported in this application). The taxonomy is developed in accordance with technical standards. In this way, the extension of the OWSTaxonomy vocabulary and the resource annotation can be automatised, because the Capabilities document provides all the information on service type that is needed. The Domain and Provider vocabularies have to be automatically extensible because the system has to support the automatised addition of new resources to the repository.

Figure 5.3: Multi–layer logical structure of the URN taxonomy (source: Bai et al. (2009)).

The faceted classification used in this work has been developed in the Simple Knowledge Organization System (SKOS) (Isaac and Summers, 2009), a W3C standard for porting knowledge organisation systems to the Semantic Web. The Web Ontology Language (OWL) (W3C OWL Working Group, 2009) has been considered as well. It offers a general and powerful framework for knowledge representation, but this application does not demand such advanced capabilities to define the required vocabularies. SKOS is a simple language with just a few features, tuned for sharing and linking knowledge organisation systems, and it is sufficient for this purpose. The vocabularies created are used to annotate the gathered Web resources. Appendix B contains the SKOS vocabularies used. Only the OWSTaxonomy faceted classification has a hierarchical structure. Putkey (2011) creates a SKOS–compliant faceted taxonomy that preserves the required hierarchical structure. Here, the usage of URIs helps to preserve the structure of the OWSTaxonomy taxonomy (see Figure 5.3).
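One way in which URIs can carry the hierarchy is sketched below. This is an assumption, not the vocabulary of Appendix B: each colon–delimited segment of a URN is treated as one taxonomy level, and the `skos:broader` links are derived by trimming the last segment. The root `urn:ogc:serviceType` is taken from the query example in Table 5.2, whereas the version segment `1.1.1` and the function names are hypothetical.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def broader_chain(urn, root="urn:ogc:serviceType"):
    """Return (concept, broader concept) pairs, trimming one URN segment at a time."""
    if not urn.startswith(root + ":"):
        raise ValueError("URN is not below the taxonomy root")
    pairs = []
    current = urn
    while current != root:
        parent = current.rsplit(":", 1)[0]  # drop the last segment
        pairs.append((current, parent))
        current = parent
    return pairs


def to_turtle(pairs):
    """Serialise the pairs as SKOS statements in Turtle syntax."""
    lines = [SKOS_PREFIX, ""]
    for concept, parent in pairs:
        lines.append(f"<{concept}> a skos:Concept ;")
        lines.append(f"    skos:broader <{parent}> .")
    return "\n".join(lines)


# Hypothetical version-level concept below the WMS service type.
print(to_turtle(broader_chain("urn:ogc:serviceType:WMS:1.1.1")))
```

Because the parent of every concept can be read off its identifier, new service types or versions found in a Capabilities document can be added to the vocabulary without manual placement in the hierarchy.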

Search operators

Table 5.2 summarises the search operators supported by the system. The “service” and the “resource” refinement operators refer to the OWSTaxonomy and OWSResource facets, respectively. If they are used as free–text operators, the combination of both (e.g. “service:WMS resource:item”) might produce an empty response. The “site” and the “inurl” operators have functionality similar to that of the corresponding operators supported by existing SEs (e.g. Google). The “+” and “–” modifiers are also supported to extend or restrict the search (e.g. “inurl:tata –inurl:en –site:com”).
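The operator and modifier syntax described above can be illustrated with a small tokeniser. It is a minimal sketch, not the parser of the system: it assumes that a term has the form `operator:value`, that values containing spaces are quoted, and that both the ASCII hyphen and the en dash act as the “–” modifier. Coordinate pairs are not handled here (a negative number would be read as an excluded term).

```python
import re

# One token: optional modifier, optional operator prefix, then a value.
TOKEN = re.compile(
    r"""(?P<mod>[+\-–])?             # plus, ASCII hyphen or en dash
        (?:(?P<op>[A-Za-z]+):)?       # operator name followed by a colon
        (?P<val>'[^']*'|"[^"]*"|\S+)  # quoted phrase or a single word
    """,
    re.VERBOSE,
)


def parse_query(query):
    """Split a free-text query into included and excluded (operator, value) terms."""
    parsed = {"include": [], "exclude": []}
    for match in TOKEN.finditer(query):
        value = match.group("val").strip("'\"")
        term = (match.group("op"), value)  # operator is None for plain text
        excluded = match.group("mod") in ("-", "–")
        parsed["exclude" if excluded else "include"].append(term)
    return parsed


result = parse_query("inurl:tata –inurl:en –site:com")
# result["include"] holds the inurl:tata term;
# result["exclude"] holds the inurl:en and site:com terms.
```

Keeping the operator name separate from its value makes it straightforward to route “service” and “resource” terms to the facet vocabularies while passing “site” and “inurl” terms on to a remote SE.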

For more effective search of geospatial resources, the proposed system extends text–based searching with a spatial search capability. In this work, spatial search is understood as: “give me all resources

| Operator | Classification Scheme | Description | Free–text example |
|---|---|---|---|
| service | OWSTaxonomy | restriction on service specification | “service:urn:ogc:serviceType:WMS” |
| resource | OWSResource | restriction on OWS resource type | “resource:service” |
| provider | Provider | restriction on OWS provider | “provider:'ING'” |
| domain | Domain | restriction on OWS domain | “domain:'www.idee.es'” |
| inurl | | the text has to appear in the resource URL | “inurl:es” |
| location | | spatial restriction via a toponym | “location:'Washington DC'” |
| point | | spatial restriction via a coordinate pair | “point:'36.533333, -6.283333'” or “point:'36,533333, -6,283333'” (might be omitted) |

Table 5.2: Searching interface supported by the Geospatial Web Search Engine.

A latitude/longitude coordinate pair might be used to define the location of interest explicitly. The point coordinates are assumed to be in WGS 84 (NIMA, 2004), a reference system that is commonly adopted in the Web community (e.g. GeoRSS). The “point” spatial operator can be omitted because the parser of the free–text query also tries to extract a coordinate pair. For example, the query “36.533333, -6.283333” will return any geospatial resource that offers data for an area containing that point. The “,” is not necessary, as it is removed during query pre–processing. A spatial restriction can also be defined via the “location” operator, because the system is intended to support non–expert users as well. However, this kind of location definition has a disadvantage compared with an explicit point: it inherits the ambiguity of toponyms, as different places may have the same name (e.g. “Madrid” in Spain or “Madrid” in Iowa, USA). To support the spatial restriction properly, the toponym should be translated into explicit coordinates. In the GWSE presented in this work, a toponym is translated into a list of candidate points by means of a gazetteer (Hill et al., 1999), and the user is asked to select the desired location from the list.
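The two ways of stating a location can be sketched as follows. This is an illustration under stated assumptions, not the actual parser: the two numbers must be separated by whitespace (optionally preceded by a comma), either “.” or “,” may act as the decimal separator, and the order is latitude then longitude in WGS 84. The in–memory `GAZETTEER` and its approximate coordinates are hypothetical stand–ins for a real gazetteer service.

```python
import re

# Signed decimal number; "." or "," may act as the decimal separator.
NUMBER = r"[+-]?\d+(?:[.,]\d+)?"
PAIR = re.compile(rf"(?<![\w.,])({NUMBER})\s*,?\s+({NUMBER})(?![\w.,])")


def extract_point(query):
    """Return ((lat, lon), remaining text), or (None, query) if no valid pair is found."""
    match = PAIR.search(query)
    if match is None:
        return None, query
    lat, lon = (float(g.replace(",", ".")) for g in match.groups())
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None, query  # not a plausible WGS 84 position
    rest = (query[:match.start()] + " " + query[match.end():]).split()
    return (lat, lon), " ".join(rest)


# Hypothetical gazetteer with approximate coordinates, for illustration only.
GAZETTEER = {
    "madrid": [("Madrid, Spain", 40.42, -3.70), ("Madrid, Iowa, USA", 41.88, -93.82)],
}


def candidate_points(toponym):
    """Return the candidate locations among which the user has to choose."""
    return GAZETTEER.get(toponym.strip().lower(), [])
```

With this sketch, `extract_point("wms 36,533333, -6,283333")` separates the point from the remaining text `wms`, while `candidate_points("Madrid")` returns two candidates, which is why the user is asked to pick one.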

While querying the remote SE, the identified point is removed from the request, because a general SE usually treats coordinates as plain text and such a search frequently produces an empty result. If an SE does not offer an operator with semantics similar to those of the “location” operator, the place name is used as free text in the remote SE query.
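The rewriting rule for remote queries can be summarised in a few lines. The sketch below is an assumption about how such a rule might be coded: the function and parameter names are hypothetical, the coordinate pair is expected to have been stripped from `free_text` beforehand, and the `operator:"value"` syntax for the remote SE is illustrative only.

```python
def build_remote_query(free_text, place_name=None, location_operator=None):
    """Compose the query string sent to a remote search engine.

    free_text: textual part of the query, with any coordinate pair already removed.
    place_name: toponym given via the "location" operator, if any.
    location_operator: name of an equivalent operator in the remote engine,
        or None if the engine offers none (hypothetical parameter).
    """
    parts = [free_text.strip()] if free_text.strip() else []
    if place_name:
        if location_operator:
            parts.append(f'{location_operator}:"{place_name}"')
        else:
            parts.append(f'"{place_name}"')  # fall back to plain free text
    return " ".join(parts)
```

For example, `build_remote_query("wms", "Washington DC")` sends the place name as an ordinary quoted phrase, whereas passing a value for `location_operator` would prefix it with that operator.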
