• No se han encontrado resultados

A Semantic Portal

N/A
N/A
Protected

Academic year: 2023

Share "A Semantic Portal"

Copied!
18
0
0

Texto completo

(1)

A Semantic Portal

for the International Affairs Sector

EKAW, 2004

Contreras, Benjamins, Blazquez, Losada, Salle, Sevilla, Navaro, Casillas, Mompo, Paton, Corcho (iSOCO)

Tena, Martos

(Real Instituto Elcano) www.esperonto.net

MCYT, PROFIT

2/27

© Intelligent Software Components, S.A. 2003

The Potential of Semantic Web Technology

ƒ Enable a paradigm switch in searching information

ƒ From

• Information Retrieval

ƒ To

• Question Answering

ƒ If you want …

• Information Overload on Demand

ƒ This paper illustrates an application in this line

• For one particular domain

• Promising, but still a lot of work to do

Forward

(2)

3/27

© Intelligent Software Components, S.A. 2003

Google: ¿A qué organizaciones pertenece España

What Organizations is Spain a member of?

(3)

5/27

© Intelligent Software Components, S.A. 2003

RIE: ¿A qué organizaciones pertenece España

Spain member Organizations

6/27

© Intelligent Software Components, S.A. 2003

¿A qué organizaciones pertenece España?

What Organizations is Spain a memberof?

(4)

7/27

© Intelligent Software Components, S.A. 2003

Of what Organizations is Spain a member?

How Does it Work?

Ontology of International Affairs Knowledge Acquisition

Exploitation Conclusions

A Semantic Portal for the International Affairs Sector

(5)

9/27

© Intelligent Software Components, S.A. 2003

The Overall Process

How does it work?

10/27

© Intelligent Software Components, S.A. 2003

Ingredients

ƒ Multiple, heterogeneous sources

ƒ Ontology

ƒ Knowledge Acquisition “Engine”

• Knowledge Parser®

ƒ Exploitation of knowledge

• Publishing the results - Duontology®

• Semantic Navigation

• Semantic Search Engine - NLP queries

- Enriched documents with ontological information -- - Smart tags Smart tags Smart tags

How does it work?

(6)

How Does it Work?

Ontology of International Affairs Knowledge Acquisition

Exploitation Conclusions

A Semantic Portal for the International Affairs Sector

Ontology of International Affairs

ƒ Constructed in collaboration with experts from the Real Instituto Elcano

ƒ Inspired by CIA World-Fact Book

ƒ Ontology metrics (after population)

60 concepts

124 properties

20.000 instances

> 60.000 facts

20Mb in RDF(S) files

ƒ Ontology access and management functionalities built with KPOntology.

• Common API for: JENA1.6, JENA 2.0, Sesame, WebODE

Construction

(7)

13/27

© Intelligent Software Components, S.A. 2003

Ontology of International Affairs

Illustration

Semantic Portal Definition Ontology: International Affairs Knowledge Acquisition

Exploitation Conclusions

A Semantic Portal for the International Affairs Sector

(8)

15/27

© Intelligent Software Components, S.A. 2003

The Sources

CIA World Factbook

NationMaster CIDOB

Supervision

International Policy Institute for Counter-Terrorism

Knowledge Acquisition

Knowledge Parser® Architecture

ƒ Source Pre-processing

ƒ Information identification

ƒ Ontology population

Pluggable Strategies

Intelligent Population of Ontologies Pre-Processing

Types

Knowledge Acquisition

(9)

17/27

© Intelligent Software Components, S.A. 2003

Linking Ontology to Sources

ƒ In Ontology: add link to document where information was found

• Allows navigation from ontology to source

ƒ In sources: add link to ontology concepts

• Allows navigation from sources to ontology

• Only for keywords, based on:

- “Keyword Detection in Natural Languages and DNA”, Ortuno et al, 2002

- Distances between successive word occurrences

Knowledge Acquisition

Semantic Portal Definition Ontology: International Affairs Knowledge Acquisition

Exploitation Conclusions

A Semantic Portal for the International Affairs Sector

(10)

19/27

© Intelligent Software Components, S.A. 2003

Publishing in a Semantic Portal

Exploitation

Semantic Web Publication: Semantic Portal Need for SW information publication on WWW (for humans)

Person

Partner Document

Milestone

Tool

publication

Knowledge Base Browsable Web Site

Inconveniences of direct publication/translation

• Semantic model is not necessary user-friendly (relations, control attributes)

“Traditional” Publishing

Exploitation

(11)

21/27

© Intelligent Software Components, S.A. 2003 Person

Partner Document

Milestone

Tool

publication

Knowledge Base Browsable Web Site

Person

Partner Deliverable

Plan

Publication Ontology

RDQL

Visualization is independent from the Semantic Model

Decoupled Publishing

Exploitation

Ontology viewRDQL ArchitectureCommand Pattern

Internal format (XML)

ArchitectureCommand Pattern

Person Works at

Partner

V. Richard

Benjamins iSOCO

Works at

Knowledge base

(RDF/RDFS)

Employee English

Richard at iSOCO

Publication model

(RDF/RDFS)

WWW Page (HTML)

RDQL Java

Business Logic

XSL Transformations

22/27

© Intelligent Software Components, S.A. 2003

Semantic Search Engine

ƒ Advanced Search

• Searching for data, not for documents (Q/A)

When relations are key

• In addition to keyword-based search engines

• For well-defined domains

Exploitation

Examples

ƒ GDP of Spain

ƒ What countries have participated in the Iraq war?

ƒ To what political party belongs the president of France?

ƒ In which cities Hamas has performed bomb attacks?

ƒ Who is Bush?

(12)

23/27

© Intelligent Software Components, S.A. 2003

Semantic Navigation

ƒ Example: “Países fronterizos con Serbia”

ƒ Returns instances

ƒ Allows reference consulting

Exploitation

Semantic Navigation

• Linking Back to the Sources

- From results (answers) to documents (sources)

Exploitation

(13)

25/27

© Intelligent Software Components, S.A. 2003

Semantic Navigation

ƒ Browsing between ontology and sources

Exploitation

Semantic Portal Definition Ontology: International Affairs Knowledge Acquisition

Exploitation Conclusions

A Semantic Portal for the International Affairs Sector

(14)

27/27

© Intelligent Software Components, S.A. 2003

Conclusions

ƒ Towards a paradigm switch in searching?

ƒ More work to be done

ƒ Detailed failure analysis needed

ƒ Why does Search Engine fail?

• KA limitation - Not in ontology

- Missing/wrong instances

• Query construction - NLP result (too complex) - SeRQL query construction

ƒ Scalability

BACK UP

(15)

29/27

© Intelligent Software Components, S.A. 2003

CIA World fact book

30/27

© Intelligent Software Components, S.A. 2003

Nationmaster

(16)

31/27

© Intelligent Software Components, S.A. 2003

Different Types of Pre-processing

Text

DOM

Render PLN

Layout Language

DOM Text

Source Pre-process Interpretations

Source Preprocess Presentation Structure Text Language

Sources Information Idetification

Operators Identification

Description Hypothesis

Ontology Population

Evaluation Population

Domain

ƒ Plain Text Model

Data

• Regular Expression Check and Retrieval

• Offset References

ƒ DOM/Hypertext

• HTML object identification

• HTTP control and navigation

ƒ NLP

• Basic NLP: Tokenizer, Morphology and Chunk Parsers

• Retrieve phrases using head driven approach

• Basic semantic relations (synonyms, hyponyms, etc.)

ƒ Layout

• Rendered result of a HTML source:

(X,Y) coordinates

• Visual Operators: SAME_ROW, NEAR, etc…

Knowledge Acquisition

Back

Explicit Extraction Knowledge

Wrapping ontology

ƒ Documents

ƒ Pieces

ƒ Relations

• Semantic

• Layout

ƒ Data Types

• Meaning

• Basic Types

• HTML

Source Preprocess Presentation Structure Text Language

Sources Information Idetification

Operators Identification

Description Hypothesis

Ontology Population

Evaluation Population

Domain Data

Knowledge Acquisition

(17)

33/27

© Intelligent Software Components, S.A. 2003

Operators and Strategies

ƒ Operators

• Check: data types, relations, constraints

• Retrieve: obtains piece or document (precondition)

• Execute: navigate, select, etc…

ƒ Strategies (operators applied for hypothesis construction)

• Greedy: quick but not optimal

• Heuristics: hypothesis construction and pruning

• Optimal Backtracking: covering all search space

Source Preprocess Presentation Structure Text Language

Sources Information Idetification

Operators Identification

Description Hypothesis

Ontology Population

Evaluation Population

Domain Data

Knowledge Acquisition

Back

34/27

© Intelligent Software Components, S.A. 2003

Ontology Population

ƒ Actions:

• Create new instance

• Modify existing instance

• Remove existing instance

• Relate existing instances

ƒ Process:

• Hypothesis evaluation

• Population simulation

• Lowest cost simulation algorithm

Source Preprocess Presentation Structure Text Language

Sources Information Idetification

Operators Identification

Description Hypothesis

Ontology Population

Evaluation Population

Domain Data

Knowledge Acquisition

Back

(18)

35/27

© Intelligent Software Components, S.A. 2003

iSOCO Valencia

Tel +34 96 3467143 Oficina 305

C/ Prof. Beltrán Báguena 4, 46009 Valencia Spain iSOCO Barcelona

Tel +34 93 5677200 C/ Alcalde Barnils 64-68 St. Cugat del Vallès 08190 Barcelona Spain

iSOCO Madrid Tel +34 91 3349797 C/Pedro de Valdivia 10 28006 Madrid Spain

For more information on iSOCO, please contact us at [email protected]

www.isoco.com

Intelligent Software Components, S.A.

Contact Information

Referencias

Documento similar

Semantic Portal Definition Ontology of Cultural Content Knowledge Acquisition

Los objetivos del trabajo consisten en define el diseño de una arquitectura base que permita tener una visión general del portal que se pretende desarrollar, hacer una descripción

The personalised recommendations helped users to find relevant news articles, and a semantic expansion of user preferences eased the matching between user and

The system is used to preliminary analyse semantic meanings and contexts of tags belonging to Delicious and MovieLens folksonomies, and to compare semantic

The second system, (reported in[3]), supports semantic search, based on formal domain knowledge, over non-semantic World Wide Web docu- ments. The semantic search system

For this, a synthetic dataset for semantic generation that simulates Super Mario Bros frames has been created, and semantic segmentation models have been trained using this

In [15], a framework to semi-automate the semantic annotation of Web services (i.e. parameter description based on ontology concepts, semantic service classification, etc.)

Algoritmo propuesto de preprocesado y registro de las imágenes portal y radiografía reconstruida digitalmente acondicionamiento sobre la imagen portal ha sido correcto, la