
UNIVERSITY OF IBADAN LIBRARY

each term in each document and query is expected to contribute a weight $W_{t,d}$ to the score of the document, and this weight is estimated as the combination of the term frequency and the inverse document frequency.

$$W_{t,d} = tf_{t,d} \times idf_t \qquad (2.2)$$

This implies that a term's contribution increases with its frequency within the document (tf) and decreases with the number of documents in which it occurs (idf) (Salton and McGill, 1983).
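As a minimal sketch of the weighting described above, the Python fragment below computes a tf-idf weight for every term of a toy collection. The raw-count tf, the logarithmic idf (log of N over the document frequency), the sample documents and the function name `tfidf_weights` are all illustrative assumptions; the text does not fix a particular tf or idf variant.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Return one {term: weight} dict per document, with weight = tf * idf."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # df[t]: number of documents that contain term t at least once
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term frequency within this document
        # Assumed idf variant: log(N / df_t)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical three-document collection
docs = [
    "agent retrieves information",
    "agent agent platform",
    "distributed information retrieval",
]
for doc, w in zip(docs, tfidf_weights(docs)):
    print(doc, "->", w)
```

Under this idf variant, a term that occurs in every document receives a weight of zero, which matches the intuition that such a term does not help to distinguish one document from another.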

The similarity score between a document and a query is estimated as the closeness of their respective vectors: the closer the two vectors, the more relevant the document is estimated to be to the query. Documents are ranked in this vector space using the cosine of the angle between the query vector and the vector of each document in the collection.

$$\cos(q, d) = \frac{q \cdot d}{|q|\,|d|} \qquad (2.3)$$

where q is the query vector, d is the document vector, and |q| and |d| are the Euclidean lengths of the query and document vectors respectively.

The combination of term frequency with inverse document frequency and length normalization (tf-idf) has proved to be the superior weighting scheme with respect to recall and precision.
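The cosine ranking described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: vectors are stored as sparse `{term: weight}` dictionaries, the weights are made-up stand-ins for tf-idf values, and the names `cosine` and `rank` are hypothetical.

```python
import math

def cosine(q, d):
    """Cosine of the angle between two sparse vectors given as {term: weight}."""
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))  # Euclidean length |q|
    norm_d = math.sqrt(sum(w * w for w in d.values()))  # Euclidean length |d|
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0  # convention chosen here for an empty vector
    return dot / (norm_q * norm_d)

def rank(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted by decreasing cosine score."""
    scores = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical weights standing in for tf-idf values
doc_vecs = {
    "d1": {"agent": 1.0, "retrieval": 2.0},
    "d2": {"agent": 3.0},
    "d3": {"database": 1.5},
}
query_vec = {"retrieval": 1.0}
print(rank(query_vec, doc_vecs))
```

Dividing by the Euclidean lengths is the length normalization step: a document does not score higher merely because it is long and therefore has larger weights.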


single ranked results list. The search broker interfaces with various servers to retrieve their results and then applies a result merging method to the returned set.

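$$\{S, q\} \xrightarrow{\text{selection}} (S', q) \xrightarrow{\text{retrieval}} \{(R_1, R_2, \ldots, R_{|S'|})_q\} \xrightarrow{\text{merging}} R_M$$

where q is the query, S is the set of search servers, and S' is the set of selected search servers deemed best for answering the query.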

During server selection, the search broker selects a set of servers S', deemed best to answer the query. The choice of servers depends on both effectiveness and efficiency. It is usually assumed that all servers have equal search cost.
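The three stages of the search broker (server selection, retrieval and result merging) can be sketched as below. This is a toy illustration under stated assumptions: servers are in-memory dictionaries, selection uses term overlap with a hypothetical per-server summary, retrieval scores documents by term overlap, and merging assumes that scores are comparable across servers. None of the function names or data come from an existing system.

```python
def select_servers(servers, query, k=2):
    """Selection: keep the k servers whose summary shares most terms with the query."""
    terms = set(query.lower().split())
    ranked = sorted(servers, key=lambda s: len(terms & s["summary"]), reverse=True)
    return ranked[:k]

def retrieve(server, query):
    """Retrieval: return (doc_id, score) pairs from one server (toy overlap score)."""
    terms = set(query.lower().split())
    results = []
    for doc_id, text in server["docs"].items():
        score = len(terms & set(text.lower().split()))
        if score > 0:
            results.append((doc_id, score))
    return results

def merge(result_lists):
    """Merging: combine the per-server lists into one list ranked by score.

    Assumes scores are comparable across servers, which a real broker
    usually cannot take for granted.
    """
    merged = [pair for results in result_lists for pair in results]
    return sorted(merged, key=lambda pair: pair[1], reverse=True)

def broker(servers, query, k=2):
    """Run selection, retrieval and merging for one query."""
    selected = select_servers(servers, query, k)
    return merge([retrieve(s, query) for s in selected])

# Hypothetical servers; "summary" is an assumed description of each index
servers = [
    {"name": "s1", "summary": {"agent", "retrieval"},
     "docs": {"s1/a": "agent based retrieval", "s1/b": "mobile agent"}},
    {"name": "s2", "summary": {"database"},
     "docs": {"s2/a": "distributed database"}},
    {"name": "s3", "summary": {"retrieval", "ranking"},
     "docs": {"s3/a": "retrieval and ranking"}},
]
print(broker(servers, "agent retrieval"))
```

In this sketch the server `s2` is never contacted, because its summary shares no term with the query. Any relevant document it held would therefore be missed, whatever the quality of the merging step.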

Using a search broker, however, has certain flaws. At server selection, not all servers with relevant documents are selected, and servers that have no relevant documents may be selected. A search server's index can also become out of date as documents change.

The rapid growth of networked environments, especially the Internet, the enormous volume of widely dispersed information, and the quest for information across borders and platforms have increased the complexity of information sources. The multitude, diversity and dynamic nature of online information sources, however, make accessing any specific piece of information a difficult task (Brewington et al., 1999).

The use of agents for information retrieval provides a viable solution to these issues.

Agents facilitate access to multiple information sources, and the distributed nature of agents facilitates scalability in the networked environment (Finin and Nicholas, 2000).

Clark and Lazarou (1997) and Htoon and Thwin (2008) identified the functions that distributed information retrieval agents are expected to perform. They are as follows:

• Accept requests from a human user or another agent client

• Translate these requests into the language of the information source, or one understood by the information source

• Identify the information sources that contain information relevant to the request

• Pose the request to the sources

• Collect the corresponding results from the sources

• Process the returned results

• Present the results to the client.
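The seven functions listed above can be mapped onto a single request-handling routine, as in the sketch below. It is an illustration only and does not reproduce the agents described by the cited authors: the `Source` and `RetrievalAgent` classes, the upper-case keyword "source language" and the sample records are all hypothetical.

```python
class Source:
    """Hypothetical information source that understands upper-case keyword queries."""

    def __init__(self, name, records):
        self.name = name
        self.records = records  # list of strings

    def topics(self):
        """Return the set of keywords this source can answer."""
        return {word.upper() for record in self.records for word in record.split()}

    def search(self, native_query):
        """Return the records that contain the keyword."""
        return [r for r in self.records if native_query in r.upper().split()]


class RetrievalAgent:
    def __init__(self, sources):
        self.sources = sources

    def translate(self, request):
        """Translate a client request into the sources' (assumed) query language."""
        return request.strip().upper()

    def handle(self, request):
        # 1. Accept a request from a human user or another agent client.
        # 2. Translate it into the language understood by the sources.
        native_query = self.translate(request)
        # 3. Identify the sources that hold relevant information.
        relevant = [s for s in self.sources if native_query in s.topics()]
        # 4-5. Pose the request to each source and collect the results.
        collected = {s.name: s.search(native_query) for s in relevant}
        # 6. Process the returned results (here: flatten, de-duplicate, sort).
        processed = sorted({r for results in collected.values() for r in results})
        # 7. Present the results to the client.
        return processed


agent = RetrievalAgent([
    Source("library", ["agent systems", "vector space model"]),
    Source("archive", ["mobile agent", "agent systems"]),
    Source("news", ["weather report"]),
])
print(agent.handle(" agent "))
```

Keeping the translation step separate from the search step means that a source with a different query language only needs its own `translate` logic; the rest of the routine is unchanged.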



2.11.1 Existing Agent Based Information Retrieval

Knowbot (Knowledge-Based Object Technology) automatically gathers specified information from web sites. Knowbot provides a single query language to access a variety of information sources, and it serves as a representative for the user (Finin and Nicholas, 2000). A Knowbot is a combination of data and a thread of control that can move among nodes in a distributed environment.

The Knowbot Operating System provides a runtime execution environment, which includes security mechanisms, support for migration, and facilities for communication between Knowbot and other programs. Knowbot is written in Python, an interpreted object-oriented programming language.

Metacrawler is a metasearch engine that queries a variety of search engines and provides a uniform user interface for these search engines. It combines the top web search results from different engines, downloads and scans pages if necessary (Finin and Nicholas, 2000).

Letizia is a user interface agent that assists users browsing the World Wide Web (Lieberman, 2001). As the user browses, Letizia tracks user behaviour and attempts to anticipate items of interest by doing concurrent, autonomous exploration of links from the user's current position. Whereas people usually browse depth-first, Letizia browses breadth-first, and it uses a variety of heuristics to identify interesting pages (Finin and Nicholas, 2000). When an interesting page is identified, Letizia displays it in a separate browser window. Letizia is implemented in Macintosh Common Lisp and uses Netscape as its web browser and user interface. The agent runs as a separate process, and communication between Lisp and Netscape takes place using AppleEvents and AppleScript interprocess communication.
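The difference between depth-first and breadth-first exploration of links can be illustrated with the short sketch below. It is not Letizia's code (which is written in Macintosh Common Lisp) and it omits the heuristics used to judge whether a page is interesting; the link graph and function names are hypothetical.

```python
from collections import deque

def breadth_first(links, start, max_pages=10):
    """Visit pages level by level, nearest to the start page first."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue and len(order) < max_pages:
        page = queue.popleft()
        order.append(page)
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order

def depth_first(links, start, max_pages=10):
    """Follow each chain of links as far as it goes before backing up."""
    seen = set()
    stack = [start]
    order = []
    while stack and len(order) < max_pages:
        page = stack.pop()
        if page in seen:
            continue
        seen.add(page)
        order.append(page)
        # Push in reverse so the first link on the page is followed first.
        stack.extend(reversed(links.get(page, [])))
    return order

# Hypothetical link graph: page -> pages it links to
links = {
    "home": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
}
print(breadth_first(links, "home"))
print(depth_first(links, "home"))
```

With this graph, the breadth-first order reaches page `b` before any page below `a`, so pages one click away from the user's current position are all examined before deeper ones.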

Retsina is a multi-agent system (comprising task, interface, information and negotiator agents) that cooperates with Outlook, using Resource Description Framework (RDF) files, to check appointments for changes autonomously, make contact data quickly accessible, and agree on appointments with other Retsina users. It is a personal assistant agent.

However, the agent in this work is built to retrieve information from distributed databases, using a certain key to search for the information. The agent is written in Java, an object-oriented programming language. It is a lightweight object embedded into the operating system, so that it runs as part of the operating system and not on an existing platform.

Our effort is directed at making agents run without passing through an agent platform.