
Designing exploration tasks for users is considered an important requirement for evaluating data exploration approaches [164]. Exploration tasks can be characterised as learning-oriented or investigation-oriented tasks, which distinguishes them from lookup-oriented tasks [4]. A typical exploration task has to be generic (i.e. the scope of the task is broad and the user does not have specific information needs), realistic (i.e. a real-life task set in a familiar situation), discovery-oriented (i.e. users travel beyond what they know), open-ended (i.e. it requires a significant amount of exploration, where open-endedness relates to uncertainty over the information available, or incomplete information on the nature of the search task), and set in a domain unfamiliar to the user [2, 164, 165].

A task taxonomy for graph visualization and exploration has been proposed in [119]. The tasks in the taxonomy are categorised into four groups: topology-based tasks, attribute-based tasks, browsing tasks, and overview tasks. Each group has a general description and example scenarios, as follows:

Topology-based tasks: include (i) adjacency tasks (e.g. finding the set of entities directly linked to an entity, identifying how many entities are linked to a particular entity, or identifying which entity has the maximum number of adjacent entities); (ii) accessibility tasks (e.g. finding the set of entities accessible from a particular entity); and (iii) connectivity tasks (e.g. finding the shortest path between two entities).

Attribute-based tasks: use the previous topology-based tasks with additional filters related to entities (e.g. find the entities having a specific attribute value) or related to links (e.g. given an entity, find the entities connected only by certain link types).

Browsing tasks: include following a given path (e.g. a user is given a sequence of entities in the data graph to explore) and revisiting entities (e.g. returning to previously visited entities in the data graph).

Overview tasks: compound exploratory tasks to obtain estimated values quickly (e.g. asking a user to estimate the size of the data graph). Overview tasks may also include asking the user to identify patterns in the graph (i.e. which types of entities are connected together).
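The three kinds of topology-based task listed above (adjacency, accessibility and connectivity) can be illustrated with a minimal sketch. It assumes an undirected toy graph held as an adjacency dictionary; the edge list, the function names and the tie-breaking rule are hypothetical choices made for illustration and are not taken from [119] or from MusicPinta.

```python
from collections import deque

# Hypothetical toy data graph (undirected); entity names are illustrative only.
EDGES = [
    ("Biwa", "Lute"),
    ("Lute", "String Instrument"),
    ("Guitar", "String Instrument"),
    ("Bansuri", "Flute"),
    ("Flute", "Wind Instrument"),
]


def build_graph(edges):
    """Build an undirected adjacency dictionary from (a, b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def adjacent(graph, entity):
    """Adjacency task: entities directly linked to `entity`."""
    return set(graph.get(entity, set()))


def max_degree_entity(graph):
    """Adjacency task: entity with the most neighbours (ties broken alphabetically)."""
    return min(graph, key=lambda e: (-len(graph[e]), e))


def accessible(graph, start):
    """Accessibility task: every entity reachable from `start` (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


def shortest_path(graph, source, target):
    """Connectivity task: one shortest path as a list of entities, or None."""
    if source == target:
        return [source]
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in sorted(graph.get(node, ())):
            if nxt not in parents:
                parents[nxt] = node
                if nxt == target:
                    # Walk back through the parent links to rebuild the path.
                    path = [nxt]
                    while parents[path[-1]] is not None:
                        path.append(parents[path[-1]])
                    return path[::-1]
                queue.append(nxt)
    return None


if __name__ == "__main__":
    g = build_graph(EDGES)
    print(adjacent(g, "Lute"))
    print(accessible(g, "Biwa"))
    print(shortest_path(g, "Biwa", "Guitar"))
```

An attribute-based variant would apply the same functions after filtering the edge list by link type or the entity set by attribute value.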

Among the four task categories outlined above, the topology-based tasks and browsing tasks will be adopted in the experimental user study for evaluating exploration paths. Topology-based tasks will be used since the study participants will be given specific entities to explore (i.e. the given entities represent the first entities of exploration paths), and then they will be asked questions about how these are associated with other entities in the data graph. These questions correspond to the three cognitive processes from Bloom’s taxonomy [27] (as described in Section 2.7.1): remember (i.e. finding entities in the data graph that are related to a given entity), categorise (i.e. finding entities in the data graph that the given entity belongs to) and compare (i.e. finding entities in the data graph that are similar to the given entity). Browsing tasks will be used since the participants will be given exploration paths (i.e. the EC experimental condition) and will be asked to follow these paths. Furthermore, the semantic data browser (MusicPinta) which will be used in the experimental user study supports topology-based tasks (i.e. it shows connections between entities in the graph) and browsing tasks (i.e. it enables the user to follow an exploration path represented as a set of entities linked via edge labels). However, attribute-based tasks and overview tasks will not be used in the experimental user study, since the metric for measuring the knowledge utility of an exploration path (described in Section 2.7.1) considers how entities are connected in the data graph rather than identifying specific attributes related to a given entity in the graph (attribute-based tasks), or providing estimations about the graph as a whole or identifying specific patterns (overview tasks). Furthermore, MusicPinta provides neither textual information nor a visual representation of the overall data graph.

The authors in [164, 166] applied a two-step approach for designing data exploration tasks for participants:

(i) Designing a task template that places the participant in a familiar situation which involves exploring multiple entities in an unfamiliar domain (e.g. a researcher at a university wants to write a paper (familiar situation) about a new topic).

(ii) Identifying unfamiliar candidate entities (e.g. finding a new research topic) in the domain that could be plugged into the task template.

The main idea of using a task template is to put the users in a familiar situation where they will be asked to find some entities. The study in [164] involved university participants, and hence a familiar situation was writing a paper for a class. Accordingly, the following task template was suggested [164]:

“Imagine that you are taking a class called ____. For this class, you need to write a paper on the topic ____. Use the catalogue to find two possible topics for your paper. Find three books for each topic.”

Using the above template, a task scenario was designed in [164] in which a specific topic was plugged into the template and participants were asked to find items:

“Imagine you are taking a class titled “Great Britain and its Colonies in the Twentieth Century”. For this class you need to write a research paper on some aspect of the relationship between Great Britain and its Colonies in the Twentieth Century but you have yet to decide on one. Use the catalogue to find two possible topics for your paper. Then use the catalogue to find three books for each topic so that you might make a decision as to which topic to write about”.

In this work, we follow an approach similar to the one suggested in [164], utilising the two steps described above, as follows:

Designing the task template. We aim to design a generic task template that encourages layman users to seek knowledge in a domain unfamiliar to them. Therefore, we designed the task template in the context of a general knowledge quiz show where users need to acquire as much knowledge as they can. As discussed in Section 6.3, we identified the musical instrument domain (MusicPinta data graph) as our application context for implementing the subsumption algorithms to generate exploration paths for knowledge expansion. Accordingly, we designed the exploration task template in the musical instrument domain using MusicPinta to evaluate the generated exploration paths. Inspired by the task templates in [164], the task template presented in Table 7.1 was designed to suit the musical instrument domain.

Table 7.1 Task template used in the experimental user study

Task template

“Imagine that you are a member of a team which will take part in a general knowledge quiz show. You have been asked to explore two musical instruments for 20 minutes in order to prepare a short presentation to describe to your team what you have learned about these instruments”.

As can be seen in the task template, a user will be asked to explore two musical instruments (i.e. unfamiliar entities) and prepare a presentation for his/her team in a knowledge quiz show (i.e. familiar situation).

Identifying unfamiliar entities in the domain of the user study. The second step in designing the task is to identify unfamiliar topics (in our case, unfamiliar musical instruments) to be plugged into the task template in Table 7.1. For this, we ran a questionnaire with users to identify the unfamiliar entities in the String Instrument and Wind Instrument class hierarchies in the MusicPinta data graph. These two class hierarchies have the richest class representation in terms of the number of classes and the hierarchy depth, as discussed in Section 3.4.1, and have the highest number of knowledge anchors (9 anchors in the String Instrument class hierarchy and 10 anchors in the Wind Instrument class hierarchy, out of 24 anchors in the MusicPinta data graph).

We extracted class entities at the bottom quartile of the two class hierarchies (the depth of the two class hierarchies is 7, see Table 3.1, so entities of depth 6 or 7 are considered to be at the bottom quartile of the data graph). This is based on earlier cognitive science studies acknowledging that layman users are not familiar with specific objects in a domain [167]. Overall, 61 class entities from the String Instrument and Wind Instrument class hierarchies were used in the survey. The selected classes were randomised and distributed among twelve participants who were not experts in musical instruments (the participants had limited knowledge about musical instruments and may have seen the instruments, and none of them had played a musical instrument), using a survey which can be found in Appendix D.1. Each participant was asked to indicate his/her familiarity with each musical instrument by selecting one of the following options:

• High (You have good knowledge and have played on the instrument).
• Medium (You have some knowledge and have listened to the instrument).
• Low (You have limited knowledge and have seen the instrument).
• None of the above.

To identify unfamiliar instruments from each class hierarchy, we chose the entities which most users were not familiar with (i.e. users were not familiar at all with the instrument names). These entities were the musical instruments ‘Biwa’ (class hierarchy: String Instrument, origin: Japanese) and ‘Bansuri’ (class hierarchy: Wind Instrument, origin: Indian). These musical instruments were used in the task template in Table 7.1 and presented to the participants, as will be described in the next section.
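The selection procedure described above can be sketched in a few lines, under stated assumptions. The sketch reads "bottom quartile" as the depth levels strictly below three quarters of the maximum hierarchy depth, which reproduces depths 6 and 7 for a hierarchy of depth 7. It reads "most users were not familiar with" as the largest share of "None" answers. Both readings, all function names and the survey data shown are hypothetical; the actual responses are not reproduced here.

```python
from collections import Counter


def bottom_quartile_depths(max_depth):
    """Depth levels in the deepest quarter of a hierarchy (assumed reading)."""
    threshold = 0.75 * max_depth
    return [d for d in range(1, max_depth + 1) if d > threshold]


def classes_at_depths(depth_by_class, wanted_depths):
    """Names of classes whose depth is one of `wanted_depths`, sorted by name."""
    wanted = set(wanted_depths)
    return sorted(c for c, d in depth_by_class.items() if d in wanted)


def least_familiar(responses):
    """Entity with the largest share of 'None' ratings (ties broken alphabetically).

    `responses` maps an entity name to the list of ratings it received,
    each rating being one of 'High', 'Medium', 'Low' or 'None'.
    """
    def none_share(entity):
        ratings = responses[entity]
        return Counter(ratings)["None"] / len(ratings)

    # max() keeps the first maximal item, so sorting first makes ties deterministic.
    return max(sorted(responses), key=none_share)


# Made-up ratings for placeholder entities, for illustration only.
SURVEY = {
    "Instrument A": ["None", "None", "Low", "None"],
    "Instrument B": ["Low", "Medium", "None", "Low"],
    "Instrument C": ["High", "Medium", "Low", "Low"],
}

if __name__ == "__main__":
    print(bottom_quartile_depths(7))
    print(least_familiar(SURVEY))
```

Running `least_familiar` once per class hierarchy would yield one candidate entity for each, which mirrors the choice of one String Instrument and one Wind Instrument entity.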
