
This thesis proposes a mechanism for addressing the problem of maintaining coherence in open-ended dialogue between a conversational agent and its user. We use the Intelligent Interactive Toy [Adam and Ye, 2009; Adam and Cavedon, 2009; Adam et al., 2010a;b; Wong et al., 2012a;c] as a platform for testing the hypotheses formulated below. The Intelligent Interactive Toy, or simply the Toy, is a joint project between RMIT University and Realthing Entertainment Pty Ltd to produce a module-based conversational agent that can be embodied in a physical toy. The target audience for the Toy is children aged 8 to 12 years. The Toy can interact with users via conversational activities ranging from structured activities, such as storytelling and playing games, to unbounded “chatty” dialogue, such as responding to questions or talking about any topic that its knowledge units handle. The Toy pre-processes user inputs with shallow natural language processing techniques, while its responses are built from a pre-defined pool of conversational fragments. Part of a conversation that combines task- and nontask-oriented interaction and conversational fragments is shown in Figure 2.1. We provide more details on the Toy’s architecture and speech capabilities in the following sections.

2.1.1.1 Architecture

The Toy is modelled using a Belief-Desire-Intention (BDI) approach, which defines conversational goals that the agent attempts to achieve via multiple strategies. Goals in this context include: activating interactive activities; addressing commands or requests; and participating in open-ended conversations about a given topic [Wong et al., 2012a]. These goals differ in how the speakers’ inputs lead the system to achieve them. For instance, the goal of a cooking tutorial (describing the preparation method of a dish) is reached when the last instruction is confirmed (see turns 35 and 36 in Figure 2.1).

Two requirements of the Toy provide the central motivation for this thesis: a modular architecture for knowledge, which keeps the Toy extensible; and a mechanism for enhancing the conversation flow that can manage topic transitions (i.e. switching from one topic to another) in a coherent manner. We describe these requirements below.

A prominent aspect of the Toy is its modularity, not only of its architecture but also of its knowledge organisation; this dissertation is concerned with the latter.

04: SYSM: do you like chocolate?
05: USER: i love chocolate.
06: SYSM: i love it too. i wish i could eat it for breakfast...
07: USER: chocolate for breakfast? yuck!
08: SYSM: what do you eat for breakfast instead of chocolate?
09: USER: toast.
10: SYSM: what do you like to eat on your toast.
11: USER: ice cream.
12: SYSM: yum yum! delicious.
13: SYSM: do you want to make some ice cream?
14: USER: yeah.
15: SYSM: do you have the recipe with you?
16: USER: can you please tell me a story?
17: SYSM: sure. susie the ice cream moose was...
. . .
23: SYSM: that’s the end of the story.
24: SYSM: let’s continue with our cooking. so, do you have the recipe?
25: USER: no.
26: SYSM: you’ll need milk and chocolate.
27: USER: where does chocolate come from?
28: SYSM: chocolate comes from cacao trees.
29: SYSM: let’s continue with our cooking, do you need to get out a pan?
. . .
34: USER: it’s done!
35: SYSM: yum yum! your ice cream is ready.
36: SYSM: now, let’s talk about the zoo, ok?

Figure 2.1: A fragment of a conversation between the Toy and a user. Source: Wong et al. [2012c].

The architecture of the Toy is distributed in modules, as shown in Figure 2.2. In this architecture, a central component (the Interactor) processes user input, delegates messages and hands control of the conversation to the appropriate modules. Voice input and output are handled by the User-agent client (e.g. a mobile application). The Toy can be upgraded by “plugging” new Conversational-activity modules into it [Adam et al., 2010b; Wong et al., 2012c], each module containing its own knowledge for the capability it is built upon. For the purpose of this dissertation, a capability module comprises a knowledge representation structure, which we term a Modular Ontology, and a pool of conversational fragments associated with the module domain. The Modular Ontology defines the vocabulary that the system can understand and converse about, and also serves as a classification mechanism for the conversational fragments featured in the module. We describe the construction of these resources in Chapter 4.
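The relationship between the Interactor and pluggable capability modules can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the Toy's actual implementation: the class and method names (`CapabilityModule`, `Interactor`, `plug`, `candidates`, `topics_of`) are hypothetical, and a flat set of single-word terms stands in for the Modular Ontology.

```python
from dataclasses import dataclass, field


def _words(text: str) -> set[str]:
    """Lower-case the text and strip simple punctuation from each word."""
    return {w.strip(".,!?").lower() for w in text.split()}


@dataclass
class CapabilityModule:
    """Hypothetical module: a vocabulary (stand-in for a Modular Ontology)
    plus a pool of conversational fragments."""
    name: str
    vocabulary: set[str]
    fragments: list[str] = field(default_factory=list)

    def topics_of(self, text: str) -> set[str]:
        """Classify a fragment or an input by the vocabulary terms it contains."""
        return _words(text) & self.vocabulary


class Interactor:
    """Hypothetical central component that knows the plugged-in modules."""

    def __init__(self) -> None:
        self.modules: dict[str, CapabilityModule] = {}

    def plug(self, module: CapabilityModule) -> None:
        """Extend the agent with a new capability module."""
        self.modules[module.name] = module

    def candidates(self, user_input: str) -> list[str]:
        """Names of the modules whose vocabulary covers part of the input."""
        return [name for name, module in self.modules.items()
                if module.topics_of(user_input)]


cooking = CapabilityModule("cooking", {"chocolate", "milk", "recipe"},
                           ["you'll need milk and chocolate."])
zoo = CapabilityModule("zoo", {"lion", "zebra"})

toy = Interactor()
toy.plug(cooking)
toy.plug(zoo)
```

In this sketch, plugging in a new module changes what the agent can talk about without touching the `Interactor`, which is the extensibility property the text describes. Multi-word terms and real ontology structure are deliberately left out.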

[Figure: diagram with the components User, User-agent (ASR, Text-to-speech), Interactor, and the capability modules Role-games, Chat and Storytelling; each capability module has a Processing unit, Input points, Output points and Triggers.]

Figure 2.2: General architecture of the Toy.

2.1.1.2 Input and Output Processing in the Toy

For input analysis, the Toy uses shallow natural language processing based on simple parsing and keyword-spotting techniques [Adam et al., 2010b]. As the Toy is capable of dealing with both task-oriented activities and nontask-oriented conversation, each user utterance is processed as “chat” unless it matches an activity represented as a BDI plan. Each input is processed in terms of keyphrases, topics, sentiments and requests, using parsing tools and lemmatisers [Wong et al., 2012c]. At the same time, inputs are parsed against a collection of input grammars, which are defined for each domain-capability module. These grammars enable the Toy to detect potential triggers to start dialogue activities [Wong et al., 2012c]. For instance, the input grammar for a storytelling activity is “* tell * story *”, where * represents a wildcard matching a set of words that specify the activity (in this case, the story that the user wants to hear, e.g. “please tell me the story of Snow White”).
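As an illustration of how a wildcard grammar such as “* tell * story *” could be matched, the following sketch compiles the grammar into a regular expression and falls back to “chat” when no grammar matches. Translating grammars into regular expressions is an assumption made here for illustration; the source does not say how the Toy implements grammar matching, and the names `compile_grammar`, `detect_activity` and `GRAMMARS` are hypothetical.

```python
import re

# Hypothetical grammar table: activity name -> input grammar.
GRAMMARS = {"storytelling": "* tell * story *"}


def compile_grammar(grammar: str) -> re.Pattern:
    """Turn a wildcard grammar into a regex (assumed translation).

    Each "*" becomes a capture group matching any run of text, possibly
    empty; every other token must appear as a whole word.
    """
    pieces = []
    for token in grammar.split():
        if token == "*":
            pieces.append(r"(.*?)")
        else:
            pieces.append(r"\b" + re.escape(token) + r"\b")
    return re.compile(r"\s*".join(pieces), re.IGNORECASE)


def detect_activity(utterance: str, grammars: dict[str, str]):
    """Return (activity, wildcard contents), or ("chat", ()) if nothing matches."""
    for activity, grammar in grammars.items():
        match = compile_grammar(grammar).fullmatch(utterance.strip())
        if match:
            return activity, tuple(group.strip() for group in match.groups())
    return "chat", ()
```

For the example utterance from the text, `detect_activity("please tell me the story of Snow White", GRAMMARS)` returns the activity `"storytelling"` with the wildcard contents `("please", "me the", "of Snow White")`, while an utterance such as “i love chocolate.” falls through to `"chat"`.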

To produce a response in a dialogue, the Toy does not generate utterances; rather, it selects an output from a library of pre-scripted utterance “templates” called Conversational Fragments (CFs) [Adam et al., 2010b; Wong et al., 2012c]. Conversational Fragments are pieces of dialogue that were originally authored by Creative Media students at RMIT University [Adam et al., 2010b] and, in more recent work [Wong et al., 2012b], mined from question-answering websites. The use of CFs in dialogue is activated by triggers, which generally rely on user inputs and word-frequency statistics to determine the most appropriate fragment to use next. CFs can be either sequences or stand-alone pieces, according to the type of dialogue activity that the Toy is performing. This also relates to the source of the CFs: authored pieces of dialogue are assembled sequentially, while question-answer sentences are generally independent of each other. For instance, a story such as “Snow White” can be assembled as a sequential set of CFs, whereas the CFs used to respond to user questions about this story (e.g. “what were the names of the seven dwarves?”) are not necessarily part of the sequence. Rather, they are stand-alone pieces of dialogue that can be used without a specific reference to the story; in other words, the user may formulate a question like this at any time, and the system must be able to address it, regardless of the running dialogue activity. We describe CFs in more detail in Chapter 3.

Due to the modular architecture of the Toy, CFs are also distributed in modules. Each module has its own processing unit, which interprets how a conversation should proceed, as planned by the module designer. The processing unit can choose, given a user response, to continue the story, to stop responding to a question, or to finalise the activity. CFs can also be templates, containing variables that are replaced by instances from user input or conversational context. For instance, in the utterance “I know that the $ANIMAL is your favourite animal!”, the variable $ANIMAL can be instantiated with any specialisation of the concept Animal, such as Lion or Zebra.
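The template mechanism can be sketched with Python's standard `string.Template`, whose `$NAME` placeholder syntax happens to coincide with the `$ANIMAL` notation used above. The set `ANIMAL_INSTANCES` and the function `fill_fragment` are hypothetical; in the Toy, the admissible values would come from the ontology and not from a hard-coded set.

```python
from string import Template

# Hypothetical specialisations of the concept Animal (assumed, for illustration).
ANIMAL_INSTANCES = {"lion", "zebra"}

FRAGMENT = "I know that the $ANIMAL is your favourite animal!"


def fill_fragment(fragment: str, animal: str) -> str:
    """Instantiate $ANIMAL, accepting only known specialisations of Animal."""
    instance = animal.lower()
    if instance not in ANIMAL_INSTANCES:
        raise ValueError(f"{animal!r} is not a known specialisation of Animal")
    return Template(fragment).substitute(ANIMAL=instance)
```

Calling `fill_fragment(FRAGMENT, "Zebra")` yields “I know that the zebra is your favourite animal!”, while a value outside the concept, such as “toast”, is rejected before it reaches the user.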

2.1.1.3 Topic Management

Conversational Fragments are organised using the set of terms appearing in them, which we use to represent conversational topics. In this dissertation, we subscribe to the definition of topic of Bublitz [1989] (p. 39): “[...] an independent, usually continuous category which centres the attention of the participants in the conversation, links their linguistic contributions and establishes a connection between them (and with them)”. To us, a topic is a conceptualisation that responds to the question What have you been talking about? [Bublitz, 1989].

For example, conversational topics in Figure 2.1 are represented as bold-formatted words. From this figure, it can be observed that topics are maintained and exchanged according to inputs from users. These topics were originally classified using a handcrafted taxonomy [Adam et al., 2010b]. We improve the construction of these taxonomies in Chapter 4, using a domain-specific ontology, which we call a Modular Ontology or M-Onto. Modular Ontologies can be combined with other M-Ontos from available capability modules and with the core ontology contained in the Toy. Figure 2.3 shows an overview of the architecture of M-Ontos.

[Figure: a Capability Module containing a Modular Ontology (M-Onto) with domain-specific ontology components, Conversational Fragments (CF1, CF2, ..., CFn), Question-Answer Pairs (QAP1, QAP2, ..., QAPm), and other components.]

Figure 2.3: Overview of contents in a module. The M-Onto and the conversational fragments support the cognitive part of the Toy, while other components enrich the interactive experience using multimedia or other resources.

Modular Ontologies contribute to the Toy architecture by providing information contained in a particular domain, thus delimiting the information space that dialogue may cover. Adding more M-Ontos into the Toy enables it not only to consider topics from other domains, but also to find connections between domains. However, managing a conversation requires maintaining a certain degree of coherence; the Toy should not jump randomly between topics, otherwise the context may become unpredictable and users will find it hard to follow conversations. We analyse this requirement below.
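To make the idea of cross-domain connections concrete, the sketch below treats each M-Onto as a small undirected graph of related concepts, merges two of them, and measures how many links separate two topics. This is purely illustrative: the graph representation, the example concepts, and the use of breadth-first search are assumptions, not the mechanism developed in this thesis.

```python
from collections import deque

# Hypothetical M-Ontos, each mapping a concept to directly related concepts.
FOOD = {"chocolate": {"cacao tree", "ice cream"}, "ice cream": {"milk"}}
PLANTS = {"cacao tree": {"tree"}, "tree": {"forest"}}


def merge(*ontologies: dict[str, set[str]]) -> dict[str, set[str]]:
    """Combine ontologies into one undirected graph; shared concepts join them."""
    graph: dict[str, set[str]] = {}
    for ontology in ontologies:
        for concept, related in ontology.items():
            graph.setdefault(concept, set()).update(related)
            for other in related:
                graph.setdefault(other, set()).add(concept)
    return graph


def distance(graph: dict[str, set[str]], start: str, goal: str):
    """Number of links on a shortest path between two topics, or None."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:  # breadth-first search
        concept, hops = queue.popleft()
        if concept == goal:
            return hops
        for other in graph[concept]:
            if other not in seen:
                seen.add(other)
                queue.append((other, hops + 1))
    return None
```

With only `FOOD` loaded, “chocolate” and “forest” are unrelated (`distance` returns `None`); after merging in `PLANTS`, they are three links apart via the shared concept “cacao tree”. A dialogue manager could, for example, treat short distances as candidate coherent transitions, but that policy is an assumption of this sketch.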
