A textual case has at least one of its attributes (the problem, the solution, or both components) in free text form. As discussed at the beginning of Section 2.2, retrieval deals mainly with the problem component of cases to determine their similarity to a new problem, while reuse deals with the solution components. Text reuse is applicable when the solution is in free text form. The categorization in Figure 2.1 also applies to text reuse techniques; it is reproduced in Figure 2.2, now with citations of the relevant TCBR literature. Note that the work by Lamontagne & Lapalme (2003, 2004) is cited in two different categories since it proposes both substitutional and structural text reuse techniques. A key difference between the reuse categories shown for traditional and textual CBR is the absence of any current TCBR compositional reuse technique. Compositional reuse requires the combination of textual sub-solutions from several similar cases into a single meaningful solution in response to a query. Such combination of text from several cases is difficult to automate without loss of coherence and overall contextual meaning. The closest work we found related to compositional text reuse is the tutoring library system (Arshadi & Badie 2000) discussed earlier in Section 2.2.2. Here, chapters from different textbooks are combined into a new book in response to a user request. However, we regard this as structured CBR reuse because, during reuse, it utilises a pre-defined topic label (a symbolic attribute) attached to each book chapter rather than the chapter's textual contents.

The substitutional form of transformational reuse has been applied extensively to textual cases, especially when minimal adaptation is required. This involves the identification of specific terms in a retrieved textual solution and proposing suitable modifications

2.2. Reasoning with Problem-Solving Experiences 31

Reuse
  Non-Generative
    Verbatim / Retrieve-only
    Transformational
      Substitutional
        • Lamontagne & Lapalme, 2003, 2004
        • Zhang et al, 2008
        • DeMiguel et al, 2008 (ColibriCook)
      Structural
        • Lamontagne & Lapalme, 2003, 2004
        • Recio-García et al, 2007
        • Lamontagne et al, 2007
        • Ashley et al, 2009 (LARGO)
        • Bridge et al, 2009, 2010 (GhostWriter)
        • Dufour-Lussier et al, 2010 (TAAABLE)
    Compositional
      • ???
  Generative (Constructive / Replay)
    • Gervás et al, 2007

Figure 2.2: Categories of Text Reuse approaches

due to observed differences between the query and the retrieved problem. This approach was used for substitution-based adaptation in some TCBR applications that modify ingredients in order to recommend recipes that satisfy a user query (Adeyanju et al. 2008, DeMiguel, Plaza & Díaz-Agudo 2008, Zhang et al. 2008). It was also used to make suggestions for named entity substitution in a TCBR application for automated email response (Lamontagne & Lapalme 2003, Lamontagne & Lapalme 2004). The substitutional form of text reuse has also been used in some machine translation techniques. For instance, Example Based Machine Translation (EBMT) (Nagao 1984, Sumita, Iida & Kohyama 1990, Brown 1996, Zhang, Brown & Frederking 2001), given some text to translate into another language, retrieves similar text that has previously been translated. A dictionary is then used to substitute the words or phrases that differ between the problem and the retrieved text in the newly generated translation.
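The EBMT substitution step described above can be illustrated with a minimal sketch. It is not the method of any of the cited systems: the example pairs, the tiny bilingual dictionary, the function names `retrieve` and `translate_by_substitution`, and the use of Python's `difflib` as a stand-in similarity measure are all assumptions made for illustration, and only one-to-one word mismatches are handled.

```python
from difflib import SequenceMatcher

# Hypothetical example base: (source tokens, previously produced translation).
EXAMPLES = [
    ("i like red apples".split(), "me gustan las manzanas rojas".split()),
    ("we eat fresh bread".split(), "comemos pan fresco".split()),
]
# Hypothetical bilingual dictionary (source word -> target word).
DICTIONARY = {"red": "rojas", "green": "verdes"}


def retrieve(query_tokens, examples):
    """Return the example whose source text is most similar to the query."""
    return max(
        examples,
        key=lambda ex: SequenceMatcher(None, query_tokens, ex[0]).ratio(),
    )


def translate_by_substitution(query, examples, dictionary):
    """Reuse a retrieved translation, substituting mismatched words."""
    query_tokens = query.lower().split()
    source, target = retrieve(query_tokens, examples)
    output = list(target)
    # Align the retrieved source text with the query to find mismatches.
    opcodes = SequenceMatcher(None, source, query_tokens).get_opcodes()
    for tag, i1, i2, j1, j2 in opcodes:
        # Assumption: only equal-length replacements are adapted.
        if tag != "replace" or (i2 - i1) != (j2 - j1):
            continue
        for old_word, new_word in zip(source[i1:i2], query_tokens[j1:j2]):
            old_target = dictionary.get(old_word)
            new_target = dictionary.get(new_word)
            # Substitute only when both sides are known to the dictionary.
            if old_target in output and new_target is not None:
                output[output.index(old_target)] = new_target
    return " ".join(output)


print(translate_by_substitution("I like green apples", EXAMPLES, DICTIONARY))
```

The sketch leaves the retrieved translation untouched wherever the dictionary has no entry, which mirrors the minimal-adaptation character of substitutional reuse; insertions and deletions would fall under structural reuse instead.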

The structural form of transformational reuse in the context of TCBR involves proposing adaptations to a textual solution that go beyond substitution. In other words, there might be suggestions to delete and/or insert terms in specific sections of a retrieved textual solution without necessarily replacing other sections, thereby changing the overall structure of the solution. More sophisticated strategies might also consider the impact of these structural changes on other parts of the solution. A form of structural text reuse was proposed for report writing applied to the investigation of air travel incidents (Recio-García et al. 2007); we denote this technique as jCOLIBRI-Reuse for ease of comparison with others. Here, incident reports have common headings for each section of one or more paragraphs. Text reuse is facilitated by presenting clusters of similar text from other documents for each section while a user is modifying the best match case's solution. This enables the manual reuse of text from several documents, thereby altering the overall structure of each section. Though an intuitive form of text reuse, this approach is restrictive since it cannot be used in the absence of common sectional headings across the casebase.

Lamontagne and Lapalme (2004) demonstrated Case Grouping (CG), a form of structural text reuse, on a semi-automated email response application. This involves the reuse of previous email messages to synthesize new responses to incoming requests. Sentences in a retrieved solution are labelled as reusable or not depending on whether there is sufficient evidence that previous similar problems contain such sentences. Reuse evidence for each sentence is computed by comparing the centroids of two clusters (support and reject) to the query. Only cases that have a similar sentence in their solution belong to the support cluster, while the reject cluster contains all other cases. Although the use of similarity knowledge to guide text reuse is novel, CG uses the entire casebase to determine if a sentence can be reused. This will be computationally expensive and seems counter-intuitive, since cases with no similarity to the query or to the retrieved solution will contribute to reuse evidence. However, such an approach is likely to guide reuse towards generic solutions. CG is also more generic than jCOLIBRI-Reuse, since its clusters are not restricted to domains with a common template structure.

The GhostWriter systems (Bridge & Waugh 2009, Healy & Bridge 2010) aid text reuse by suggesting features and values or phrases during the authoring of a product description for trading (Bridge & Waugh 2009) or of its review after purchase (Healy & Bridge 2010). Features, feature values or noun phrase suggestions are iteratively extracted from the top previous similar cases using manually-defined regular expressions or shallow NLP techniques. The list of suggestions is limited using a set of criteria, such as ensuring that a suggestion is absent from the solution being authored at that point. Another criterion ensures that phrases are not too short. This approach is similar to jCOLIBRI-Reuse in the sense that suggestions for text reuse change continuously while a user is authoring a new solution. However, users start with an empty solution in the GhostWriter systems rather than a retrieved solution text. Another commonality between GhostWriter and jCOLIBRI-Reuse is that the problem and solution share a common vocabulary. This is unlike CG, where the problem and solution vocabularies are separate though they might share some common terms.

Other forms of structural text reuse have involved the use of translation models for word alignment in incident reports (Lamontagne, Bentebibel, Miry & Despres 2007), diagrammatic representation for legal texts (Ashley, Lynch, Pinkwart & Aleven 2009) and formal concept analysis for adapting recipes (Dufour-Lussier et al. 2010).
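To make the Case Grouping description above more concrete, the following is a rough sketch of how reuse evidence for a sentence might be computed from support and reject clusters. It is not Lamontagne and Lapalme's implementation: the bag-of-words representation, cosine similarity, the `sent_threshold` value, the decision rule (support centroid closer to the query than reject centroid) and every function name are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a)  # Counter yields 0 for missing terms
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Mean vector of a cluster; an empty cluster gives an empty vector."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return Counter({t: c / len(vectors) for t, c in total.items()})


def label_sentences(query, retrieved_solution, casebase, sent_threshold=0.8):
    """Label each retrieved sentence as reusable (True) or not (False).

    casebase is a list of (problem_text, [solution_sentences]) pairs.
    """
    q = bow(query)
    labels = {}
    for sentence in retrieved_solution:
        s = bow(sentence)
        support, reject = [], []
        # Every case in the casebase lands in exactly one cluster.
        for problem, solution in casebase:
            similar = any(cosine(s, bow(o)) >= sent_threshold for o in solution)
            (support if similar else reject).append(bow(problem))
        # Assumed rule: reuse if the query is closer to the support centroid.
        labels[sentence] = cosine(q, centroid(support)) > cosine(q, centroid(reject))
    return labels


CASEBASE = [
    ("how do i reset my password",
     ["click the reset link on the login page", "thank you for contacting support"]),
    ("i forgot my password please help",
     ["click the reset link on the login page", "thank you for contacting support"]),
    ("where is my invoice for last month",
     ["invoices are listed under billing", "thank you for contacting support"]),
]
retrieved = CASEBASE[0][1]
print(label_sentences("cannot remember my password", retrieved, CASEBASE))
print(label_sentences("where can i find my invoice", retrieved, CASEBASE))
```

Because the inner loop visits every case for every sentence, the sketch also shows where the computational cost noted above comes from, and why a sentence shared by all cases (an empty reject cluster) tends to be marked reusable for any query.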

Verbatim reuse is the most common form of text reuse, and previous TCBR systems that fall under this category include ExperienceBook (Kunze & Hubner 1998), FallQ (Lenz & Burkhard 1997, Lenz, Hubner & Kunze 1998), FAQ Finder (Burke et al. 1997), CATO (Brüninghaus & Ashley 1998b), PRUDENTIA (Weber et al. 1998), DRAMA (Wilson 2000), InRet (Wilson, Carthy, Abbey, Sheppard, Wang, Dunnion & Drummond 2003) and SMILE (Brüninghaus & Ashley 2005). In these systems, the indexing and retrieval mechanisms encode most of the semantics in the textual cases, so that retrieved cases are semantically similar, but they give no assistance on how to adapt the retrieved solution. To the best of our knowledge, there is only one publication (Gervás, Hervás & Recio-García 2007) that can be viewed as constructive or generative TCBR reuse. Here, the main idea is to use Natural Language Generation (NLG) during TCBR reuse. This is done by first transforming textual cases into a structured representation, then reusing and adapting them with any of the traditional CBR techniques, and finally applying NLG to convert the adapted solution (in structured form) back into natural language. However, no experimental evidence was reported to justify the success of this approach. The approach is very knowledge intensive with limited applicability because not all textual domains (e.g.
