average of scores of those questions in the series:
\[
\mathit{FinalScore} = \frac{1}{3} \times \mathit{FactoidScore} + \frac{1}{3} \times \mathit{ListScore} + \frac{1}{3} \times \mathit{OtherScore} \tag{5.6}
\]

TAC 2008: Opinion Questions

In 2008, NIST made a major change to its tracks: the QA track was split from TREC and combined with the Document Understanding Conference (DUC) in the new Text Analysis Conference (TAC) [25]. The objective of TAC QA was identical to that of TREC QA: participants were required to retrieve the answers to a set of questions.
The major change concerned the question types. TAC question series asked for people’s opinions about a particular target, and the answers were retrieved from blog data. There were two types of questions: rigid list questions, which asked for exact instances of a specified type, and squishy list questions, which asked for answer snippets within a certain length. For example, question series 1047, “Trader Joe’s”, comprised the following questions:
• RIGID Who likes Trader Joe’s?
• SQUISHY Why do people like Trader Joe’s?
• RIGID Who doesn’t like Trader Joe’s?
• SQUISHY Why don’t people like Trader Joe’s?
The evaluation of rigid list questions was the same as that of list questions, while the evaluation of squishy list questions was the same as that of definition questions. The final evaluation score was based on the scores of individual series, each of which was the average of the scores of the two tasks:
\[
\mathit{FinalScore} = \frac{1}{2} \times \mathit{RigidListScore} + \frac{1}{2} \times \mathit{SquishyListScore} \tag{5.7}
\]
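The per-series score in Equation (5.7) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `series_score` and its parameter names are our own, not part of any official TAC scoring tool, and the two inputs are assumed to have been computed by the respective task evaluations.

```python
def series_score(rigid_list_score: float, squishy_list_score: float) -> float:
    """Equal-weight average of the two task scores for one series, as in Eq. (5.7).

    Both arguments are assumed to be precomputed scores for the rigid list
    and squishy list tasks of the same question series.
    """
    return 0.5 * rigid_list_score + 0.5 * squishy_list_score


if __name__ == "__main__":
    # Illustrative input values only; they are not results from any evaluation.
    print(series_score(0.2, 0.4))
```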
5.3. Brief Overview of Alyssa QA System
Our group has developed a statistically inspired open-domain QA system (Alyssa) to participate in the TREC/TAC QA evaluations. This section gives a brief overview of how the Alyssa system works.
For the TAC 2008 QA track, Alyssa was modified to meet the new requirements of opinion questions. Figure 5.1 shows the architecture of Alyssa. Alyssa 2008 defined two streams: an adapted version of our 2007 factoid stream [85] (the main stream) and a completely new stream designed for questions asking for bloggers (the blogger stream). Blogger question detection classified questions into two types using a rule-based approach. Questions asking for bloggers ran through both the main stream and the blogger stream, whereas the other questions ran only through the main stream.
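The routing between the two streams can be illustrated with a small Python sketch. The actual detection rules used in Alyssa are not given here, so the single rule below (a question starting with "who" is treated as asking for bloggers) is purely a hypothetical stand-in, as are the names `is_blogger_question` and `route`.

```python
import re
from typing import List

# Hypothetical rule, for illustration only: questions that begin with "who"
# are treated as asking for bloggers. The real rule set is not specified here.
BLOGGER_PATTERN = re.compile(r"^\s*who\b", re.IGNORECASE)


def is_blogger_question(question: str) -> bool:
    """Rule-based check for whether a question asks for bloggers."""
    return bool(BLOGGER_PATTERN.search(question))


def route(question: str) -> List[str]:
    """Return the streams a question runs through.

    Every question goes through the main stream; blogger questions
    additionally go through the blogger stream.
    """
    streams = ["main"]
    if is_blogger_question(question):
        streams.append("blogger")
    return streams


if __name__ == "__main__":
    for q in ["Who likes Trader Joe's?", "Why do people like Trader Joe's?"]:
        print(q, "->", route(q))
```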
The main stream comprised eight main modules: Question Analysis, Semantic/Polarity Question Typing, Query Construction and Expansion, Document Retrieval, Sentence Retrieval, Sentence Annotation, Answer Extraction, and Answer Validation. We first performed a linguistic analysis to generate structured representations of the questions. The results of syntactic parsing and NE tagging were used later for answer extraction. The semantic type of a question was determined in a separate step called semantic question typing. We adopted a model using support vector machines (SVM), which produced a higher classification accuracy both on the sample questions provided by NIST for the TAC 2008 competition and on our own set of opinion questions. Besides semantic question typing, the polarity of opinion questions was determined by the polarity question typing component. A query was then formulated from the question using the results of these analyses.
Following query construction, we applied query expansion techniques based on Google and Wikipedia. The expanded query was then used for document retrieval on the Blog06 corpus. Dynamic document fetching [85] determined the number of retrieved documents according to the question type. The sentence retrieval component retrieved relevant sentences based on language modeling.
The opinion sentence retrieval module selected opinionated sentences from the retrieved sentences. Sentence polarity classification was then applied to these opinionated sentences to classify them as positive or negative. The sentences with the same polarity as their question were chosen for further processing.
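The selection step described above amounts to a two-stage filter, which can be sketched as follows. This is a minimal illustration, not the Alyssa implementation: the `Sentence` class and its fields are hypothetical, and the opinion and polarity labels are assumed to have been assigned already by upstream classifiers that are not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sentence:
    text: str
    opinionated: bool  # assumed output of an opinion classifier (not shown)
    polarity: str      # assumed output of a polarity classifier: "positive" or "negative"


def select_sentences(sentences: List[Sentence], question_polarity: str) -> List[Sentence]:
    """Keep opinionated sentences whose polarity matches the question's polarity."""
    return [
        s for s in sentences
        if s.opinionated and s.polarity == question_polarity
    ]


if __name__ == "__main__":
    # Made-up example sentences, for illustration only.
    retrieved = [
        Sentence("I love shopping there.", True, "positive"),
        Sentence("The store opens at nine.", False, "positive"),
        Sentence("The parking lot is awful.", True, "negative"),
    ]
    for s in select_sentences(retrieved, "positive"):
        print(s.text)
```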
Squishy list questions did not require exact short answers, so we directly employed squishy answer extraction to generate answers. Answering rigid list questions required one of two types of linguistic processing. If the question asked for an NE from the entertainment domain, we automatically annotated the retrieved documents with NEs of the corresponding types. Otherwise, only the opinionated sentences with the correct polarity were annotated.
After the extraction of candidate answers from the annotated documents or sentences, duplicate removal was applied. Our web-based answer validation component then re-ranked the resulting list of unique candidate answers to produce the final answers to rigid list questions.
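The two post-processing steps can be sketched as follows. The sketch makes two assumptions that are not taken from the system description: duplicates are detected by comparing case-folded, whitespace-normalized strings, and the web-based validation component is represented by an arbitrary scoring function passed in as `validation_score`.

```python
from typing import Callable, List


def deduplicate(candidates: List[str]) -> List[str]:
    """Remove duplicate candidates, keeping the first occurrence of each.

    Assumption: two candidates are duplicates if they are equal after
    lowercasing and collapsing whitespace.
    """
    seen = set()
    unique = []
    for candidate in candidates:
        key = " ".join(candidate.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(candidate)
    return unique


def rerank(candidates: List[str], validation_score: Callable[[str], float]) -> List[str]:
    """Deduplicate, then sort by validation score in descending order.

    `validation_score` is a hypothetical stand-in for the answer
    validation component.
    """
    return sorted(deduplicate(candidates), key=validation_score, reverse=True)


if __name__ == "__main__":
    # Made-up candidates and scores, for illustration only.
    scores = {"Alice": 0.2, "Bob": 0.9}
    print(rerank(["Alice", "alice ", "Bob"], lambda c: scores[c]))
```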
The blogger stream of Alyssa started after the document fetching step of the main stream. In the blogger stream, the retrieved documents underwent blogger detection, which split each document into smaller segments and identified the author/blogger of each segment. Each segment was assigned three scores estimated by three different components: topic relevance ranking, which searched for the segments relevant to the question; opinion classification, which computed the degree of opinionatedness; and polarity classification, which measured how much the polarity of a segment overlapped with the polarity of its question. An interpolation of the scores from the individual components was assigned to each segment. Finally, the segments were ranked in the blogger ranking according to those scores. In the fusion module, the results for blogger questions were merged with the output of the main stream, creating a single list for blogger rigid list questions.
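The segment scoring and ranking can be illustrated with a short sketch. The form of the interpolation and its weights are not specified above, so the sketch assumes a linear interpolation with equal weights; the `Segment` class, the function names, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    blogger: str
    relevance: float  # topic relevance score (assumed precomputed)
    opinion: float    # degree of opinionatedness (assumed precomputed)
    polarity: float   # polarity overlap with the question (assumed precomputed)


# Assumption: equal weights; the actual interpolation weights are not given.
EQUAL_WEIGHTS = (1 / 3, 1 / 3, 1 / 3)


def interpolate(segment: Segment, weights: Tuple[float, float, float] = EQUAL_WEIGHTS) -> float:
    """Linear interpolation of the three component scores of a segment."""
    w_rel, w_op, w_pol = weights
    return w_rel * segment.relevance + w_op * segment.opinion + w_pol * segment.polarity


def rank_segments(segments: List[Segment],
                  weights: Tuple[float, float, float] = EQUAL_WEIGHTS) -> List[Segment]:
    """Rank segments by their interpolated score, highest first."""
    return sorted(segments, key=lambda s: interpolate(s, weights), reverse=True)


if __name__ == "__main__":
    # Made-up segments, for illustration only.
    segments = [
        Segment("blogger_a", 0.1, 0.1, 0.1),
        Segment("blogger_b", 0.9, 0.9, 0.9),
    ]
    for seg in rank_segments(segments):
        print(seg.blogger, round(interpolate(seg), 3))
```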
For more details about the components and evaluations of our systems, refer to our participation papers on QA tracks [22,85,109].