A comparison expresses a relation between two entities, and this relation can be of different types. Linguistics distinguishes between scalar and non-scalar comparisons. Non-scalar comparisons are concerned with whether the compared objects are identical. There are two possible outcomes. The compared objects can either be the same (comparison of equality) or different (comparison of inequality):

(2.23) a. “X is the same as Y” (equality)

b. “X is different from Y” (inequality)

36 2. Sentiment Analysis

Alternatively, a comparison can place the two objects on a scale relative to each other (scalar comparison). Scalar comparisons are usually expressed by a gradable adjective or adverb, e.g., “big”. Adjectives or adverbs that denote absolutes, e.g., “dead”, or those that already designate the highest grade, e.g., “excellent”, cannot be used in scalar comparisons. As in non-scalar comparisons, the objects in scalar comparisons can be either identical or different. When the compared objects are not identical, we can look at their relative placement on the scale: the comparison can be one of either superiority or inferiority³. There is also a special case in which the object in question is not compared to a single other object (term comparison) but is deemed to be the best out of a whole set (set comparison). In English, the comparative and superlative forms of adjectives and adverbs are constructed with the inflectional suffixes “-er” (comparative) and “-est” (superlative) or with the analytic markers “more”, “less” (comparative) and “most”, “least” (superlative). As a result, we have the following possible types of scalar comparisons, given here for the examples “big” and “powerful”, adapted from Huddleston (2002):

(2.24) a. “X is as big as Y”, “X is as powerful as Y” (equality)

b. “X is bigger than Y”, “X is more powerful than Y” (inequality, superiority, terms)

c. “X is less big than Y”, “X is less powerful than Y” (inequality, inferiority, terms)

d. “X is the biggest (of all Z)”, “X is the most powerful (of all Z)” (inequality, superiority, set comparison)

e. “X is the least big (of all Z)”, “X is the least powerful (of all Z)” (inequality, inferiority, set comparison)
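The five scalar patterns in (2.24) can be generated mechanically from an adjective. The following is a minimal sketch of that idea, not part of the cited work: the function name `scalar_comparisons` and the dictionary keys are hypothetical, and inflected forms are passed in by hand because English spelling rules (e.g., consonant doubling in “bigger”) are not modeled here. When no inflected form is given, the analytic markers “more” and “most” are used.

```python
def scalar_comparisons(adjective, comparative=None, superlative=None):
    """Return the five scalar comparison patterns for a gradable adjective.

    `comparative` and `superlative` are the inflected forms ("bigger",
    "biggest"). If they are omitted, the analytic markers "more" and "most"
    are used instead ("more powerful", "most powerful").
    """
    comp = comparative or f"more {adjective}"
    sup = superlative or f"most {adjective}"
    return {
        "equality": f"X is as {adjective} as Y",
        "superiority, term": f"X is {comp} than Y",
        "inferiority, term": f"X is less {adjective} than Y",
        "superiority, set": f"X is the {sup} (of all Z)",
        "inferiority, set": f"X is the least {adjective} (of all Z)",
    }


# Inflectional forms are supplied explicitly; analytic forms are the default.
big = scalar_comparisons("big", comparative="bigger", superlative="biggest")
powerful = scalar_comparisons("powerful")
```

Note that inferiority is always expressed analytically (“less”, “least”), which is why only the superiority forms take an optional inflected variant.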

For automatic processing in sentiment analysis, most work follows the comparison types proposed by Jindal and Liu (2006b) and further clarified by Liu (2015). Instead of scalar and non-scalar comparisons, they use the categories of gradable and non-gradable comparisons as the major distinction. Gradable comparisons are those that place the two entities on a scale and introduce a ranking between them. They come in three types. The first type is what they call non-equal gradable comparisons, which includes term comparisons of both superiority and inferiority. The second type includes superlatives, i.e., set comparisons of superiority or inferiority. The third type refers to comparisons of equality, called equatives. The first two types each have two subtypes that indicate the direction of the relation, which they call increasing and decreasing comparatives. Liu (2015) provides the following examples:

(2.25) a. “Coke tastes better than Pepsi.” (non-equal gradable, increasing)

³ Also called comparisons of majority and minority (Cuzzolin and Lehmann, 2004). Sometimes the class of comparisons also includes the elative “X is very big”, the excessive “X is too big” (Heine, 1997), or the assetive “X is big enough” (Bakhshandeh and Allen, 2015).

2.5. Comparisons in product reviews 37

b. “Coke tastes the best among all soft drinks.” (superlative)

c. “Coke and Pepsi taste the same.” (equative)

Non-gradable comparisons express a difference between two entities, but do not rank the entities. Again, there are three types. The first type compares two entities in a shared aspect. The second type states that entity X has aspect A and entity Y has a similar aspect B. The third type states that entity X has aspect A and entity Y does not have aspect A. Liu (2015) provides the following examples:

(2.26) a. “Coke tastes differently from Pepsi.” (shared aspect “taste”)

b. “Desktop PCs use external speakers but laptops use internal speakers.” (similar aspects “external speakers” and “internal speakers”)

c. “Nokia phones come with earphones, but iPhones do not.” (aspect “earphones”)
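For annotation or classification, the gradable and non-gradable types described above can be written down as a small label scheme. The sketch below is only one possible encoding: the class and member names (`ComparisonType`, `Direction`, `ComparisonLabel`, `SHARED_ASPECT`, and so on) are illustrative and do not come from Jindal and Liu (2006b) or Liu (2015). It encodes the one constraint stated in the text, namely that only non-equal gradable and superlative comparisons have a direction subtype.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ComparisonType(Enum):
    # Gradable types
    NON_EQUAL_GRADABLE = "non-equal gradable"
    SUPERLATIVE = "superlative"
    EQUATIVE = "equative"
    # Non-gradable types (member names are illustrative)
    SHARED_ASPECT = "non-gradable, shared aspect"
    SIMILAR_ASPECTS = "non-gradable, similar aspects"
    MISSING_ASPECT = "non-gradable, aspect present in only one entity"


class Direction(Enum):
    INCREASING = "increasing"
    DECREASING = "decreasing"


# Only these two types carry a direction subtype.
DIRECTED_TYPES = {ComparisonType.NON_EQUAL_GRADABLE, ComparisonType.SUPERLATIVE}


@dataclass(frozen=True)
class ComparisonLabel:
    sentence: str
    ctype: ComparisonType
    direction: Optional[Direction] = None

    def __post_init__(self):
        # Reject a direction on types that do not rank the entities.
        if self.direction is not None and self.ctype not in DIRECTED_TYPES:
            raise ValueError(f"{self.ctype.value} comparisons have no direction")

    @property
    def is_gradable(self) -> bool:
        return self.ctype in DIRECTED_TYPES or self.ctype is ComparisonType.EQUATIVE


# The examples from (2.25) and (2.26); a direction is set only where the text gives one.
examples = [
    ComparisonLabel("Coke tastes better than Pepsi.",
                    ComparisonType.NON_EQUAL_GRADABLE, Direction.INCREASING),
    ComparisonLabel("Coke tastes the best among all soft drinks.",
                    ComparisonType.SUPERLATIVE),
    ComparisonLabel("Coke and Pepsi taste the same.", ComparisonType.EQUATIVE),
    ComparisonLabel("Coke tastes differently from Pepsi.",
                    ComparisonType.SHARED_ASPECT),
    ComparisonLabel("Nokia phones come with earphones, but iPhones do not.",
                    ComparisonType.MISSING_ASPECT),
]
```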

In sentiment analysis research, the automatic classification of comparison types and direction has received more attention than the previous two tasks. Most work on the topic has focused on gradable comparisons only.

Jindal and Liu (2006b) introduce this scheme of comparison types and are the first to present a system that assigns each gradable comparison one of the types non-equal gradable, equative, and superlative. They use a Naive Bayes classifier with comparison keywords as features. The classification of the direction of the comparison is presented in follow-up work by Ganapathibhotla and Liu (2008). They use hand-crafted rules based on the polarity of the predicate to determine which entity is preferred in non-equal gradable and superlative sentences. Yang and Ko (2011a) adapt the same method for Korean, but they add the labels similarity, pseudo-comparison (i.e., metalinguistic comparison), and implicit (i.e., juxtaposition), for a total of seven classes. They do not determine the direction of the comparison.
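To make the idea of a Naive Bayes classifier over comparison keywords concrete, the following toy sketch trains on six sentences and predicts one of the three gradable types. It is an illustration under strong assumptions, not a reimplementation of Jindal and Liu (2006b): the keyword list `KEYWORDS`, the class `KeywordNaiveBayes`, the add-one smoothing, and the training sentences are all chosen here for illustration, and the source does not specify the actual feature set.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical keyword list; the real feature set is not given in the text.
KEYWORDS = ["than", "more", "less", "better", "worse",
            "best", "worst", "most", "least", "same", "as"]


def keyword_features(sentence):
    """Return the comparison keywords that occur in the sentence."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return [k for k in KEYWORDS if k in tokens]


class KeywordNaiveBayes:
    """Multinomial-style Naive Bayes over keyword occurrences, add-one smoothing."""

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for sentence, label in zip(sentences, labels):
            self.feature_counts[label].update(keyword_features(sentence))
        return self

    def predict(self, sentence):
        feats = keyword_features(sentence)
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods of observed keywords
            denom = sum(self.feature_counts[label].values()) + len(KEYWORDS)
            score = math.log(n / total)
            for f in feats:
                score += math.log((self.feature_counts[label][f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


train = [
    ("Coke tastes better than Pepsi.", "non-equal gradable"),
    ("This camera is more expensive than that one.", "non-equal gradable"),
    ("Coke tastes the best among all soft drinks.", "superlative"),
    ("This is the most powerful phone.", "superlative"),
    ("Coke and Pepsi taste the same.", "equative"),
    ("X is as big as Y.", "equative"),
]
model = KeywordNaiveBayes().fit([s for s, _ in train], [y for _, y in train])
prediction = model.predict("Laptop A is worse than laptop B.")
```

With this toy data, the keyword “than” is what pulls the unseen sentence towards the non-equal gradable class; a realistic system would need far more training data and a proper keyword inventory.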

In their work on Chinese, Hou and Li (2008) do not assign comparison types, but distinguish five possible relation types between entity 1 and entity 2, which roughly correspond to the comparison types of Jindal and Liu (2006b): better, worse, same, best, and worst. The assignment of a type is based on the tokens extracted for their two-part predicates (predicate and ‘sentiment word’). They do not elaborate, but the processing seems to use manual categories assigned to the extracted words.

The above approaches assign the type and/or direction of a comparison as part of collecting detailed information about comparisons. Other research has been done in a slightly different framework in which pairs of entities are extracted and the focus is on determining which of the entities is the preferred one. In this framework, there will always be two entities, whereas in the above approaches one or more of the entities may be implicit. For example, the sentence “It is the best”, where the second entity is implicit, will be considered a comparison by the above approaches, but not by the approaches that assume two entities.

A baseline method for direction identification looks up the comparison word in a sentiment dictionary (Kurashima et al., 2008; Zhang et al., 2009, 2010). If the word has positive polarity, the first entity is preferred; otherwise, the second entity is preferred. To capture contextual influences, a polarity classifier can be applied to the sentence (Zhang et al., 2013). Alternatively, if labeled data is available, a classifier can be trained to sort comparisons into the classes similar, different, and neither (Feldman et al., 2007). None of these works directly evaluates the performance of its direction classification.
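The dictionary baseline described above amounts to a single lookup. The sketch below illustrates it with a hypothetical four-word lexicon (`POLARITY`) and a hypothetical function name (`preferred_entity`); real systems would use a full sentiment dictionary. Returning `None` for words missing from the lexicon is a choice made here, since the text does not say how unknown words are handled.

```python
# Hypothetical miniature sentiment dictionary (+1 positive, -1 negative).
POLARITY = {"better": 1, "nicer": 1, "worse": -1, "uglier": -1}


def preferred_entity(entity1, entity2, comparison_word):
    """Dictionary baseline: positive word -> first entity, otherwise second.

    Returns None if the comparison word is not in the dictionary
    (an assumption of this sketch, not specified in the cited work).
    """
    polarity = POLARITY.get(comparison_word.lower())
    if polarity is None:
        return None
    return entity1 if polarity > 0 else entity2


# "Coke tastes better than Pepsi." -> the first entity is preferred.
winner = preferred_entity("Coke", "Pepsi", "better")
```

Because the lookup ignores context, a sentence such as “Coke is not better than Pepsi” would be misread, which is the motivation for applying a sentence-level polarity classifier instead.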

Xu et al. (2009, 2011) classify the comparative relation between two entities into four classes: three types of relations between two entities (better, worse, same), plus no_comparison which can be assigned if there is no relation between the entities. Xu et al. (2009) use a multiclass SVM and a maximum entropy model. Their features include manually defined comparison keywords, POS tags, the entity tokens and entity types. Xu et al. (2011) improve on their previous work by using a CRF and additional features from token form (capitalization, numbers, prefixes and suffixes) and dependency parses (syntactic paths between the entities, grammatical roles).
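The token-form features mentioned for Xu et al. (2011) (capitalization, numbers, prefixes, and suffixes) are typical inputs for a CRF. As a rough illustration of what such features might look like for a single token, consider the sketch below. The function name `token_form_features`, the feature names, and the affix length of three characters are assumptions made here, not details taken from the paper.

```python
def token_form_features(token, affix_len=3):
    """Surface-form features for one token (illustrative feature names)."""
    return {
        "is_capitalized": token[:1].isupper(),   # first character is uppercase
        "is_all_caps": token.isupper(),          # e.g. acronyms such as "LCD"
        "has_digit": any(c.isdigit() for c in token),
        "prefix": token[:affix_len].lower(),
        "suffix": token[-affix_len:].lower(),
    }


# Product names often show distinctive casing and digits.
features = token_form_features("iPhone4")
```

In a sequence labeler, a dictionary like this would be computed for every token in the sentence and combined with the keyword, POS, and dependency features described above.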

Tkachenko and Lauw (2014) identify which of the two entities in a relation is preferred. They use a generative model based on Gibbs sampling; their features are based on the position of words around the entities plus some negation treatment. In follow-up work, Tkachenko and Lauw (2015) only classify whether a relation exists, not the direction of the relation. They propose a dependency tree kernel for SVMs that allows ‘skip-nodes’, where a node in the tree may be removed or replaced with a general placeholder, which makes the approach better suited to capturing similarities between dependency trees. They compare their approach to the system of Jindal and Liu (2006a) and report considerable improvements.
