
In this section we describe the characteristics of the data sets used in our evaluation. Some are useful for multiple emotion detection tasks (e.g. emotion ranking, emotion classification), whilst others are useful only for a particular task.

3.6.1 Emotion Detection Datasets

We present publicly available data sets that are either manually labelled or obtained through distant supervision, a methodology that automatically gathers weakly-labelled emotion data. This is possible given the abundance of loosely tagged data (e.g. tweets) carrying emotion markers (e.g. emotional hashtags, emoticons) on social media (e.g. Twitter). In the related field of Sentiment Analysis, research has demonstrated the usefulness of distantly labelled data for learning accurate supervised models [18],[45]. Therefore, in this research we also leverage the availability of weakly-labelled emotion data on social media to learn word-emotion lexicons.
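As a minimal sketch of how distant supervision with emotion markers might work, the snippet below assigns a weak label to a tweet from its hashtags. The hashtag-to-emotion map and the function name `weak_label` are hypothetical, chosen for illustration only; the markers actually used to build the corpora are not specified here.

```python
import re

# Hypothetical marker map (an assumption, not the one used for the corpora).
HASHTAG_TO_EMOTION = {
    "#angry": "anger",
    "#scared": "fear",
    "#happy": "joy",
    "#sad": "sadness",
}

HASHTAG_PATTERN = re.compile(r"#\w+")


def weak_label(tweet):
    """Return (cleaned_text, emotion) if exactly one emotion is marked, else None."""
    tags = [t.lower() for t in HASHTAG_PATTERN.findall(tweet)]
    emotions = {HASHTAG_TO_EMOTION[t] for t in tags if t in HASHTAG_TO_EMOTION}
    if len(emotions) != 1:
        return None  # no marker, or conflicting markers: skip the tweet
    # Strip the marker hashtags so the label does not leak into the text.
    cleaned = HASHTAG_PATTERN.sub(
        lambda m: "" if m.group(0).lower() in HASHTAG_TO_EMOTION else m.group(0),
        tweet,
    )
    return " ".join(cleaned.split()), emotions.pop()
```

Tweets with no marker or with conflicting markers are discarded in this sketch, which trades corpus size for somewhat cleaner labels.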

3.6.1.1 News data set (SemEval-2007)

A collection of 1250 emotional news headlines compiled to evaluate the connection between emotions and lexical semantics at the SemEval-07 workshop [89]. Each headline is provided with emotion ratings in the range [-100, 100] for the Ekman basic emotions. We used this data set for emotion classification, taking the highest rated emotion of each headline as its class label. Table 3.6 (columns 2 and 3) shows the distribution of emotion classes in the training and test sets. The data set is comparatively small, with a considerably skewed class distribution. We are particularly interested in how the generative DSEL-based features compare to baseline features; the small size combined with the skewed distribution makes this an interesting data set for that comparison.
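The labelling rule described above (highest rated emotion becomes the class label) can be sketched as follows. The ratings in the usage example are made up for illustration, and the tie-breaking rule (first emotion in the fixed order wins) is an assumption, since the text does not say how ties are resolved.

```python
EKMAN = ("anger", "disgust", "fear", "joy", "sadness", "surprise")


def headline_label(ratings):
    """Return the emotion with the highest rating for one headline.

    `ratings` maps each Ekman emotion to its score. On a tie, the emotion
    that comes first in EKMAN is returned (an assumption of this sketch).
    """
    return max(EKMAN, key=lambda emotion: ratings[emotion])


# Hypothetical ratings for a single headline.
example = {"anger": 10, "disgust": 0, "fear": 62, "joy": 0, "sadness": 35, "surprise": 12}
label = headline_label(example)
```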

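| Emotion  | News (SemEval-07) # Training | News (SemEval-07) # Test | Twitter # Training | Twitter # Test | Blogs # Training | Blogs # Test | Incident Reports # Training | Incident Reports # Test |
|----------|------|------|-------|------|-----|-----|-----|-----|
| Anger    | 67   | 23   | 57310 | 6496 | 140 | 36  | 816 | 204 |
| Disgust  | 35   | 20   | -     | -    | -   | -   | 815 | 203 |
| Fear     | 155  | 33   | 12592 | 1548 | 91  | 41  | 815 | 204 |
| Joy      | 358  | 75   | 73098 | 8235 | 416 | 69  | 815 | 204 |
| Sadness  | 201  | 61   | 62611 | 7069 | 136 | 57  | 815 | 204 |
| Surprise | 184  | 38   | -     | -    | 91  | 16  | -   | -   |
| Love     | -    | -    | 30117 | 3464 | -   | -   | -   | -   |
| Guilt    | -    | -    | -     | -    | -   | -   | 815 | 204 |
| Shame    | -    | -    | -     | -    | -   | -   | 816 | 203 |

Table 3.6: Emotion Datasets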

3.6.1.2 Twitter Dataset

A collection of 0.28 million emotional tweets crawled from the Twitter search API using tweet identification numbers provided by [28]. The emotion labels in this data set correspond to Parrott's primary emotions [23]. We used this data set for emotion classification (stratified 10-fold cross validation). Table 3.6 (columns 4 and 5) shows the average distribution of the emotion classes over the 10 folds. As is evident from the table, not all emotions are equally represented in this data set: emotions such as joy and sadness are more common than others such as fear and surprise. It is therefore interesting to see how the different methods compare in performance under such class imbalance.
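For readers unfamiliar with stratified cross validation, the following is a minimal, library-free sketch of how folds can be built so that each one roughly preserves the class proportions. The function names and the round-robin assignment are illustrative choices, not a description of the tooling used in the experiments.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each instance index to one of k folds, class by class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def train_test_splits(labels, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    folds = stratified_folds(labels, k, seed)
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]
```

With an imbalanced label list, every test fold in this sketch still contains a share of the rare classes, which is the property that makes stratification useful for skewed data sets such as this one.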

3.6.1.3 Blog Dataset

A collection of 5500 blog sentences annotated with Ekman basic emotions by 3 annotators, with an average inter-annotator agreement (kappa) of 0.76 [30]. We used this data set for document classification using stratified 5-fold cross validation (rather than 10-fold, owing to the smaller size of the data set). Table 3.6 (columns 6 and 7) shows the average distribution of emotion classes over the folds. The class distribution is highly skewed towards joy. Furthermore, the small size of the data set is likely to make it hard to model weakly represented emotions such as fear and surprise.
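To make the agreement figure concrete, the sketch below computes Cohen's kappa for a pair of annotators and averages it over all annotator pairs. Which kappa variant and averaging scheme were used for the reported value is not stated here, so pairwise Cohen's kappa is an assumption of this example.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equally long label sequences.

    Undefined (division by zero) when chance agreement is 1, i.e. when
    both annotators always use the same single label.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each annotator's label distribution.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


def average_pairwise_kappa(annotations):
    """Mean Cohen's kappa over all pairs of annotators."""
    pairs = list(combinations(annotations, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)
```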


3.6.1.4 Incident reports data set (ISEAR)

A collection of 7000 incident reports obtained from an international survey on emotion reactions. Each report is an emotion summary describing the situation which led the participant to experience one of 7 emotions: anger, disgust, fear, shame, guilt, joy and sadness. We conducted a stratified 5-fold cross validation experiment on this data set. Table 3.6 (columns 8 and 9) shows the average distribution of emotion classes over the 5 folds. Unlike the other data sets, the emotion classes here have a near-uniform distribution, which is very unlikely in a real-world sample. It is also interesting to observe how well closely related emotions such as shame and guilt can be differentiated in the classification task.

3.6.1.5 Emotion event Dataset

A collection of 200 tweets describing emotional events [90], labelled with the Ekman basic emotions. Each event is annotated with a ranked list of emotions by two annotators, with an agreement (kappa) of 0.68. We used this data set to test the quality of the lexicons on the emotion ranking task. Since this data set is very small, we used a lexicon learnt on the Twitter data set, as both data sets are crawled from Twitter. This can also be viewed as a test of how well lexicons transfer to different content of a similar genre.

3.6.2 Sentiment Analysis Datasets

In this section we describe the benchmark Twitter data sets available for experimental evaluation of sentiment analysis algorithms.

3.6.2.1 S140 Dataset

A collection of 1.6 million (0.8 million positive and 0.8 million negative) sentiment-bearing tweets collected by [18] using the Twitter API. The data set also contains a collection of 359 (182 positive and 177 negative) manually annotated tweets.

3.6.2.2 SemEval-2013 Dataset

A collection of 3430 (2587 positive and 843 negative) tweets hand-labelled for sentiment using Amazon Mechanical Turk [91]. Unlike the S140 test data, the class distribution here is highly skewed. It is therefore a greater challenge to transfer the lexicons learnt on the emotion corpus, and those learnt on the S140 training corpus, to sentiment classification on this data set.

3.6.2.3 SemEval-2015 Dataset

A collection of 1315 words/phrases extracted from sentiment-bearing tweets, hand-labelled with sentiment intensity scores [92]. A higher score indicates greater positivity, and the words/phrases are arranged in decreasing order of positivity. We used this data set to validate the performance of different lexicons in ranking words/phrases for sentiment.
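One way to compare a lexicon-induced ranking against the gold ranking is a rank correlation coefficient. The text does not name the metric used, so the sketch below assumes Kendall's tau (the tau-a variant, in which tied pairs count as neither concordant nor discordant). The example terms and scores are made up for illustration.

```python
from itertools import combinations


def kendall_tau(gold, predicted):
    """Kendall's tau-a between two score dicts defined over the same terms."""
    terms = list(gold)
    concordant = discordant = 0
    for s, t in combinations(terms, 2):
        # Positive product: both rankings order the pair the same way.
        product = (gold[s] - gold[t]) * (predicted[s] - predicted[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    total_pairs = len(terms) * (len(terms) - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical gold intensities and lexicon scores for three terms.
gold = {"great": 0.9, "okay": 0.5, "bad": 0.1}
lexicon = {"great": 0.4, "okay": 0.6, "bad": 0.0}
tau = kendall_tau(gold, lexicon)
```

A value of 1 means the lexicon orders every pair of terms as the gold standard does, and a value of -1 means the order is fully reversed.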