6. Energy Poverty (E.P.)
6.1. Aid for Tackling E.P.
Extensive research has examined how social media data can be used for predictive purposes. For example, Chung and Mustafaraj [2011] noted that Twitter has been used to predict box office revenue and stock market performance. Twitter has also become a vital campaigning tool for politicians [Tumasjan et al. 2010]. Within this context, researchers have investigated whether Twitter can be used to predict election outcomes [Chrzanowski and Levick 2012]. Tumasjan et al. [2010] investigated whether tweets could be used to predict the popularity of political parties in the 2009 German federal elections. Similarly, Chung and Mustafaraj [2011] used tweets to predict the outcome of the 2010 United States Senate special election in Massachusetts, and Chrzanowski and Levick [2012] used Twitter to predict the outcome of the 2012 United States presidential election.
The purpose of the study by Tumasjan et al. [2010] was to determine whether political tweets could be used to predict the popularity of the political parties that participated in the 2009 German federal elections. Prior to the elections, Tumasjan et al. [2010] collected 104 003 tweets posted between 13 August 2009 and 19 September 2009. Six German political parties were mentioned in the tweets, namely the CDU, CSU, SPD, FDP, B90/Die Grünen and Die Linke. Prominent politicians were also mentioned. The collected tweets were posted in German and then translated into English. In order to determine whether Twitter could be used to predict election results, the share of tweets mentioning each political party was compared to that party’s election result. The result of this comparison can be seen in Table 2.2.
Table 2.2: Comparison of total number of tweets and election results for each German political party, taken from Tumasjan et al. [2010]

Party    Number of tweets    Share of Twitter traffic    Election result    Prediction error
CDU      30 886              30.1%                       29.0%              1.1%
CSU      5 748               5.6%                        6.9%               1.3%
SPD      27 356              26.6%                       24.5%              2.2%
FDP      17 737              17.3%                       15.5%              1.7%
Linke    12 686              12.4%                       12.7%              0.3%
Grüne    8 250               8.0%                        11.4%              3.3%

It can be seen that the percentage of tweets that mention a political party is similar to the percentage of the votes received by that party. Tumasjan et al. [2010] also calculated and obtained a value of 1.65% for the Mean Absolute Error (MAE), a measure of forecast accuracy. This value was then compared to results from various election polls. Tumasjan et al. [2010] showed that using the number of tweets to predict election results achieved similar accuracy to that obtained in traditional election polls.
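The arithmetic behind Table 2.2 can be checked with a minimal Python sketch. It assumes that the share of Twitter traffic is each party's tweet count divided by the sum of the six party counts (an assumption that reproduces the printed shares), and it takes the prediction errors exactly as printed in the table rather than recomputing them, because the rounded columns do not reproduce every error exactly (for the SPD, 26.6 minus 24.5 gives 2.1, not 2.2). The function and variable names are illustrative and do not come from Tumasjan et al. [2010].

```python
# Tweet counts and prediction errors copied from Table 2.2.
tweet_counts = {
    "CDU": 30886, "CSU": 5748, "SPD": 27356,
    "FDP": 17737, "Linke": 12686, "Grüne": 8250,
}
reported_errors = {  # percentage points, as printed in the table
    "CDU": 1.1, "CSU": 1.3, "SPD": 2.2,
    "FDP": 1.7, "Linke": 0.3, "Grüne": 3.3,
}


def traffic_shares(counts):
    """Return each party's share of all party mentions, in percent.

    Assumption: the denominator is the sum of the six party counts,
    not the 104 003 tweets collected in total.
    """
    total = sum(counts.values())
    return {party: 100.0 * n / total for party, n in counts.items()}


def mean_absolute_error(errors):
    """Average the absolute per-party errors (percentage points)."""
    errors = list(errors)
    return sum(abs(e) for e in errors) / len(errors)


shares = traffic_shares(tweet_counts)
mae = mean_absolute_error(reported_errors.values())

for party, share in shares.items():
    print(f"{party}: {share:.1f}% of party mentions")
print(f"MAE: {mae:.2f} percentage points")
```

Averaging the six printed errors gives 9.9 / 6 = 1.65, which matches the MAE reported in the study.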
The study by Chung and Mustafaraj [2011] followed an approach similar to that of Tumasjan et al. [2010] to predict the results of the 2010 United States Senate special election, held in Massachusetts. Chung and Mustafaraj [2011] used the Twitter API to collect tweets mentioning either Martha Coakley or Scott Brown, the two candidates in the election. A total of 23 467 tweets were collected from 13 January to 20 January 2010. The tweets were preprocessed by removing hashtags, user names, URLs and emoticons. The study showed that 53.86% of the tweets mentioned Martha Coakley and 46.14% mentioned Scott Brown. However, this did not correspond to the election results: Scott Brown won with 52% of the votes, compared with Martha Coakley’s 48%. This result prompted Chung and Mustafaraj [2011] to argue for tweet sentiment to be taken into account, since a tweet that mentions a candidate may express opposition rather than support. Chung and Mustafaraj [2011] used an unsupervised method, in the form of a SentiWordNet-based classifier, to classify the sentiment of tweets. This approach achieved an accuracy of 69.03%. According to Chung and Mustafaraj [2011], the accuracy of the sentiment classifier was not reliable enough to predict the outcome of the election.
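The preprocessing step described above (removing hashtags, user names, URLs and emoticons) might look roughly like the following Python sketch. The regular expressions are assumptions made for illustration: the study does not state its exact rules, the emoticon pattern covers only a few common Western emoticons, and the sketch removes a hashtag together with its text rather than only the `#` symbol.

```python
import re

# Illustrative patterns only; not the rules used by Chung and Mustafaraj [2011].
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# Matches a few common emoticons such as :) ;-( =D when they stand alone.
EMOTICON_RE = re.compile(r"(?<!\w)[:;=][-']?[)(DPp](?!\w)")


def preprocess(tweet):
    """Strip URLs, user names, hashtags and simple emoticons from a tweet."""
    # URLs go first so that the "://" inside them is never seen by later patterns.
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE, EMOTICON_RE):
        tweet = pattern.sub(" ", tweet)
    # Collapse the whitespace left behind by the removals.
    return " ".join(tweet.split())


print(preprocess("Go Scott Brown! :) #masen @user http://t.co/abc"))
# prints: Go Scott Brown!
```

Whether hashtag text should be kept as an ordinary word is a design choice: it may carry sentiment, but it also adds noise to a lexicon-based classifier.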
In Chrzanowski and Levick [2012] the aim was to predict how a Twitter user would vote in the 2012 United States presidential election, which involved two candidates: Barack Obama and Mitt Romney. Training data was obtained by identifying users who had publicly endorsed one of the candidates. If a tweet originated from an Obama supporter, it was labelled as a vote for him. Likewise, tweets originating from Romney’s supporters were considered to be votes in his favour. The tweets were stemmed and all numbers were removed. A total of 7.5 million tweets were collected, with 50% coming from Obama supporters and 50% from Romney supporters. The tweets were split as follows: 80% were used for training and 20% for testing. The training set was used to train an SVM classifier. The resulting model achieved an accuracy of 94% in predicting which of the two candidates a user would vote for.
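The labelling and 80/20 split described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the user names, tweets and helper functions are hypothetical, the split is made per tweet (the summary does not say whether the original split was per tweet or per user), and the SVM training step is left out because it would need an external machine learning library.

```python
import random


def label_tweets(tweets_by_user, endorsements):
    """Give every tweet the label of the candidate its author endorsed."""
    labelled = []
    for user, tweets in tweets_by_user.items():
        candidate = endorsements.get(user)
        if candidate is None:
            continue  # users without a public endorsement are skipped
        labelled.extend((text, candidate) for text in tweets)
    return labelled


def split_train_test(examples, train_fraction=0.8, seed=0):
    """Shuffle the examples and split them into training and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatability
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy data: two endorsing users and one without an endorsement.
tweets_by_user = {
    "user_a": [f"obama tweet {i}" for i in range(5)],
    "user_b": [f"romney tweet {i}" for i in range(5)],
    "user_c": ["undecided tweet"],
}
endorsements = {"user_a": "Obama", "user_b": "Romney"}

labelled = label_tweets(tweets_by_user, endorsements)
train, test = split_train_test(labelled)
print(len(train), "training tweets,", len(test), "test tweets")
```

Splitting per tweet can place tweets from the same user in both sets, which may inflate the measured accuracy; splitting per user avoids this at the cost of a less even split.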
In this research I investigated whether a relationship existed between the number of tweets that mention a particular political party and the number of votes obtained by that party during the 2014 South African general election. In performing this task, I adopted the approach described in Tumasjan et al. [2010] and Chung and Mustafaraj [2011], considering both the number of political tweets posted by users and the sentiment of those tweets.