4.3.1 Data
To analyze senders’ social sharing activities under different conditions of identity disclosure, I analyze data from Spiegel Online, the most popular German online magazine, using proprietary software¹². The data on sharing activities on Facebook again come from the large, still-ongoing project (Schiller et al. 2016). For each article published between November 1st and December 28th, 2013, the number of Likes on Facebook and the number of comments on Spiegel Online’s discussion board were collected two weeks after publication. Users on the discussion board of Spiegel Online can use pseudonyms for their profiles, so I assume that social sharing there takes place under the condition of anonymity.
Overall, the data set amounts to 3,740 articles. For the analysis, I excluded pictures without text (like “Picture of the day”), series of pictures, video contributions, content related to jokes and comics, and articles that were no longer available for further textual analysis, such as live streams and live tickers. Further, I analyze only content that made it onto the top most-read ranking list. As discussed in Section 2.6.2, interest moderates social sharing processes. To eliminate the influence of this variable, I assume that all articles that made it onto the top most-read ranking are roughly equal in their interest level. After these exclusions, the final data set contains 1,150 articles.
4.3.2 Measurement of controversy
As discussed in Section 2.6, controversy is a construct that is not easy to measure. For this study, I considered three alternative measurements. First, one could hire human coders and ask them to evaluate on a Likert scale (Likert 1932) how controversial some content is, as done, e.g., in Chen and Berger (2013). This measurement is, however, hardly applicable to large data sets. As an alternative to the hand-coded measurement, I consider two proxies.
12 I thank Benjamin Schiller, Simon Moselewski, Sebastian Kliehm and Patrick Felka for their excellent support in data collection.
Figure 10. Structure of comments on Spiegel Online
The first proxy is the number of search results for the topic of the article plus the word “controversy” in German. For that purpose, I collected the keywords of the articles from the Spiegel Online website, e.g., “Edward Snowden”, “Cats”, or “Weather”, as a description of each article’s topic. I then implemented a proprietary application that uses the Bing Application Programming Interface (API) to sum up the number of search results returned for each “Keyword + controversial” pair. This approach proved to lack reliability, as for some keywords the results were fuzzy and implausible. As pointed out by Chen and Berger (2013), “controversy is on the eye of the beholder” and could be highly subjective, depending on the cultural and personal background and experiences of the conversation partner. This could be the reason why this measure of controversy did not perform well.
Finally, I decided to use a measure similar to the one presented by Gómez et al. (2008), who propose to measure how controversial a conversation is by looking at the number of threads in the discussions on Slashdot. I implemented this idea using web crawlers that collect the comments for each article on the Spiegel Online website and count how often users comment on someone else’s post. Figure 10 shows an example of the structure of comments on Spiegel Online: the user “senta1958” is quoting or replying to the post by “agua”. I assume that the higher the number of replies to other users’ posts, the more controversial the conversation topic. Usually, users quote previous posts if they disagree with the opinions or do not share the attitudes of other users.
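The reply-counting idea can be illustrated with a minimal Python sketch. It is not the crawler used in this study. The `Comment` structure, the `reply_to` field and the function names are hypothetical, and excluding replies to one's own post is an assumption made here to match the phrase "someone else's post".

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Comment:
    author: str
    # Hypothetical field: author of the quoted post, None if the comment
    # does not quote or reply to anyone.
    reply_to: Optional[str] = None


def count_replies(comments: Sequence[Comment]) -> int:
    """Count comments that quote or reply to another user's post."""
    return sum(
        1
        for c in comments
        if c.reply_to is not None and c.reply_to != c.author  # assumption: ignore self-replies
    )


def reply_share(comments: Sequence[Comment]) -> float:
    """Share of replies in all comments of one article (0.0 if there are none)."""
    if not comments:
        return 0.0
    return count_replies(comments) / len(comments)


# Toy input modeled on Figure 10; "user3" is a made-up placeholder.
example = [
    Comment("agua"),
    Comment("senta1958", reply_to="agua"),
    Comment("user3"),
]
```

Under these assumptions, `count_replies(example)` corresponds to the raw number of replies, while `reply_share(example)` relates that number to the total number of comments on the article.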
4.3.3 Additional controls
As discussed in Section 2.6, there are other factors that influence social sharing. Therefore, I control for the content’s positivity, emotionality, and the emotional dimensions of anger, sadness, and anxiety. I calculate these variables using LIWC, an automated tool for sentiment analysis (Pennebaker et al. 2007). Positivity measures the difference between the shares of positive and negative words. Emotionality pertains to the share of positive and negative words. Web crawlers captured the section in which an article was published, e.g., politics, business, or science. Finally, I control for the number of words and the number of images, as they can make the content more appealing, a factor that could ultimately drive social sharing (Dellarocas et al. 2015; Strufe 2010).
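To make the two sentiment variables concrete, the following Python sketch computes them for a short text. It is an illustration only: the word sets are toy stand-ins for the LIWC dictionaries, the tokenization is simplified, and reading emotionality as the sum of the positive and negative shares is an assumption.

```python
import re

# Toy word lists for illustration only; they are NOT the LIWC dictionaries.
POSITIVE = {"good", "great", "happy"}
NEGATIVE = {"bad", "sad", "angry"}


def word_shares(text: str) -> tuple[float, float]:
    """Return the shares of positive and negative words among all words."""
    words = re.findall(r"[a-zäöüß]+", text.lower())
    if not words:
        return 0.0, 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos / len(words), neg / len(words)


def positivity(text: str) -> float:
    """Difference between the shares of positive and negative words."""
    pos, neg = word_shares(text)
    return pos - neg


def emotionality(text: str) -> float:
    """Assumed here: sum of the shares of positive and negative words."""
    pos, neg = word_shares(text)
    return pos + neg


sample = "good news and great hope but bad weather"
```

For `sample`, two of the eight words are in the positive set and one is in the negative set, so positivity is the difference of the two shares and emotionality is their sum.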
To examine whether some content is more commented on, more “liked”, or both, I employ a variable that measures the ratio of the number of comments to the number of Likes. Before calculating this variable, I normalize the number of comments and the number of Likes by dividing each value by the respective highest observed value, to make the two measures comparable. The ratio of comments to Likes equals one if the content attracts a comparable amount of social sharing on the discussion board and on Facebook. It takes on values greater than one if, relative to the respective maximum, sharing activity is higher on the discussion board than on Facebook.
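The normalization and the ratio can be sketched in a few lines of Python. The function name and the input numbers are hypothetical. Returning `None` for articles without Likes is an assumption, since the text does not say how such cases are handled.

```python
from typing import List, Optional, Sequence


def comments_to_likes_ratio(
    comments: Sequence[int], likes: Sequence[int]
) -> List[Optional[float]]:
    """Ratio of max-normalized comments to max-normalized Likes per article."""
    max_comments = max(comments)
    max_likes = max(likes)
    if max_comments == 0 or max_likes == 0:
        raise ValueError("normalization needs a positive maximum in both series")

    ratios: List[Optional[float]] = []
    for n_comments, n_likes in zip(comments, likes):
        norm_comments = n_comments / max_comments  # value in [0, 1]
        norm_likes = n_likes / max_likes           # value in [0, 1]
        # Assumption: the ratio is undefined for articles without Likes.
        ratios.append(norm_comments / norm_likes if norm_likes > 0 else None)
    return ratios


# Made-up counts for three articles, for illustration only.
ratios = comments_to_likes_ratio(comments=[100, 50, 80], likes=[400, 200, 100])
```

In this toy input, the first two articles reach the same relative level on both channels, so their ratio is one, whereas the third article is relatively more active on the discussion board and receives a ratio above one.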
Further, I standardize the variables that measure the positivity and emotionality of the content as well as the degree to which it evokes the emotions of anger, anxiety, and sadness. Table 22 provides a brief description of the variables and the data sources.
Variables | Source | Description
#Likes | Captured by web crawler | Number of Likes an article received in two weeks after publishing
#Comments | Captured by web crawler | Number of comments an article received in two weeks after publishing
Ratio #Comments to #Likes | Calculated from the data set | Ratio of comments to Likes
Controversy | Captured by web crawler | Approximated by the share of threads in the number of comments
Positivity | Calculation based on the results of sentiment analysis (LIWC) | Difference between the share of positive and negative words in the article
Emotionality | Calculation based on the results of sentiment analysis (LIWC) | The share of positive and negative words in the article
Anger | LIWC | The share of words in the article related to anger
Anxiety | LIWC | The share of words in the article related to anxiety
Sadness | LIWC | The share of words in the article related to sadness
Section Dummies | Captured by web crawler | e.g. 1 = article appeared in science section; 0 = otherwise
Article length | LIWC | Number of words in the article
# images | Captured by web crawler | Number of pictures in the article
Day t | Calculation from the data set | Calendar date
Table 22. Model variables
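The standardization of the sentiment variables mentioned before Table 22 can be illustrated with a short Python sketch. It assumes a conventional z-score, i.e., subtracting the mean and dividing by the sample standard deviation. The text does not state which variant was used, and the function name is hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence


def standardize(values: Sequence[float]) -> List[float]:
    """Z-score: subtract the mean, divide by the sample standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # assumption: sample (n - 1) standard deviation
    if sigma == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mu) / sigma for v in values]


# Made-up values for illustration only.
z_scores = standardize([1.0, 2.0, 3.0])
```

After this transformation each variable has a mean of zero and a standard deviation of one, so that coefficients of the sentiment variables can be compared on a common scale.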