
Allport and Odbert’s (1936) finding that language encodes individual differences in humans established the basis for long-standing research into the relationship between personality traits and language use or linguistic cues. Their lexical hypothesis laid the foundation for the Big Five personality inventory, which has been used many times in research on language and personality and which has become the gold standard in personality research (Mairesse et al., 2007). The Big Five trait taxonomy will be discussed in more detail below. Researchers such as Gottschalk and Gleser (1969) and Weintraub (1989) built on those early studies by investigating how psychological states can be assessed through content analysis and how verbal behavior relates to personality. While assessments of linguistic style were carried out on several levels, including the word level, morphology, and syntax, two paradigms were prevalent in psychological text analysis: (1) either the psychoanalytic orientation, which requires trained raters to assess individual clauses of a sentence (Gottschalk and Gleser, 1969), or Weintraub’s (e.g., 1989) approach, in which he compared medical diagnoses with 15 general categories into which he ‘inserted’ coded words and phrases; and (2) a word-based counting system. The latter paradigm is based on the assumption that “individuals [who] are verbally expressing sadness […] would be more likely to use words such as sad, cry, loss, or alone” (Pennebaker & King, 1999, p. 1297).
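The word-based counting paradigm can be illustrated with a minimal sketch. The category names and word lists below are hypothetical toy examples chosen for illustration (the sadness words echo the quotation above); they are not the LIWC dictionary, and the function name `category_rates` is an assumption of this sketch rather than part of any existing tool.

```python
import re
from collections import Counter

# Hypothetical toy categories for illustration only; NOT the LIWC dictionary.
TOY_CATEGORIES = {
    "sadness": {"sad", "cry", "loss", "alone"},
    "first_person_singular": {"i", "me", "my", "mine"},
}


def category_rates(text, categories=TOY_CATEGORIES):
    """Return each category's share of all word tokens, in percent."""
    # Lowercase the text and split it into simple alphabetic tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    total = len(tokens)
    # Express counts relative to text length so that texts of
    # different sizes can be compared.
    return {
        name: (100.0 * counts[name] / total if total else 0.0)
        for name in categories
    }
```

Calling `category_rates("I feel sad and alone")` would count two of the five tokens toward the toy sadness category and one toward first person singular. Reporting percentages instead of raw counts is a design choice of this sketch that keeps short and long texts comparable.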

2.4.1 Automated Content Analysis and Linguistic Inquiry and Word Count

With the advent of faster and cheaper personal computers in the mid-nineties, studies that included automated word recognition increased in number. Gottschalk, Stein, and Shapiro (1997) first used automated content analysis of speech for the diagnostic processes in a psychiatric outpatient clinic. While this is only a tangent for the present study, it is worth mentioning as it belongs in the general timeline of research focusing on the relationship between content (language) and psychology. Research quickly took off from there. Pennebaker and King (1999) focused on language use as an individual difference and ‘revolutionized’ the field of language and personality interaction with their software program, Linguistic Inquiry and Word Count (LIWC, which will be introduced in the Methodology, Chapter 3, Section 3.3.1). The program was used for the first time in their study and included over 2,000 words coded in 74 different word categories encompassing linguistic dimensions, such as function words and grammatical categories, as well as psychological factors, such as affective, cognitive, and social processes. As will be seen below, it has been used in many studies on the subject since (Pennebaker, Booth, et al., 2007). Among Pennebaker and King’s (1999) most notable findings were a negative correlation between openness and the Immediacy LIWC factor (1st person singular, words longer than six letters, present tense, and discrepancies), and negative correlations between the Making Distinctions LIWC factor (exclusive, tentativity, negations, inclusive) and both extraversion and conscientiousness. In terms of individual word categories, neuroticism was positively correlated with negative emotion words and, conversely, negatively correlated with positive emotion words. Extraversion was positively correlated with positive emotion words and total social references. Agreeableness was positively correlated with positive emotion words and negatively correlated with negative emotion words. As for other demographic variables, Pennebaker and King’s (1999) most notable finding was that a high score on the Immediacy dimension was consistently related to young females with lower SAT scores and exam grades, and whose parents had lower levels of education.
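Findings of this kind relate a per-participant word category rate to a per-participant trait score. The following is a minimal sketch of that computation, assuming a plain Pearson correlation; the function name `pearson_r` and all numbers in the usage section are invented placeholders for illustration, not data or results from any of the cited studies.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented toy values: one trait score and one category rate
    # (percent of words) per hypothetical participant.
    trait_scores = [2.1, 3.4, 4.0, 2.8, 3.9]
    category_rate = [1.0, 2.2, 3.1, 1.5, 2.7]
    print(pearson_r(trait_scores, category_rate))
```

A positive coefficient would correspond to statements such as "neuroticism was positively correlated with negative emotion words", a negative one to the converse. In practice a significance test and a correction for the many categories tested would also be needed, which this sketch omits.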

2.4.2 Research Related to Language and Personality

2.4.2.1 Early Research

It is not surprising that a lot of research in the field coincided with the skyrocketing number of online presences, websites, blogs, and budding social media sites (Amichai-Hamburger & Ben-Artzi, 2000; Amichai-Hamburger, Wainapel, & Fox, 2002). These studies, however, are now over ten years old and predate both Twitter and Facebook. This is important to keep in mind, as Twitter, being a hybrid genre, might have a different impact on online behavior.

Gill, Oberlander, and Austin (2006), for example, investigated personality in emails at zero-acquaintance.14 Mairesse, Walker, Mehl, and Moore (2007) claim that, at that point, there had only been two other studies which used automatic recognition of personality in language data, which is why they focused on linguistic cues for automated recognition of personality in conversation and text. A major shortcoming of their study is that both genres, written and spoken, were obtained in a laboratory setting, which does not aptly represent natural language data. It is such shortcomings that Yarkoni (2010) sought to address in his study on personality and language, in which he extended the analysis beyond the category level and also investigated the relationship between personality and individual words. To counteract the shortcomings of previous studies, such as written samples from laboratory settings, directed writing tasks, short time spans of data collection, and small sample sizes, Yarkoni (2010) used online blogs for his analysis, which represent a valid written genre and natural language data. Using 66 of the 74 LIWC categories (excluding non-semantic words), his results aligned closely with previous research: Neuroticism was found to correlate positively with negative emotion words (e.g., anxiety/fear) and total negative emotions. Agreeableness, on the other hand, was positively correlated with positive emotion words and words revolving around social communality (e.g., 1st person plural, family, friends), while at the same time being negatively correlated with negative emotion words and swear words. Diverging from previous research, Yarkoni (2010) found extraversion to be negatively correlated with word categories revolving around goal orientation and work-related achievement (this seems odd and counterintuitive and has, to my knowledge, not been replicated in other studies) and positively correlated with words reflecting social settings or experiences (e.g., bar, restaurant, drinking, dancing). Openness was found to be strongly positively correlated with words associated with intellectual or cultural experiences (e.g., poet, culture, narrative, art). He also found agreeableness to be positively correlated with sexual words (Yarkoni, 2010). His study coincided with one of the first US-German studies on the subject of personality and language (Back et al., 2010).

14 Zero-acquaintance personality judgment: the rater makes a judgment about a participant [email] with no prior interaction (Cleeton & Knight, 1924).

Küfner, Back, Nestler, and Egloff (2010) looked into the relationship between personality and creative writing (again, convenience sampling served as a guided data collection procedure in a laboratory setting). They had participants (German male and female university students between the ages of 18 and 45) write short stories based on target words and fill out the Big Five questionnaire. The writing samples were then analyzed with LIWC. Among the first studies that involved social media was Back et al.’s (2010) study, which looked into personality and the social media profiles of US and German users. Centering their study around the contrast between the idealized virtual-identity hypothesis and the extended real-life hypothesis, they found that users’ actual personality is reflected in their online behavior and that users do not feel obliged to create a self-idealized online personality, which is in alignment with the extended real-life hypothesis. Innovative in their approach, they administered the novel short version of the Big Five inventory (BFI-10) to German users and the Ten Item Personality Inventory (TIPI) to US users (Gosling et al., 2003; Rammstedt & John, 2007). Furthermore, they used StudiVZ and SchülerVZ, two German social media sites, as the German data source even though Facebook was already available in Germany. This might make their data susceptible to skew, and their reasoning behind not using German Facebook users remains unclear (Back et al., 2010). Continued research has involved more and more social media. Golbeck, Robles, and Turner’s (2011) study, for example, bridged the gap between social media and personality research by using data from Facebook and taking a more fine-grained approach than Back et al. (2010). Investigations into text messaging as a function of personality traits (Holtgraves, 2011) revealed (and confirmed) significant correlations between several LIWC categories and extraversion (e.g., personal pronouns), agreeableness (e.g., positive emotion words), and neuroticism (e.g., negative emotion words). Interestingly, Holtgraves (2011) also found that linguistic alterations, such as abbreviations, vary according to personality traits and relationship status.

2.4.2.2 Recent Studies on Language and Personality

A more recent study comes from Qiu, Lin, Ramsay, and Yang (2012), which, to my knowledge, was the first study using tweets. They used three different sampling methods: snowball sampling, on-campus recruitment, and Amazon’s Mechanical Turk. They analyzed a total of 28,978 tweets collected over a period of one month. While they claim that the sample size is comparable to previous studies, they call for extended research, longer sampling time, more participants, and research in a language other than English to discover possible cross-cultural differences. They found that extraversion was significantly correlated with positive emotion words and social process words, while at the same time being negatively correlated with the use of articles, supporting Pennebaker and King’s (1999) findings. This seems to point toward extraverts’ craving for social attention and a preference for reduced linguistic complexity. They report agreeableness to be negatively correlated with negation words (to be expected and previously found in online blogs; Nowson, 2006), swear words, and negative emotion words. Further, Qiu et al. (2012) were also able to replicate Yarkoni’s (2010) findings (openness is negatively correlated with second-person pronouns, assent words, and positive emotion words in blogs) and Mairesse et al.’s (2007) finding that openness is negatively correlated with past tense verbs in daily language use. Their own, non-replicated findings indicate that extraversion was tied to a higher use of assent words, lower use of function words overall, and fewer impersonal pronouns, while openness was positively correlated with the use of prepositions and negatively correlated with the use of adverbs, swear words, and affect words (overall). Participants scoring higher on the neuroticism scale made greater use of negative emotion words and, conversely, used fewer positive emotion words. More agreeable people used fewer exclusive words and fewer sexual words (Qiu et al., 2012). The fact that they were able to replicate many of the correlations, positive or negative, indicates that there is a level of consistency between personality factors and language online and offline.

2.5 The Big Five Inventory