• No se han encontrado resultados

Capítulo 3: Legislación y Jurisprudencia comparada

3.1 Argentina:

The style in which a post is written plays a pivotal role in understanding its credibility. The desired property for a posting is to be objective and unbiased. In our model we use stylistic and affective features to assess a posting’s objectivity and quality.

Stylistic

Consider the following user posting in a health community:

Example III.4.1 “I heard Xanax can have pretty bad side-effects. You may have peeling of skin, and apparently some friend of mine told me you can develop ulcers in the lips also. If you take this medicine for a long time then you would probably develop a lot of other physical problems. Which of these did you experience ?”

This posting evokes a lot of uncertainty, and does not specifically point to the occurrence of any side effect from a first-hand experience. Note the usage of strong modals (depicting a high degree of uncertainty) “can”, “may”, “would”, the indefinite determiner “some”, the conditional “if”, the adverb of possibility “probably”, and the question particle “which”. Even the usage of

III.4. Model Components

Feature types Example values Feature types Example values

Strong modals might, could, can, would First person I, we, me, my, mine, us, our

Weak modals should, ought, need, shall Second person you, your, yours

Conditionals if Third person he, she, him, her, his, it, its

Negation no, not, neither, nor, never Question particles why, what, when, which

Inferential conj. therefore, thus, furthermore Adjectives correct, extreme, visible

Contrasting conj. until, despite, in spite Adverbs maybe, about, probably

Following conj. but, however, otherwise, yet Proper nouns Xanax, Zoloft, Depo

Definite det. the, this, that, those, these

Table III.1: Stylistic features.

too many named entities for drug and disease names can impact the credibility of a statement (refer the introductory ExampleIII.1.1).

Contrast the above posting with the following one :

Example III.4.2 “Depo is very dangerous as a birth control and has too many long term side- effects like reducing bone density. Hence, I will never recommend anyone using this as a birth control. Some women tolerate it well but those are the minority. Most women have horrible long lasting side-effects from it.”

This posting uses the inferential conjunction “hence” to draw conclusions from a previous argument, the definite determiners “this”, “those”, “the” and “most” to pinpoint entities and the highly certain weak modal “will”.

TableIII.1shows a set of linguistic features which we deem suitable for discriminating be- tween these two kinds of postings. Many of the features related to epistemic modality have been discussed in prior linguistic literature [Coates 1987,Westnet 2009] and features related to discourse coherence have also been employed in earlier computational work (e.g., [Mukherjee 2012,Wolf 2004]).

Affective

Each user has an affective state that depicts her attitude and emotions that are reflected in her postings. Note that a user’s affective state may change over time; so it is a property of postings, not of users per se. As an example, consider the following posting in a health community: Example III.4.3 “I’ve had chronic depression off and on since adolescence. In the past I’ve taken Paxil (made me anxious) and Zoloft (caused insomnia and stomach problems, but at least I was mellow ). I have been taking St. John’s Wort for a few months now, and it helps, but not enough. I wake up almost every morning feeling very sad and hopeless. As afternoon approaches I start to feel better, but there’s almost always at least a low level of depression throughout the day.” The high level of depression and negativity in the posting makes one wonder if the statements on drug side-effects are really credible. Contrast this posting to the following one:

Sample Affective Features

affection, antipathy, anxiousness, approval, compunction, confidence, contentment, coolness, creeps, depression, devotion, distress, downheartedness, eagerness, edginess, embarrassment, encouragement, favor, fit, fondness, guilt, harassment, humility, hysteria, ingratitude, insecu- rity, jitteriness, levity, levitygaiety, malice, misery, resignation, selfesteem, stupefaction, surprise, sympathy, togetherness, triumph, weight, wonder

Table III.2: Examples of affective features.

Example III.4.4 “A diagnosis of GAD (Generalized Anxiety Disorder) is made if you suffer from excessive anxiety or worry and have at least three symptoms including...If the symptoms above, touch a chord with you, do speak to your GP. There are effective treatments for GAD, and Cognitive Behavioural Therapy in particular can help you ...”

— where the user objectivity and positivity in the posting make it much more credible.

We use the WordNet-Affect lexicon [Strapparava 2004], where each word sense (WordNet synset) is mapped to one of 285 attributes of the affective feature space, like confusion, ambi- guity, hope, anticipation, hate. We do not perform word sense disambiguation (WSD), and instead simply take the most common sense of a word (which is generally a good heuristics for WSD). TableIII.2shows a sample of the affective features used in this work.

Bias and Subjectivity

A posting is supposed to be objective: writers should not convey their own opinions, feelings or prejudices in their postings. For example, a posting titled “Why do conservatives hate your children?” is not considered objective journalism in a news community. We use the following linguistic cues for detecting bias and subjectivity in user-written postings. A subset of these features has been earlier used in [Recasens 2013,Mukherjee 2014b].

Assertives: Assertive verbs (e.g., “claim”) complement and modify a proposition in a sentence. They capture the degree of certainty to which a proposition holds.

Factives: Factive verbs (e.g., “indicate”) pre-suppose the truth of a proposition in a sentence. Hedges: These are mitigating words (e.g., “may”) to soften the degree of commitment to a proposition.

Implicatives: These words trigger pre-supposition in an utterance. For example, usage of the word complicit indicates participation in an activity in an unlawful way.

Report verbs: These verbs (e.g., “argue”) are used to indicate the attitude towards the source, or report what someone said more accurately, rather than using just say and tell.

III.4. Model Components

Category Example Values #Count Category Example Values #Count

Bias Subjectivity

Assertives think, believe, sup-

pose, expect, imagine

66 Wiki Bias

Lexicon

apologetic, summer, advance, cornerstone,

354

Factives know, realize, regret,

forget, find out

27 Negative hypocricy, swindle,

unacceptable, worse

4783

Hedges postulates, felt, likely,

mainly, guess

100 Positive steadiest, enjoyed,

prominence, lucky

2006

Implicatives manage, remember,

bother, get, dare

32 Subj.

Clues

better, heckle, grisly, defeat, peevish

8221

Report claim, underscore,

alert, express, expect

181 Affective disgust, anxious, re-

volt, guilt, confident

2978

Table III.3: Subjectivity and bias features.

Discourse markers: These capture the degree of confidence, perspective, and certainty in the set of propositions made. For instance, strong modals (e.g., “could”), probabilistic adverbs (e.g., “maybe”), and conditionals (e.g., “if”) depict a high degree of uncertainty and hypothetical situations, whereas weak modals (e.g., “should”) and inferential conjunctions (e.g., “therefore”) depict certainty.

Subjectivity: We use a subjectivity lexicon2, a list of positive and negative opinionated words3, and an affective lexicon4to detect subjective clues in postings.

We additionally harness a lexicon of bias-inducing words extracted from the Wikipedia edit history from [Recasens 2013] exploiting its Neutral Point of View Policy to keep its postings “fairly, proportionately, and as far as possible without bias, all significant views that have been

published by reliable sources on a topic”.

Feature vector construction: For each stylistic feature type fiand each posting pj, we com-

pute the relative frequency of words of type fi occurring in pj, thus constructing a feature

vector FL(pj) = 〈f r eqi j= #(wor d s i n fi) / l eng t h(pj)〉.

We further aggregate these vectors over all postings pjby a user ukinto

FL(uk) = 〈 X pjb y uk #(w or d s i n fi) / X pjb y uk l eng t h(pj)〉. (III.1)

Since our model allows users to interact with other users, and give feedback (reviews/com- ments) on their postings — we also create feature vectors for the users’ reviews to capture whether the feedbacks are credible or biased by the users’ judgment. Consider the review rj ,k

written by user uk on a posting pj. For each such review, analogous to the per-posting stylistic

feature vector 〈FL(pj)〉, we construct a per-review feature vector 〈FL(rj ,k)〉. 2http://mpqa.cs.pitt.edu/lexicons/subj_lexicon/

3http://www.cs.uic.edu/ liub/FBS/opinion-lexicon-English.rar 4http://wndomains.fbk.eu/wnaffect.html