Octavio Sales Calvo - Síntesis del hardware


As we have seen, conversation analysis is very much the prevailing approach to the study of interactions in the media and it is well suited to this purpose. Recently corpus linguistics (CL) has also been used in this area, and we propose that it has a lot to offer the study of media discourse, especially when used in tandem with existing models such as CA, DA and pragmatics. Aijmer and Altenberg (1991: 1) describe corpus linguistics as ‘the study of language on the basis of text corpora’. CL has developed rapidly since the 1960s, largely due to the advent of computers and especially their capacity to store and process masses of data. This has facilitated the systematic analysis of large amounts of language, and in turn this has meant that descriptions (and prescriptions) about the English language have frequently been contradicted by corpus linguists who work with representative samples of naturally-occurring language (Holmes 1988; Baynham 1991; Boxer and Pickering 1995; Kettermann 1995; Baynham 1996; Carter 1998; Hughes and McCarthy 1998; McCarthy 1998). Essentially a corpus is ‘a large and principled collection of [computerized] texts’ in spoken or written form (after Biber, Conrad and Reppen 1998: 4) which is available for analysis using corpus software packages (for further definitions see Renouf 1997; Sinclair 1997; Tognini-Bonelli 2001). Some debate exists as to whether CL is a theory or a method (see Tognini-Bonelli 2001) or indeed whether it is a new or separate branch of linguistics. As Kennedy (1998) notes, corpus-based research derives evidence from texts and so it differs from other approaches to language that depend on introspection for evidence. Increasingly, CL is being applied to contexts and domains outside the study of language itself where the use of language is the focus of empirical study in a given context.
Among the many fields where CL is being adopted to complement other methodological tools, such as discourse analysis and conversation analysis, are contexts such as courtrooms (including forensic linguistics, Cotterill 2003), the workplace (Koester 2000, 2006; McCarthy and Handford 2004), pedagogic and academic contexts (see Farr 2002, 2003; Walsh 2002; Swales 2002; O’Keeffe and Farr 2003; Walsh 2006), political discourse, advertising and the media (Carter and McCarthy 2002; Chang 2002; O’Keeffe 2002, 2003, 2005; Charteris-Black 2004).

In all of these cases CL offers a useful approach to the study of language, allowing for the quantification of recurring linguistic features to substantiate qualitative insights as well as the qualification of quantified findings. In the area of language and the media there has been a growing number of studies which draw on this approach. Coperías Aguilar and Besó (1999) conducted a corpus-based lexical study of Northern Ireland Republican and Unionist party websites (i.e. Sinn Féin and the Ulster Unionist Party) and they show systematic politicization of language in the data. O’Keeffe and Breen (2001) undertook an in-depth comparison of lexico-grammatical markers of stance in newspaper coverage of religious and non-religious child sexual abuse cases in a corpus of almost 700 Irish newspaper articles. Chang (2002) analyses pronoun use in a corpus of Cartalk, a US weekly phone-in on National Public Radio, involving two brothers, Ray and Tom Magliozzi, who answer questions about cars and car repair posed by listeners. She finds that the use of pronouns in the radio show closely parallels their distribution in a comparable corpus of casual conversation (see chapter 6 where we look at pronouns in detail). O’Keeffe looks at the discourse of an Irish radio phone-in programme using CL in tandem with CA, DA and pragmatics (O’Keeffe 2002, 2003, 2005). O’Keeffe (2002) focuses on socio-cultural indices of identity encoded in the language of radio phone-ins (see chapter 6). O’Keeffe (2003) uses a corpus-based methodology to examine vague language as a marker of shared knowledge (see chapter 6) in radio phone-in discourse, while O’Keeffe (2005) focuses on questioning in the same data (see chapter 4). McCarthy and O’Keeffe (2003) compare the use of vocatives in radio interactions and casual conversation (see chapter 5).

Using corpus linguistics with other approaches

As we have suggested, corpus linguistics offers a useful approach to the analysis of language in the media, allowing for the quantification of recurring linguistic features to substantiate qualitative insights and vice versa. As in the case of CA, the notion of using casual conversation as a comparative baseline works very well in corpus linguistics also. A number of corpora are available which can be used for comparative purposes. In chapters 4, 5 and 6 we will draw on a number of these as comparative reference points against which we can compare results from our media corpus. Corpus software facilitates analytical functions such as word frequency list generation, concordancing, and cluster analysis that are useful when looking at media interactions. Each of these is illustrated below.

Word frequency lists

Corpus software can calculate the word frequency list of a corpus of texts extremely quickly. By doing this, one can obtain a list of all the words in the entire collection of data in order of frequency. This function facilitates enquiry across language varieties and contexts of use. In table 3.2, for example, we compare the first ten words from three different corpora:

1 Friends chatting: a sub-corpus of the Limerick Corpus of Irish English (LCIE) comprising female friends chatting (40,000 words).

2 The Liveline corpus, a corpus of 44 radio phone-in calls to the Irish radio phone-in show Liveline, broadcast on the Irish public broadcasting station Radio Telefís Éireann, comprising 55,000 words.

3 The Australian Corpus of English (ACE), one million words of written Australian English including newspapers, fiction, reports, etc. (see Peters 2001; Hofland, Lindebjerg and Thunestvedt 1999).

Table 3.2 Comparison of word frequencies for the ten most frequent words across three datasets

Rank    1 Friends (LCIE)    2 Radio Liveline    3 ACE
order   Spoken              Spoken              Written

1       I                   the                 the
2       and                 I                   of
3       the                 and                 and
4       to                  to                  to
5       was                 you                 a
6       you                 that                in
7       it                  it                  is
8       like                a                   for
9       that                of                  that
10      he                  in                  was

Even from just the ten most frequent words of these three corpora, we can see that the friends chatting column (1) and the Liveline radio data column (2) have many items in common and, most notably, that tendencies emerge in terms of genres and contexts of use. The friends chatting column (1) shows a high frequency of I and you, the markers of interactivity typical of spoken English (see Carter and McCarthy 2006). In the list of ten most frequent words from the ACE column (3), we find a high frequency of the articles a and the, indicating a high instance of lexical noun phrases; the preposition of, suggesting post-modified noun phrases; and the prepositions to, for and in, suggesting prepositional phrases. We will use the word frequency list function for comparative purposes in the context of media interactions in chapters 5 and 6.
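The ranked comparison behind a table such as table 3.2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used to produce the table: the function names (tokenize, top_words, compare_corpora) and the two toy texts are hypothetical, and the tokenizer lowercases everything, so I appears as i.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its word tokens (apostrophes kept)."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def top_words(text, n=10):
    """Return the n most frequent words as (word, count) pairs."""
    return Counter(tokenize(text)).most_common(n)

def compare_corpora(corpora, n=10):
    """Map each corpus name to its list of top-n words in rank order."""
    return {name: [word for word, _ in top_words(text, n)]
            for name, text in corpora.items()}

# Toy stand-ins for a spoken and a written corpus (hypothetical data).
corpora = {
    "spoken": "I think you know I was there and you said I could go",
    "written": "The report of the committee on the use of the land was published",
}

ranked = compare_corpora(corpora, n=3)
for rank in range(3):
    # One row per rank, one column per corpus, as in a rank-order table.
    print(rank + 1, *(ranked[name][rank] for name in corpora))
```

Because the three corpora in table 3.2 differ greatly in size, a comparison of raw counts would need normalizing (for example, per 1,000 words); comparing rank order, as the table does, avoids that step.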

Concordancing

Concordancing is a core tool in corpus linguistics which allows for the qualitative examination of data. It involves using corpus software to find every occurrence of a particular word or phrase. With the aid of a computer, corpora of millions of words can be searched in seconds. The search word or phrase is often referred to as the node, and concordance lines are usually presented with the node word/phrase in the centre of the line with seven or eight words presented at either side. Concordance lines are usually scanned vertically at first glance, i.e. the analyst looks up or down the central pattern along the line of the node word or phrase. Figure 3.6 is from a concordance of the word now from the Liveline corpus of Irish radio phone-ins mentioned above.

Figure 3.6 Concordance lines of now using the Liveline corpus

             the best bye bye. NEW CALL  Now  yesterday ah one of our callers re
         Welcome back to the programme.  Now  recently am at the opening of the s
          so I’ve it was experience so.  Now  ah before I go to am the next pers
         u Marian. And likewise indeed.  Now  Jim. Bye bye. Good
                     Thank you NEW CALL  Now  we go on from weighty matters of s
               o us. IMMEDIATE NEW CALL  Now  Noel good afternoon to you.
                 Okay bye bye. NEW CALL  Now  Ciaran good afternoon to you. Hel
          you. Go=go=goodbye Josephine.  Now  Jim. Good afternoon Marian
                 (run on into) NEW CALL  Now  Matt. Good afternoon to you.
                               NEW CALL  Now  Mary good afternoon to you.
           NEW CALL immediate follow on  Now  Brian good afternoon to you.
  That was Nathalie Imbruglia and Torn.  Now  Emmet good afternoon to you.
         That’s REM Losing my Religion.  Now  Geraldine good afternoon to you.
         Welcome back to the programme.  Now  Richard good afternoon to you.
  oster and Allan and a Bunch of Thyme.  Now  Geraldine good afternoon.
             n this village. NEW CALLER  Now  Tess hello there. Hello Ma
                n to you. Hello Marian.  Now  we were talking yesterday ah about
d be the the solution to her problem’’.  Now  Catherine you didn’t like that did
            bye bye bye bye. NEW CALLER  Now  Josephine you want to reassure Bre

Here one can see quite readily that now frequently occurs at the start of a call and that the pattern now + vocative is very common. Used in this way, now is functioning as a discourse marker, and there is also evidence that the pattern now + time reference is systematic in call openings. We use concordance lines throughout chapters 5 and 6 to examine how lexical and lexico-grammatical items pattern within media interactions.
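A keyword-in-context (KWIC) display of the kind shown in figure 3.6 can be sketched as follows. This is a minimal illustration, not the corpus software used for the figure: the function names (normalize, kwic, format_kwic) and the sample string are hypothetical, tokens are split on whitespace only, and the window is counted in words on each side of the node.

```python
import string

def normalize(token):
    """Lowercase a token and strip surrounding punctuation for matching."""
    return token.strip(string.punctuation).lower()

def kwic(tokens, node, window=7):
    """Return (left, node, right) triples for every occurrence of node.

    node may be a single word or a multi-word phrase given as one string.
    """
    node_tokens = [normalize(t) for t in node.split()]
    k = len(node_tokens)
    lines = []
    for i in range(len(tokens) - k + 1):
        if [normalize(t) for t in tokens[i:i + k]] == node_tokens:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + k:i + k + window])
            lines.append((left, " ".join(tokens[i:i + k]), right))
    return lines

def format_kwic(lines):
    """Right-align the left context so the node forms a central column."""
    width = max((len(left) for left, _, _ in lines), default=0)
    return [f"{left:>{width}}  {node}  {right}" for left, node, right in lines]

# Toy sample (hypothetical), loosely modelled on radio call openings.
sample = ("Okay bye bye. Now Ciaran good afternoon to you. "
          "Hello Marian. Now we were talking yesterday")
tokens = sample.split()
lines = kwic(tokens, "now", window=3)
for line in format_kwic(lines):
    print(line)
```

Right-aligning the left context is what makes the vertical scan described above possible: the node lines up in one column, so recurring items immediately to its right, such as vocatives, stand out.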

Cluster analysis

As a corpus technique the process of generating cluster lists is similar to making single word lists. Instead of ranking all of the single words in the corpus in order of frequency, the most frequent combinations of words can be calculated, for example two-word (I mean), three-word (I don’t know), four-word (I don’t believe it), five-word (you know what I mean), or six-word (at the end of the day) combinations. Using WordSmith Tools (Scott 1998), table 3.3 gives us the 20 most frequent four-word combinations from the media corpus that we have assembled.

Table 3.3 Four-word chunks from media corpus

1 thank you very much
2 a lot of people
3 the end of the
4 at the end of
5 at the same time
6 and I think that
7 good afternoon to you
8 good morning to you
9 weapons of mass destruction
10 one of the things
11 I was going to
12 you very much indeed
13 a lot of the
14 in the United States
15 are you going to
16 do you want to
17 I don’t want to
18 it seems to me
19 I have to say
20 no no no no

Multi-word list searches generate all statistically frequent combinations of words but many of these are syntactic fragments and so do not constitute complete syntactic units such as phrases or clauses in a conventional sense, for example from the above list I was going to and a lot of the. As noted by McCarthy and Carter (2002) these items should not be totally dismissed as their high frequency is an indicator of grammatical patterning which is linked to ways of interacting in speech and in writing. Hopper (1998) notes that these fragments give important clues as to how interaction unfolds and how grammar is emergent rather than being pre-existent in interaction. For example, 36 per cent of the concordance lines of the fragment I was going to in the media corpus that we have assembled are used as metadiscoursal items relating to the interview situation such as I was going to ask/say.

Figure 3.7 Concordance lines of I was going to from media corpus

Kristi Yamaguchi, limbering up.  I was going to  ask you how long do you like to limber up?
     rush and flurry. SM: Well,  I was going to  ask you what or who you blame for what’s
   what, he’s got a good point.  I was going to  come on and say that was live singing by
     just absolutely went boom.  I was going to  say all born, and then I said no, that
 have a good job.’ Darren: ‘OK,  I was going to  say how much would the ring be worth?
         ll be in Saudi Arabia.  I was going to  say as good as Saudi
               Well that’s what  I was going to  say those people you spoke to who went to
       .this is my .this is my.  I was going to  say this is my life, it’s not my life, but

Here we see that the key phrase can serve two discourse functions: (1) as a discourse marker and (2) as a downtoner or hedging device. McCarthy and Carter (2002) note in the context of casual conversation that these grammatically incomplete strings can be seen as ‘frames’ to which new, unpredictable content can be added and that they are best understood as pragmatic markers (that is, the different ways of creating speaker meanings in context, such as hedging), rather than syntactic or semantic ones. They argue that we are likely to find the reasons why many of the strings of words are so recurrent by seeing them as frames for pragmatic categories such as discourse marking, the preservation of face and the expression of politeness, acts of hedging and purposive vagueness.
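The cluster lists discussed in this section amount to counting every run of n consecutive words and ranking the results. The following Python sketch illustrates the idea under stated assumptions; it is not how WordSmith Tools computes its lists. The function names (tokenize, clusters) and the toy text are hypothetical, and in this simplified version clusters are allowed to run across sentence boundaries.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its word tokens (apostrophes kept)."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def clusters(tokens, n=4, top=20):
    """Return the `top` most frequent n-word combinations with their counts."""
    # zip over n staggered copies of the token list yields every n-word run.
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(" ".join(gram) for gram in grams)
    return counts.most_common(top)

# Toy text (hypothetical), not drawn from the media corpus.
text = ("Thank you very much indeed. Well thank you very much. "
        "I was going to ask you. I was going to say that.")

four_word = clusters(tokenize(text), n=4, top=3)
for phrase, count in four_word:
    print(count, phrase)
```

As the text notes, many of the strings such a count returns are syntactic fragments rather than complete phrases; the list is a starting point for concordancing each frequent string, not an analysis in itself.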
