Chapter 3. The untapped potential of cyanotoxin production. Laboratory vs Natural conditions
3.5 Conclusions
There are three main advantages to using corpus linguistic methods for stylistic analysis:
(i) Language features and patterns which would be difficult or impossible to spot through manual analysis, or by using intuition alone, can be highlighted using statistical processes (argued by, e.g., Fischer-Starcke 2009:494; Ho 2011:6-7; and McIntyre 2010:169).
(ii) Automating the identification of potential deviations in language is more systematic. It reduces the risk of subjectivity inherent in relying solely upon intuitive choices about what kinds of language to analyse in literary texts (as argued by, e.g., Ho 2011:6-7; Mahlberg 2009:48; O'Halloran 2007:228 and Stubbs 2005:22).
(iii) Corpus linguistic software tools facilitate the rapid counting and statistical comparison of words in much larger quantities of text than could feasibly be analysed by human effort alone (stated by, e.g., Leech and Short 2007:286 and McEnery et al. 2006:5-6).
Despite the above advantages of using corpus methods to investigate literary texts, there have been a number of criticisms (see Archer 2007; Ho 2011:9-11; Jeffries and McIntyre 2010:22, 181-182; Mahlberg 2009:48; Wynne 2006:225-226). These can be summarised as follows:
(i) Corpus investigations are the result of a subjective and circular process, since the output is pre-determined through finding language features which have already been chosen as interesting. The most well-known critic taking this stance is probably Fish (e.g. 1980, 1996, 2012).
(ii) Corpus methods risk reducing literature to a decontextualised, numerical list of language features (pointed out, for example, by van Peer 1989:302-305).
Responses to Fish have been made by Craig (2004:277-278), Stubbs (2005:6) and Witmore (2012), amongst others. More generally, stylisticians who use corpus linguistic methods argue that a systematic approach and careful interpretation of results can greatly reduce the risks of subjectivity and circularity (see for example Ho 2011:9-11 and Mahlberg 2009:48-49). In my study, the data-driven approach itself, discussed in the previous section, helps avoid circularity. This is because the computer identifies a limited set of potentially interesting results, on a statistical basis, which vastly constrains the element of personal choice about what to analyse qualitatively.
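As an illustration only, the following minimal sketch shows how a computer might produce such a statistically constrained list of candidate items. It assumes a log-likelihood (G2) keyness comparison between a target text and a reference corpus; the choice of statistic, the threshold of 6.63 (the chi-square critical value for p < 0.01 at one degree of freedom) and the function names `log_likelihood` and `keywords` are assumptions made for this example, not a description of the tools or settings used in this study.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """G2 for a word occurring a times in a target corpus of c tokens
    and b times in a reference corpus of d tokens."""
    # Expected frequencies if the word were equally common in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def keywords(target_tokens, reference_tokens, threshold=6.63):
    """Return (word, target_freq, reference_freq, G2) tuples for words that
    are relatively over-used in the target, sorted by descending G2.
    The threshold is an illustrative assumption."""
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    n_target = len(target_tokens)
    n_reference = len(reference_tokens)
    results = []
    for word, a in target.items():
        b = reference.get(word, 0)
        g2 = log_likelihood(a, b, n_target, n_reference)
        # Keep only words whose relative frequency is higher in the target.
        overused = a / n_target > b / n_reference
        if g2 >= threshold and overused:
            results.append((word, a, b, g2))
    results.sort(key=lambda item: item[3], reverse=True)
    return results
```

The point of the sketch is that the analyst does not choose which words appear in the output: every item above the threshold is listed, and qualitative analysis begins from that list.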
The decontextualisation of language features, when they are presented as output in computer-generated lists, can in fact be helpful at the start of a qualitative analysis. It lessens the potential for introspective choices about what to follow up, by presenting all quantitatively significant items (including those which may be from well-known speeches or characters), without their surrounding "contextual baggage"9.
However, as McEnery et al. (2006:7, citing Leech 1991:14) point out, human intuition is still required to analyse quantitative results in a useful way. It is necessary to distil the computer-generated results which are genuinely linguistically interesting from those which are not, and to demonstrate their value through careful qualitative analysis. This is argued by corpus stylisticians including Culpeper (2009:39-40), Ho (2011), Mahlberg (2009:62) and Semino and Short (2004). As Ho (2011:10-11) emphasises, "[q]uantification and statistics should always be utilized as a means rather than an end, to verify or refute our intuition-based analysis."
In the end, no methodological approach is perfect. I concur with the view of other corpus stylisticians who uphold the benefits that corpus linguistic methods bring to the investigation of literary texts, whilst acknowledging that it is an approach which complements (rather than replaces or supersedes) others, such as those in the literary critical discipline (see e.g. Hope 2010:386-387; Lambrou and Stockwell 2007:3; Mahlberg 2009:50; Semino and Short 2004:7-9). It is also worth noting that past criticisms have beneficially resulted in new attempts to strengthen corpus-based approaches to literary texts. As Jeffries and McIntyre argue, stylisticians have expanded theoretical bases and frameworks for analysis through which corpus results are analysed, now taking in cognitive approaches and "functional and discourse-analytical approaches to language" (2010:22). This helps ensure that the data are not simply extracted and then abstracted from the texts, but are instead interpreted with a close eye on the context and what may be going on between text and reader/audience.
9 I am grateful to Karen Donnelly for suggesting this useful descriptive term.
On the basis of the above discussions, using corpus linguistic methods in my study enables a much more objective, rigorous and systematic comparison of the language style of Shakespeare and a range of his peers than would be within my manual resources. My quantitative data, and the outcomes and conclusions drawn from a close and detailed qualitative analysis of them, are based on all the dialogue spoken by characters in the plays investigated (i.e. about 1,600,000 words of EModE dramatic dialogue; see 4.4), rather than on selected extracts and/or the speeches of just a few characters. Such selectivity is an inherent limitation faced by researchers using exclusively qualitative methods. My corpus stylistic study is concerned with the relationship between linguistic forms in EModE plays and the meanings these forms have in the construction of language styles of characters. Authorial styles are reflected in repeated choices of language forms with particular meanings and functions in:
• the construction of characters and plots;
• the creation of dramatic atmospheres (e.g. humour or suspense); and
• the packaging of dialogue in a way that also communicates a coherent story which the audience can understand and find engaging.
The route to finding the most potentially interesting stylistic features is the quantitatively significant linguistic items identified by the corpus tools, which I discuss further in the next section. First, however, I further clarify my approach by differentiating between "corpus stylistics" and "computational stylistics", as both involve the application of statistical methods to investigate literary texts, and there are some substantial existing studies of EModE plays in the computational stylistics area.