Several tools were used for corpus analysis in this study. Some run locally on a computer, while others are web-based. This section introduces each of them: RANGE, FREQUENCY, Concordance, AntConc, and the British National Corpus interface designed by Mark Davies of Brigham Young University, Utah, United States.
RANGE and FREQUENCY are both computer-based text analysis tools. They were programmed by Alex Heatley and designed by Paul Nation and Averil Coxhead of the School of Linguistics and Applied Language Studies, Victoria University of Wellington, New Zealand. RANGE is a program used to compare the vocabulary of texts.
This software, which calculates the range of vocabulary in texts, provides a range or distribution figure (how many texts the word occurs in), a headword frequency figure (the total number of times the actual headword type appears in all the texts), a family frequency figure (the total number of times the word and its family members occur in all the texts), and a frequency figure for each of the texts the word occurs in.
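The four figures described above can be illustrated with a minimal Python sketch. This is not RANGE's own code: the function name `range_figures`, the toy texts, and the choice to count range over the whole word family are assumptions made for illustration only.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def range_figures(texts, families):
    """Compute range, headword frequency, family frequency and per-text frequency.

    texts: dict mapping a text name to its raw text (toy data here).
    families: dict mapping a headword to a list of its family members.
    """
    counts = {name: Counter(tokenize(body)) for name, body in texts.items()}
    figures = {}
    for head, members in families.items():
        forms = {head, *members}
        # Frequency of the whole family in each individual text.
        per_text = {name: sum(c[f] for f in forms) for name, c in counts.items()}
        figures[head] = {
            # Assumption: range = number of texts containing any family member.
            "range": sum(1 for n in per_text.values() if n > 0),
            "headword_freq": sum(c[head] for c in counts.values()),
            "family_freq": sum(per_text.values()),
            "per_text": per_text,
        }
    return figures

texts = {
    "text1": "Aid was sent. The aided regions recovered.",
    "text2": "They kept aiding the town, and aid arrived daily.",
    "text3": "No help came.",
}
families = {"aid": ["aided", "aiding", "aids", "unaided"]}
result = range_figures(texts, families)
print(result["aid"])
```

In this toy example the family of aid occurs in two of the three texts, the headword itself occurs twice, and the family as a whole occurs four times.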
It can be used to find the coverage of a text by certain word lists, to create word lists based on frequency and range, and to discover shared and unique vocabulary in several pieces of writing. The program is free for everyone to use (downloadable from http://www.vuw.ac.nz/lals/staff/paul-nation/nation.aspx) and can operate on up to 32 different texts simultaneously.
The program serves several purposes. First, it can calculate the coverage of a text by word lists such as the General Service List (GSL) and the Academic Word List (AWL). It can also be used to create one's own word lists based on range and frequency. Most importantly, it can identify the vocabulary items that are common to, or distinctive of, several pieces of writing. The program is accompanied by three ready-made base lists. The first (BASEWRD1.txt) contains the 1,000 most frequent words of English, and the second (BASEWRD2.txt) contains the second 1,000 most frequent words. These first 2,000 words come from A General Service List of English Words by Michael West (Longman, London, 1953).
The third (BASEWRD3.txt) includes 570 word families from The Academic Word List by Coxhead (2000). All of these base lists include the base forms of words and their derived forms. For instance, the headword aid has the following family members: aided, aiding, aids, and unaided. As mentioned earlier, the program can also be used to create one's own word lists based on range and frequency.
Recently, Billuroglu and Neufeld (2005) created their own word list, the BNL (Billuroglu-Neufeld List), which contains the 2,709 most common words in English and can be used with RANGE.
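The idea of text coverage by base lists can be sketched in a few lines of Python. The sketch below is only an approximation of the principle, not RANGE itself: the function name `coverage` and the tiny word sets are hypothetical stand-ins for the real base list files, and a token is credited to the first list that contains it.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def coverage(text, base_lists):
    """Return the percentage of running words covered by each base list.

    base_lists: dict mapping a list name to a set of word forms
    (headwords plus family members). Dict order sets the priority.
    """
    tokens = tokenize(text)
    tally = Counter()
    for tok in tokens:
        for name, forms in base_lists.items():
            if tok in forms:
                tally[name] += 1
                break
        else:
            # No base list contains this token.
            tally["not in lists"] += 1
    total = len(tokens)
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in tally.items()}

# Toy lists for illustration only; the real base lists are far larger.
base_lists = {
    "list1": {"the", "of", "was", "and"},
    "list2": {"aid", "aided", "aiding", "aids", "unaided"},
}
cov = coverage("The aid was aided and analysed", base_lists)
print(cov)
```

Here three of the six tokens fall in the first list, two in the second, and one (analysed) in neither, so the first list covers 50% of the text.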
FREQUENCY is another program that runs on a text file (.txt) to make a frequency list of all the words in a single text. Unlike RANGE, it can process only one text at a time. The output is either an alphabetical list or a frequency-ordered list. It gives the rank order of the words, their raw frequency, and the cumulative percentage frequency.
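The output described for FREQUENCY (rank, raw frequency, cumulative percentage) can be approximated as follows. This is a minimal sketch, assuming a simple alphabetic tokenizer and alphabetical tie-breaking; the function name `frequency_list` is hypothetical and the program's actual tokenization rules may differ.

```python
import re
from collections import Counter

def frequency_list(text):
    """Return rows of (rank, word, raw frequency, cumulative percentage)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    # Sort by descending frequency, then alphabetically to break ties.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    rows = []
    running = 0
    for rank, (word, freq) in enumerate(ordered, start=1):
        running += freq
        rows.append((rank, word, freq, round(100.0 * running / total, 2)))
    return rows

rows = frequency_list("the cat saw the dog and the bird")
for row in rows:
    print(row)
```

The cumulative percentage of the last row is always 100, since by then every running word of the text has been counted.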
Concordance, designed by Rob Watt of the University of Dundee, Scotland, is a copyrighted computer-based program that can be used for evaluation purposes for 30 days; longer use requires a registration fee paid to the author. The program is intended for the close study of texts: each word can be seen in its context and located in the source text. Because words are presented with their contexts, all the uses of any word in a text or body of writing can be compared, and insight into meaning and usage can be gained. With this program, users can create word lists and word frequency lists; build full concordances for texts of any size; observe collocation counts for each word, up to four words to the left and right; view the concordance of any word in the source text by clicking on it; lemmatise a word list by grouping chosen words together; and make web concordances and publish them on the web. A major advantage of web concordances is that they are available to many users at the same time, which makes them ideal for class-based activity and student-centred learning. Users can locate every word in the source text and see how it is used within its context (http://www.concordancesoftware.co.uk/).
AntConc 3.2.1 was developed in 2007 by Laurence Anthony from the School of Science and Engineering of Waseda University in Tokyo, Japan. The earlier versions of this computer-based software started out as a simple concordancing program but gradually developed into a useful text analysis tool. The program can run under any Windows environment, including Win 98/Me/2000/NT and XP, and also on Macintosh OS X and Linux computers. AntConc 3.2.1 includes multiple text analysis tools, but only the ones utilized in this study are presented here: ‘Concordance’, ‘File View’, ‘Clusters’, and ‘Collocates’. The Concordance tool generates concordance lines, also called KWIC (key word in context) lines, from one or more target texts chosen by the user.
At any time, a target file can be viewed in its original form using the File View tool.
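To make the notion of a KWIC line concrete, the following Python sketch builds concordance lines for a search word. It is an illustration of the general technique, not AntConc's implementation; the function name `kwic` and the `width` parameter (words of context on each side) are assumptions.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, node word, right context) for each hit."""
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

sample = "The strong wind blew. A strong case was made for strong tea."
lines = kwic(sample, "strong", width=2)
for left, node, right in lines:
    # Right-align the left context so the node words line up in a column.
    print(f"{left:>15}  {node}  {right}")
```

Aligning the node word in a central column is what lets recurring patterns to its left and right be spotted at a glance.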
The Clusters tool is used to generate an ordered list of clusters that appear around a search term in the target files listed in the left frame of the main window. The clusters can be ordered by frequency or by the start or end of the word. A user can also select the minimum and maximum length (number of words) of each cluster and the minimum frequency of the clusters displayed. It is also possible to specify whether the search term always appears on the left or on the right of the cluster.
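The cluster options described above (minimum and maximum length, minimum frequency, and the position of the search term) can be sketched in Python as follows. This is a hedged illustration rather than AntConc's algorithm: the function name `clusters` and its parameter names are hypothetical, and only ordering by frequency is shown.

```python
import re
from collections import Counter

def clusters(text, term, min_len=2, max_len=3, min_freq=1, term_position="left"):
    """Count word clusters that start ("left") or end ("right") with term."""
    tokens = re.findall(r"[a-z']+", text.lower())
    term = term.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != term:
            continue
        for n in range(min_len, max_len + 1):
            if term_position == "left":
                gram = tokens[i:i + n]
            else:
                start = i - n + 1
                if start < 0:
                    continue
                gram = tokens[start:i + 1]
            # Skip clusters cut short by the end of the text.
            if len(gram) == n:
                counts[" ".join(gram)] += 1
    kept = [(c, f) for c, f in counts.items() if f >= min_freq]
    # Order by descending frequency, then alphabetically.
    return sorted(kept, key=lambda cf: (-cf[1], cf[0]))

sample = "a strong wind and a strong wind again but a strong case"
left_hits = clusters(sample, "strong", min_freq=2, term_position="left")
right_hits = clusters(sample, "strong", min_freq=2, term_position="right")
print(left_hits)
print(right_hits)
```

Raising `min_freq` filters out one-off sequences, which is usually what makes recurrent clusters stand out in a larger corpus.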
The Collocates tool is used to generate an ordered list of collocates that appear near a search term in the target files listed in the left frame of the main window. The collocates can be ordered by frequency, by frequency on the left or right of the search term, or by the start or end of the word. A user can also select the span of words to the left and right of the search term in which to find collocates, and the minimum frequency of the collocates displayed (http://www.antlab.sci.waseda.ac.jp/software/READMEantconc3.2.1.txt).
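Collocate counting within a span can likewise be sketched in Python. The example below is an assumption-laden illustration, not AntConc's code: the function name `collocates` and its parameters are hypothetical, and results are ordered by total frequency only.

```python
import re
from collections import Counter

def collocates(text, term, left_span=2, right_span=2, min_freq=1):
    """Return rows of (collocate, total, left frequency, right frequency)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    term = term.lower()
    left, right = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok != term:
            continue
        # Words inside the chosen span on each side of the search term.
        left.update(tokens[max(0, i - left_span):i])
        right.update(tokens[i + 1:i + 1 + right_span])
    rows = []
    for word in set(left) | set(right):
        total = left[word] + right[word]
        if total >= min_freq:
            rows.append((word, total, left[word], right[word]))
    # Order by descending total frequency, then alphabetically.
    return sorted(rows, key=lambda r: (-r[1], r[0]))

sample = "very strong tea, very strong coffee and strong tea"
rows = collocates(sample, "strong", left_span=1, right_span=1, min_freq=2)
print(rows)
```

Keeping left and right frequencies separate shows, for instance, that very occurs only before strong while tea occurs only after it.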
The British National Corpus interface (http://corpus.byu.edu/bnc/) developed by Mark Davies is a web-based tool that makes use of the 100-million-word BNC.
Words, phrases, lemmas (all forms of words, like sing or tall), wildcards (un*ly or r?n*), and more complex searches such as un-X-ed adjectives or verb + any word + a form of ground can be extracted from this tagged corpus. There are six macro registers on the interface, and the frequency of a word or phrase can be observed in all registers. Collocates can also be obtained through the interface. In addition, the use of a word can be compared across registers. For instance, it is possible to obtain information on the words and phrases which occur much more frequently in one register than another, such as -ness words in poetry, adjectives in tabloid newspapers, nouns in advertisements, or verbs in the slot ‘we * that’ in academic writing. Semantically-oriented searches can also be carried out. For instance, the most frequent nouns used with ‘small’ and ‘little’ can be compared. A very useful facility is finding the frequency and distribution of synonyms of a word, and comparing the frequency of synonyms in different registers. For example, the synonyms of ‘strong’ can be compared in the ‘academic’ and ‘news’ registers (Davies, 2004).
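The wildcard patterns mentioned above (un*ly, r?n*) can be emulated locally to show what they match. The sketch below uses Python's standard `fnmatch` module, in which `*` stands for any run of characters and `?` for exactly one character; it is an assumption that this mirrors the interface's wildcard behaviour, and the word list and function name `wildcard_search` are hypothetical.

```python
from fnmatch import fnmatchcase

def wildcard_search(words, pattern):
    """Return the sorted, lowercased words that match a wildcard pattern."""
    lowered = (w.lower() for w in words)
    return sorted({w for w in lowered if fnmatchcase(w, pattern)})

# Toy vocabulary for illustration only, not drawn from the BNC.
vocab = ["unlikely", "unusually", "under", "running", "ran", "rain", "only"]
un_ly = wildcard_search(vocab, "un*ly")
r_n = wildcard_search(vocab, "r?n*")
print(un_ly)
print(r_n)
```

The pattern r?n* matches ran and running but not rain, because the third letter must be n.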