The complete frequencies of the PVs with the particle UP are displayed in Appendix A,
in their lemma forms. The second and fourth columns represent the absolute
frequencies of each PV; the third and fifth columns show the normalised frequencies
per million words. A total of 1630 instances were found in CLEC and 363 in LOCNESS. In
terms of verb types, 101 and 80 types were employed in CLEC and LOCNESS
Type-token ratios (TTRs) will be calculated for the PVs of each particle group.
Although TTR is not a measure without problems (Granger & Wynne, 2000; Mollet,
Wray, Fitzpatrick, Wray, & Wright, 2010), when used carefully it remains one of the
simple and intuitive procedures commonly adopted in learner language studies
(for example, Cadierno, 2004). In the present investigation, TTR allows us to
evaluate the diversity of the particle groups across the native and non-native writer
groups.
The TTR used in this study is a modified version of the standard TTR because
not every individual word type and token will be analysed. The purpose is to compare
the proportions of PV types (lemmatised) on the basis of PV tokens in each particle
group between CLEC and LOCNESS. This gives the average number
of PV types per one hundred PV tokens used by the Chinese learners and
native speakers, respectively.
The TTRs were calculated by dividing the number of PV types by the number of
PV tokens and converting the result to a percentage. The TTRs for PVs with UP are
6.2% for CLEC (101 PV types divided by 1630 PV tokens, multiplied by 100)
and 22% for LOCNESS. A higher percentage of TTR indicates more
types.
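As an illustration only, the modified TTR described above can be sketched in a few lines of Python. The function name `pv_ttr` and the dictionary layout are hypothetical conveniences, not part of the study's procedure; the counts are those reported in the text for PVs with UP.

```python
def pv_ttr(n_types: int, n_tokens: int) -> float:
    """Return the number of PV types per 100 PV tokens (modified TTR)."""
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return 100.0 * n_types / n_tokens


# (PV types, PV tokens) for PVs with UP, as reported in the text.
up_counts = {"CLEC": (101, 1630), "LOCNESS": (80, 363)}

for corpus, (n_types, n_tokens) in up_counts.items():
    print(f"{corpus}: {pv_ttr(n_types, n_tokens):.1f}%")
# prints "CLEC: 6.2%" and "LOCNESS: 22.0%"
```

Because only lemmatised PV types and PV tokens enter the calculation, the figure is not comparable with a standard TTR computed over all words in a corpus.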
The most frequently used phrasal verbs are listed in Table 6.1.
Only those with relative frequencies over 90 are included, to exclude low-frequency
PVs which rank high but do not occur frequently enough. As seen from the table, all of
the top PVs differ between the two corpora except one, GIVE UP, which ranks 2nd and
3rd in CLEC and LOCNESS respectively. This PV is also found to be prominent in
the German and Italian learner corpora (sub-corpora from ICLE) reported by Waibel
(2007:92), who suggests that this could sometimes be a result of
‘topic sensitivity’. The question of whether the frequent presence of GIVE UP is
influenced by the article topics will not be pursued in this study, because some of the
titles are not available in CLEC, rendering the analysis of topic effects unfeasible.
Table 6.1: The top PVs in CLEC and LOCNESS
      CLEC                        LOCNESS
      Verb type   Abs.   Rel.     Verb type   Abs.   Rel.
1     GET         161    150      BRING       38     117
2     GIVE        157    147      END         30     93
3     USE         115    107      GIVE        30     93
4     MAKE        106    99       GROW        30     93
5     TAKE        101    94
6     SET         100    93
The comparison of high-frequency PVs tells us which items are widespread in
individual corpora; the device of ‘over-/under-representation’ can reveal which items
are prominent in one corpus when the other corpus is taken as the standard. The PVs
which differ most between the two corpora can be discerned by comparing the normalised
frequencies. All of the verbs were submitted to the Log-likelihood Ratio (LLR) test,
performed with the online calculation tool provided by Lancaster University. The LLR
was chosen as the means to analyse over-/under-representations on account of its
well-established theoretical basis for corpus comparison (see Rayson & Garside, 2000)
and the advantage that it can deal with the absence of data (i.e., when the frequency is
zero), along with taking corpus size into account. Table 6.2 gives the top five
over-represented and under-represented PVs and their Log-likelihood values. A higher
Table 6.2: The top five over-/under-represented PVs in CLEC and LOCNESS
Verb type   CLEC            LOCNESS         over-/under-       LL value
            Abs.   Rel.     Abs.   Rel.     representation
USE         115    107      0      0        +                  60.86
GET         161    150      4      12       +                  59.21
RISE        27     25       0      0        +                  14.29
TAKE        101    94       12     37       +                  11.97
KEEP        49     46       3      9        +                  11.75
BRING       15     14       38     117      -                  55.66
END         13     12       30     93       -                  41.71
BACK        1      1        8      25       -                  17.59
RUN         5      5        11     34       -                  14.87
OPEN        3      3        7      22       -                  9.79
Note: “+” means “over-represented” and “-” means “under-represented”.
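For readers who wish to see how a log-likelihood value of this kind is obtained, the following is a minimal sketch of the two-term log-likelihood statistic described by Rayson & Garside (2000). It is not the Lancaster tool itself. The corpus sizes are not stated in this section; the two constants below are rough back-calculations from the absolute and normalised frequencies in the tables and should be treated as assumptions, so the results will only approximate the published values. The function names are hypothetical.

```python
import math

# ASSUMPTION: approximate corpus sizes in words, back-calculated from the
# absolute and per-million frequencies in the tables; not given in the text.
ASSUMED_CLEC_SIZE = 1_070_000
ASSUMED_LOCNESS_SIZE = 324_000


def log_likelihood(freq_a: int, freq_b: int, size_a: int, size_b: int) -> float:
    """Two-term log-likelihood for one item observed in two corpora."""
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected frequencies if the item were spread evenly by corpus size.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2.0 * ll


def direction(freq_a: int, freq_b: int, size_a: int, size_b: int) -> str:
    """Return '+' if the item is relatively more frequent in corpus A, '-' if less."""
    rate_a, rate_b = freq_a / size_a, freq_b / size_b
    if rate_a == rate_b:
        return "="
    return "+" if rate_a > rate_b else "-"


# Absolute frequencies (CLEC, LOCNESS) taken from Table 6.2.
for verb, clec, locness in [("USE", 115, 0), ("BRING", 15, 38)]:
    sign = direction(clec, locness, ASSUMED_CLEC_SIZE, ASSUMED_LOCNESS_SIZE)
    ll = log_likelihood(clec, locness, ASSUMED_CLEC_SIZE, ASSUMED_LOCNESS_SIZE)
    print(f"{verb} UP: {sign} {ll:.2f}")
```

The check on zero counts is what allows the statistic to handle items absent from one corpus, and the expected frequencies are where corpus size enters the calculation.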
Three of the five over-represented PVs (USE UP, GET UP and TAKE UP) are also among the
most frequent items in CLEC. In fact, the six most frequently used PVs
mentioned earlier are all over-represented: besides these three items, the LL values (in
brackets) of GIVE UP (+5.92), MAKE UP (+1.33) and SET UP (+5.54) also suggest
that they are over-represented. The other two over-represented items, RISE UP and
KEEP UP, do not have high frequencies in the Chinese learner corpus (RISE UP occurs
the two corpora are large (the over-/under-representation measures the disparities). In
other words, although these two PVs do not occur very frequently in CLEC, LOCNESS
contains far fewer of them in relative terms. Among the group of
under-represented PVs, BRING UP and END UP are also the two most frequent PVs
in LOCNESS. Their large numbers of occurrences (normalised frequencies of 117
and 93) naturally render these two verbs under-represented in CLEC. The other three
items, BACK UP, RUN UP and OPEN UP, rarely occurred in CLEC but were used
fairly often by the native students. Comparing the two groups of
over-/under-represented items, we can see the influence of genre types. Verbs such as
GET, RISE and TAKE in the over-represented group are used to describe activities in daily
life, while BRING, END and BACK seem to be more closely related to argumentation, for
example: bring up an issue, end up with a result, etc.
It turns out that the investigation of over-/under-representation does not offer much
useful information for further analyses of PVs: because it is a relative
measure, over-represented items may have low frequencies, which would make
case studies rather difficult. Thus I will not examine the over-/under-represented PVs in