Table 3.11 presents an overview of the results of the statistical analyses relating to variables describing the temporal organisation of the three investigated function words. The cells corresponding to measures that were not observed for a given function word are crossed out, while the cells where the statistical tests did not find any significant effects on the observed measure are marked with “Ø”. In the following paragraphs, all the statistically significant results will be discussed.
The first variable that relates to the overall temporal properties of the speakers’ production is the articulation rate, which was measured (in syllables per second) in the speech stretches surrounding the observed function words. In our sample, articulation rates were significantly higher in items spoken by the native speakers than in those of either non-native group. The two non-native groups also differed from each other: the Norwegian speakers’ articulation rate was significantly higher than that of the Czech speakers. In addition, there was an effect of speaking style: articulation rates were on average higher in read speech in all three speaker groups. The absence of an interaction between speaking style and L1 background indicates that the differences between the production mechanisms associated with the two speaking styles manifest similarly in all three speaker groups.
Table 3.11: Overview of results of statistical analyses relating to variables describing the temporal organisation of the function words in, of and to.

Measure                      | IN           | OF                              | TO
Articulation rate            | Style ***, L1 *** (single entry across IN, OF and TO)
Normalised duration          | Ø            | Style *, L1 ***, L1:style       | L1 **
Vowel proportion             | Ø            | Style ***, L1 **, L1:style **   | L1 ***
Fricative voicing proportion | not observed | L1 ***, Style ***, L1:style     | not observed
Plosive release proportion   | not observed | not observed                    | Style ***, L1 ***
The consistently slower articulation rates in spontaneous speech may also be partly explained by the higher cognitive demands associated with the replication task used for the elicitation of spontaneous speech. Moreover, the articulation rate variability was higher in spontaneous speech. The slower articulation rates for both non-native groups are in agreement with a number of studies that investigated speech rate or matched sentence durations as measures of fluency in non-native productions. Overall, productions by non-native speakers or speakers with less experience in a given language (e.g. less experienced vs. more experienced L2 speakers, late vs. early bilinguals) were shown to have a lower speech rate or articulation rate, or longer durations of matched sentences, than those of native or more experienced speakers of the language (e.g. Riggenbach, 1991; Towell et al., 1996; Guion et al., 2000; Cucchiarini et al., 2002; MacKay and Flege, 2004; Kormos and Dénes, 2004; Trofimovich and Baker, 2006; Toivola et al., 2010; for details see the literature overview in Section 1.2.2). As mentioned in Section 2.2.1, the Czech and Norwegian speaker groups differed somewhat in the variables relating to the non-native speakers’ experience with English. Moreover, we could speculate about the overall lower exposure to spoken English in the Czech Republic as compared to Norway. It is possible that the slower articulation rates of the Czech speakers compared to those of the Norwegians in this data set are due to these differences in L2 experience, which would be consistent with previous research (cf. Towell et al., 1996; Guion et al., 2000; Cucchiarini et al., 2002; MacKay and Flege, 2004; Trofimovich and Baker, 2006). On the other hand, an evaluation of the non-native speakers’ fluency using considerably longer random samples of spontaneous speech showed a particularly large between-speaker variation within each of the speaker groups (for details see Section 2.2.2).
As to our results relating to the articulation rate differences between the two speaking styles, the higher articulation rate in read speech in our data is in accordance with previous studies using similar types of speech material (Hirschberg, 2000; Mixdorff and Pfitzinger, 2005; for details see the literature overview in Section 1.2.1.2). On the other hand, a number of studies have found the opposite tendency, i.e. higher articulation rates in spontaneous speech (cf. Section 1.2.1.2). We believe that the different tendencies in various studies can be explained by the low number of speakers in some of the studies, as well as by differences in the conditions under which the spontaneous speech was produced. Moreover, our results showing a greater variation coefficient of articulation rate in spontaneous speech are consistent with the study by Koopmans-Van Beinum (1992), which showed a larger speech rate range and variability in spontaneous speech.
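The two quantities compared above, the articulation rate and its variation coefficient, can be illustrated with a minimal sketch. The syllable counts and durations below are invented for illustration and are not taken from the data set; the function names are likewise hypothetical, and the variation coefficient is assumed to be the sample standard deviation divided by the mean.

```python
from statistics import mean, stdev


def articulation_rate(n_syllables, duration_s):
    """Syllables per second over a pause-free stretch of speech."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return n_syllables / duration_s


def variation_coefficient(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return stdev(values) / mean(values)


# Hypothetical (syllable count, duration in seconds) pairs, NOT thesis data.
read_stretches = [(12, 2.0), (10, 1.6), (14, 2.4)]
spontaneous_stretches = [(9, 2.0), (12, 1.8), (7, 2.1)]

read_rates = [articulation_rate(n, d) for n, d in read_stretches]
spont_rates = [articulation_rate(n, d) for n, d in spontaneous_stretches]

print("mean rate, read:       ", round(mean(read_rates), 2))
print("mean rate, spontaneous:", round(mean(spont_rates), 2))
print("variation coefficient, read:       ", round(variation_coefficient(read_rates), 3))
print("variation coefficient, spontaneous:", round(variation_coefficient(spont_rates), 3))
```

With these invented values, the read stretches have the higher mean rate and the spontaneous stretches the higher variation coefficient, which mirrors the direction of the reported effects only by construction.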
The next inspected temporal measure, normalised word duration, reports on the durational relation of the observed words to the surrounding speech rather than on the actually measured duration, which is significantly correlated with the articulation rate. The normalisation, described in detail in Section 3.3.1.1, removes the variance due to the articulation rate, and the obtained values represent a hypothetical duration that would be produced at the mean articulation rate. In tokens produced at high articulation rates, the normalised word durations were longer than the actually measured durations (“stretched”), while in items produced in slow speech the normalised word durations were shorter than the raw durations (“compressed”). The results showed that whereas in the word in no differences in normalised durations were found between the speaker groups with different L1 backgrounds, for the words of and to the normalised durations still differed between the speaker groups. More precisely, in the word of, the normalised durations were significantly longer in the items produced by the Czech speakers than in those produced by either the native speakers or the Norwegians. In the word to, on the other hand, the natives had significantly shorter normalised durations than both non-native groups. Regarding speaking style, there was no effect on the normalised durations of the words in and to, but for the word of, the mean normalised duration was shorter in spontaneous speech. Moreover, there was a significant interaction of L1 background and speaking style: while the Czech speakers’ read items were considerably longer than their spontaneous ones, the difference was smaller for the Norwegian speakers, and the native speakers even showed a slight tendency in the opposite direction.
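The idea behind such a normalisation can be sketched as follows. The actual procedure is the one described in Section 3.3.1.1, which is not reproduced here; the sketch merely assumes a least-squares line relating raw duration to articulation rate and shifts each token along that line to the mean rate. The function name and all values are hypothetical.

```python
from statistics import mean


def normalise_durations(durations_ms, rates):
    """Remove the linear effect of articulation rate from raw durations.

    Assumption (not the thesis's documented method): fit
    duration = a + b * rate by least squares, then move every token
    along the fitted slope to the mean articulation rate.
    """
    mean_rate = mean(rates)
    mean_dur = mean(durations_ms)
    sxx = sum((r - mean_rate) ** 2 for r in rates)
    if sxx == 0:
        raise ValueError("articulation rates must not all be equal")
    sxy = sum((r - mean_rate) * (d - mean_dur)
              for r, d in zip(rates, durations_ms))
    slope = sxy / sxx
    return [d - slope * (r - mean_rate) for d, r in zip(durations_ms, rates)]


# Hypothetical tokens: faster speech goes with shorter raw durations.
rates = [4.0, 5.0, 6.0, 7.0]          # syllables per second
raw = [130.0, 112.0, 98.0, 80.0]      # raw word durations in ms
normalised = normalise_durations(raw, rates)

for rate, before, after in zip(rates, raw, normalised):
    label = "stretched" if after > before else "compressed"
    print(f"rate {rate}: raw {before} ms, normalised {after:.1f} ms ({label})")
```

Under this assumption the token produced at the highest rate receives a normalised duration longer than its raw duration, and the token produced at the lowest rate a shorter one, while the normalised values are no longer linearly related to the articulation rate.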
As a first possible reason for the natives’ significantly shorter normalised durations of the word to, we should recall that a significant effect of phonetic voicing in the following segment on normalised durations was found, and that the distribution of voiced and voiceless segments following the word to differed between the speaker groups with different L1 backgrounds. In particular, the native speaker group had higher proportions of voiceless following contexts than both non-native groups. This was especially noticeable in spontaneous speech, due to the very low numbers of items with voiceless following segments in the two non-native groups (see Section 3.3.4.3 for more details). Moreover, an analysis of variance with the factors L1 background and speaking style, as well as preceding and following segment’s voicing, no longer found a significant effect of L1 background. Although the uneven distribution of right context voicing in the three speaker groups may partly explain the normalised duration differences, the mean values still vary noticeably after controlling for context voicing, even though the difference is no longer significant with the smaller group sizes. We can therefore still assume that the L1 background effect is not merely an artefact of the uneven distribution of voicing in the following segment.

In sum, the L1 effect on normalised word durations of the words of and to indicates that in these two words there may exist additional reduction mechanisms, beyond the influence of the articulation rate, that determine the word durations, and that these mechanisms are not fully mastered by non-native speakers. In the word in, on the other hand, any durational differences between the native and non-native productions may be attributed purely to the articulation rate. A possible explanation for this pattern of results would be the existence of weak forms of certain English function words (e.g. Jones et al., 2003: 589; Roach, 1983: 86-93). While weak forms are described for the words of and to in addition to the full (strong) forms, the preposition in is usually not included among such weak-form words (Jones et al., 2003: 271, 377, 539; Roach, 1983: 86-93). Although the tokens in the investigated sample did not include cases where a strong word form would be expected (as for example in clause-final position; see Section 3.2.2 for more details about the selected tokens), it may be assumed that non-native speakers have a lower awareness of weak form use and tend to use the strong forms in a wide range of contexts instead. This explanation would be consistent with the observed longer normalised durations of the weak-form words of and to in non-native as compared to native production.
The next measure relating to temporal organisation that was inspected was the vowel proportion within the observed words. It needs to be remembered that due to the different structures of the three observed words, this measure cannot be expected to show consistent tendencies across the three words. While we found no effect of the speakers’ L1 background on vowel proportion in the word in, there were significant differences in vowel proportion between the speaker groups in both of and to. In productions of the word of, it was the group of Norwegian speakers that differed from the rest of the speakers, having a higher vowel proportion. The vowel proportions of the natives and the group of Czech speakers did not differ significantly. In the word to, on the other hand, the productions of native speakers differed from those of both non-native groups by lower vowel proportions. In addition to the effect of L1, vowel proportions in the word of also varied depending on speaking style, the mean values being lower in read speech (45%) than in spontaneous speech (52%). This effect, moreover, interacted with the L1 background. While the natives showed a moderate effect, the effect was much larger for the Czech speakers. The group of Norwegian speakers, by contrast, showed a weak tendency in the opposite direction.
First, we should note that a significant effect of phonetic voicing in the following segment on vowel proportions in both of and to was found in previous analyses. Since we also confirmed that native speakers had a higher number of items with voiceless following segments than the non-native speaker groups in both of and to, we need to consider the possibility that the vowel proportion differences between the speaker groups might be partly due to this uneven context distribution. A closer inspection of the results shows, however, that the effect size of the L1 background is larger than the possibly confounding effect of right context voicing, and other explanations for the L1 differences in vowel proportions should be sought. There may be a simple explanation for the dissimilar patterns of L1 influence in the words of and to. The native-like vowel proportions of Czech speakers in the productions of the word of may result from their very long durations of fricatives (particularly in read speech). When we inspect raw durations of vowels in the three speaker groups, we find that it is in fact the native speaker group that has noticeably shorter vowel durations than both the Norwegian and Czech speakers. But since the Czech speaker group also produced unusually long fricatives, their overall vowel proportion does resemble that of the native speakers. The function word to does not offer such variability in the duration of the consonantal segment, and the values of vowel proportion thus reflect the actual temporal relations more directly. The main difference between the natives and both non-native groups was, as in the case of the word of, the natives’ shorter vowel duration. By contrast, no difference in vowel proportion between the native and non-native productions was found for the word in.

In sum, native productions of the words of and to tend to have shorter vowels, which is reflected in lower vowel proportions, as compared to non-native productions (with the exception of Czech productions of the word of in read speech, characterised by unusually long fricatives). We may speculate that this is the result of the non-natives’ inability to reduce vowels below a certain duration. The native speakers’ drastic reduction of the vocalic element in the word to may also be seen as a result of processes described as schwa absorption (Shockey, 2003: 22-26). In the case of voiceless stops, Shockey argues that the syllabic property of the schwa overlaps with the articulatory quality of the stop. Apparently, non-native speakers do not apply such processes to a comparable degree. In the case of the preposition in, on the other hand, we may assume that the structure of the word, or its lack of a weak form, makes it less prone to be affected by language-specific reduction processes. As for the speaking style effect on vowel proportion and the interaction of the L1 background with speaking style (found in the preposition of), we should again mention the unusually long fricatives in the Czech speakers’ read items. This may be one of the reasons for the seemingly strong effect of speaking style, as well as for the strong interaction. The higher vowel proportion in spontaneous speech may also be explained by the [v/∅] alternation that Shockey (2003: 34-35) describes as a connected speech process contributing to preserving a CV-type syllable structure. The weak form [ə] of the word of is typically realised in casual speech when the word is followed by a consonant.
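The reasoning about vowel proportion in the word of can be made concrete with a small worked example. The durations below are invented to illustrate the arithmetic only and are not measurements from the study; vowel proportion is assumed here to be the vowel duration divided by the summed duration of vowel and fricative.

```python
def vowel_proportion(vowel_ms, consonant_ms):
    """Share of the word duration taken up by the vowel (assumed definition)."""
    return vowel_ms / (vowel_ms + consonant_ms)


# Hypothetical mean segment durations in ms for the word "of", NOT thesis data.
groups = {
    "native":    {"vowel": 40, "fricative": 50},   # short vowel
    "Czech":     {"vowel": 60, "fricative": 90},   # longer vowel, very long fricative
    "Norwegian": {"vowel": 60, "fricative": 50},   # longer vowel, ordinary fricative
}

proportions = {
    name: vowel_proportion(seg["vowel"], seg["fricative"])
    for name, seg in groups.items()
}

for name, value in proportions.items():
    print(f"{name}: vowel proportion {value:.2f}")
```

In this illustration the hypothetical Czech and Norwegian vowels are equally long and both longer than the native vowel, yet only the Norwegian proportion stands out, because the long Czech fricative pulls the ratio back towards the native value.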
The proportion of voicing in the fricative of the preposition of proved, like the previous temporal measures in this word, to be affected by both L1 background and speaking style. A significant interaction of these two factors was found as well. As with the vowel proportion, the difference in the proportion of fricative voicing was found between the productions of the Norwegian speakers and those of the remaining two groups. The Norwegians produced fricatives with a significantly higher voicing proportion (85%) than both the native speakers (65%) and the Czech speakers (72%). The overall mean voicing proportion was 65% in read items and 83% in spontaneous speech. However, the size of this difference varied across the speaker groups with different L1 backgrounds. While both the natives and the Norwegian speakers showed a relatively small difference between the voicing proportions in read and spontaneous speech (62% vs. 68% for the natives, and 80% vs. 90% for the Norwegians), the speaking style effect in the Czech speaker group was very large: 56% vs. 90%.
Here too, it needs to be repeated that the very long durations of the Czech fricatives in read speech may be responsible for some of the differences. While the voicing proportion in the spontaneous Czech items was as high as the Norwegians’, the Czech read items had a mean voicing proportion lower than that of the natives. Since the mean duration of the phonetically voiced portion is comparable with the Norwegians’ values, the very low voicing proportion in the Czech read speech seems to be a mere consequence of longer fricative durations. On the other hand, the lower values of native speakers were not due to unusually long fricatives, but simply reflected the noticeably shorter mean duration of the voiced portions of the fricatives. In addition, there is another circumstance that we should take into consideration when explaining the differences in fricative voicing proportion among the groups based on the speakers’ L1 backgrounds. The distribution of voiced and voiceless following segments was uneven in the speaker groups with different L1 backgrounds, and since this context factor proved to have a strong influence on the amount of fricative voicing, the difference between the speaker groups may have been a result of this uneven context distribution (in particular a higher number of tokens with voiceless following context among the native speakers’ tokens; for more details see Section 3.3.3.3). This explanation was further supported by an analysis of variance with the factors L1 background and speaking style, as well as the preceding and following segment’s voicing. Here, the effect of the factor L1 background was no longer significant. It is, however, difficult to state with certainty whether the difference in fricative voicing proportion between the speaker groups with different L1 backgrounds was a mere artefact of the uneven distribution of voicing in the following segment. We have to be aware that the analysis of variance with four factors is weaker because the data are split into a much higher number of cells, each with a smaller number of observations.
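The loss of power in the four-factor analysis can be illustrated by counting design cells. The sketch assumes three L1 groups, two speaking styles and two levels (voiced vs. voiceless) for each of the two context factors; the token total is a hypothetical round number, and an even spread over cells is assumed purely for simplicity.

```python
from math import prod


def n_cells(levels_per_factor):
    """Number of cells in a fully crossed factorial design."""
    return prod(levels_per_factor.values())


# Assumed factor levels; the voicing factors are taken to be binary.
two_factor = {"L1": 3, "style": 2}
four_factor = {**two_factor, "preceding_voicing": 2, "following_voicing": 2}

n_tokens = 600  # hypothetical total, NOT the size of the actual data set

for label, design in (("two factors", two_factor), ("four factors", four_factor)):
    cells = n_cells(design)
    print(f"{label}: {cells} cells, about {n_tokens / cells:.0f} tokens per cell")
```

Adding the two binary context factors quadruples the number of cells, so each cell holds a quarter of the observations on average. Because the real distribution of context voicing was uneven across the groups, some cells would be smaller still.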
Apart from these incidental reasons, the differences in fricative voicing proportion between the speaker groups can be explained by the different phonological systems of English and of the native languages of the two non-native groups, in particular the phonetic properties associated with the phonological voicing contrast and the phonological processes related to voicing. Shockey (2003: 30-31) mentions that in English, a certain amount of phonetic devoicing is expected, since the phonological voicing contrast is signalled by other means such as preceding vowel length. The phonetic correlates of phonological voicing in Norwegian include, as in English, the duration of the segment and the duration of the preceding vowel, as well as the presence of aspiration in some positions (cf. Section 1.3.3). But since segment duration in Norwegian also functions as a cue to the phonological length of vowels, we may assume that its use to signal voicing is somewhat more limited. At the same time, no final devoicing is present in Norwegian, and therefore the (phonological) voicing distinction is maintained in word-final positions (e.g. Kristoffersen, 2000: 74-75; Husby and Kløve, 2001: 62). It should be pointed out, however, that no research has focussed specifically on the presence of phonetic voicing in word-final consonants. In contrast with English and Norwegian, the voicing distinction in Czech is mainly signalled by vocal fold vibration, although differences in the duration of voiceless and voiced sounds (in particular, longer durations of voiceless sounds) have also been observed (e.g. Palková, 1997; Machač and Skarnitzl, 2007). In addition, the phonological voicing contrast in Czech is neutralised in pre-pausal positions as well as in obstruent clusters. More details about voicing contrast and voicing assimilation in the studied languages can be found in Section 1.3.3.
The above-mentioned facts might indicate that whereas native English speakers may produce phonetically partly devoiced fricatives while signalling the phonological voicing by other means, and Czech speakers may fail to produce fully (phonetically) voiced sounds as a result of the transfer of rules applied in their native language, the Norwegians’ productions show much smaller amounts of devoicing, which is also in agreement with the relevant aspects of the phonological system of their native language. However, a study containing more material would be needed to better explain the results relating to this issue.
It is possible that the differences in voicing proportion in the two observed speaking styles are also partly due to the previously mentioned very long fricatives in the Czech read speech. As has been explained above, the long durations of fricatives may be the main reason for the very low voicing proportions in these items, which contributes to the overall speaking style effect, as well as to the significant interaction of the L1