Posibles elementos de un proyecto de recomendación sobre información digital

Anexo V. Informe del grupo de contacto 5 sobre información digital sobre secuencias de

A. Posibles elementos de un proyecto de recomendación sobre información digital

A transcription convention is required to enable the oral data to be accurately represented in a manuscript. Furthermore, the oral yielded data in research should be properly coded to mark presumed patterns in the principled rules (Mackey and Gass, 2005). The principled rules are then the transcription conventions.

Transcription conventions should match the objectives of the inquiry in the research. Producing an accurate transcription is a very time-consuming and tedious process. Due to the different research purposes in SLA, researchers will only transcribe the part which interests them; as such, transcriptions can be simple speeches or highly detailed linguistic representations with phonetic renderings of utterances (Bailey cited in Nunan, 1987; Mackey and Gass, 2005). In terms of my

118

study, it does not actually require a phonetic transcription system. However, as suggested, several aspects of the phonetic form were found to be potentially relevant to morpho-syntactic analyses and were then singled out (Pienemann cited in Pienemann and Kessler, 2011; Pienemann, n.d.-a).

In fact, there are no well-recognized conventions utilized in all studies. Instead, researchers usually develop different conventions for diverse studies to facilitate the oral data in a written format and to avoid variations in its use (Creswell, 1994; Mackey and Gass, 2005). Therefore, Pienemann (n.d.-a) has invented the transcription convention regarding the studies in PT, and I have adopted this transcription convention in my data transcription.

It is recommended by Pienemann that no standard punctuation is to be utilized in the utterances, since each language may have its own rules of punctuation which could indicate the pauses, sentence unit and intonations. In a piece of research, punctuation should be determined in the distributional analysis in order to 1) avoid the target language bias, and (2) to identify which parts of speech should be put together for one particular meaning (Pienemann, n.d.-a). However, one issue should be carefully avoided in the data transcription, namely punctuation which might suggest pauses where actually there was no pause in the speaker’s intention.

The transcribed text format should be very simple to comprehend and also easy to handle in the computer software. To put this point on a scale, I have transcribed the

data in a Chinese PinYin form. Besides, Pienemann (n.d.-a) has suggested that contextual information may be vital

for comprehension in the transcription. In the data analysis, contextual information and non-verbal characteristics are represented in double round brackets (()). Utterances which cannot be transcribed are placed inside the single round brackets (). Additionally, if the speech can be partly transcribed, it will be put inside the brackets, but the utterances that cannot be transcribed will be replaced by an X.

Also, the rising tone in a declarative may be a marker of a question (Pienemann, n.d.-a and n.d.-b). In this case, the rising tone in the transcription has been marked for

119 analysis within particular contexts.

As quoted from Pienemann’s transcription convention (Pienemann, n.d.), an equals sign ‘=’ indicates an incomplete utterance where simultaneous speech is present or where the speaker’s utterance is interrupted by the onset of the speech of another speaker. Furthermore, special emphasis on a syllable or word is marked by underlining. Colons following a letter indicate that the sound was lengthened. One colon represents approximately one second or less, such as ye:s. Square brackets [] indicate simultaneous speech, which is placed outside round brackets, where both are used together (Pienemann, n.d.-a).

Trott et al. (2004) stated that the important distinction in coding is to distinguish the specific occurrence of a structure from a distinguishable category of grammar, which is termed as token and type. Tokens represent different types. Types are elements in an account of what a set of tokens represent. A widely used measure of lexical richness is the type-token ratio which measures the ratio of the total number of words in a text and the number of different words in spontaneous data (Zhang, 2002a). Zhang (2002a and 2008) has claimed that the type-token ratio will obviously decline, aligned with the length of the interview. However, Pienemann and Kessler (2007: 11) have already explored that ‘various methods have been proposed to counteract the length effect. Daller et al. (2003) utilise the index of Giraud, which uses the square root of the tokens in the denominator’.

Besides, an utterance is defined as a unit which is ‘potentially complete as a relevant conversational action in its context’ (Liddicoat, 2001: 8). Additionally, the analysis excluded single words/phrases, verbatim repetitions, translations of previous utterances, interjections and fillers (ah, mmh).

In my research, over-generalizations are also included in the count if they express the target grammatical property; in this case, all the relatively correct functions are scored although the form may not be expressed as target-like.

After clarifying the transcription convention in the process of data analysis, the next step is to discuss the rationale and principles for data analysis.

120 5.5.2 Analysis Rationale

Once the research data have been collected with the aid of data collection procedures, the next phase is to analyze the data. In fact, data analysis indicates the process whereby the researcher manages the data so as to draw the appropriate research conclusions (Seliger and Shohamy, 1989). In my research, I have to consider that how much of the data should be selected to represent analyzed utterances, and how much are formulaic utterances based on the acquisition principle which will be discussed in the following section.

Onwuegbuzie and Leech (2005) illustrated seven stages of the mixed-methodological data analysis process, including data reduction, data display, data transformation, data correlation, data consolidation, data comparison, and data integration.

Data reduction aims to reduce the dimensionality and quantity of the data. It transforms the data into a more manageable form. Data display, the second stage, involves conveying the idea that data are presented to allow the conclusions to be correctly drawn (Berg, 2004; Miles and Huberman, 1994). Then, data transformation identifies which data are to be qualitized. Data correlation represented the correlation between the qualitative data and quantitative data. This is followed by data consolidation whereby both quantitative and qualitative data are combined to create consolidated data sets. Data comparison involves comparing data from the quantitative and qualitative data sources. Data integration integrates both qualitative and quantitative data into a coherent whole or separates them as two sets (Onwuegbuzie and Leech, 2005; Pienemann, n.d.-c).

The seven steps are woven into the whole data analysis procedure but basically, the main technique identified in analyzing qualitative/quantitative data is to ‘derive a set of categories for dealing with text segments from the text itself’ (Seliger and Shohamy, 1989: 205). It is to categorize the received data which is descriptive and exploratory in nature, and then the findings would be the discovery of new commonalities or patterns on the basis of the distributive analysis.

121

With my data, I firstly looked at the context for the expected morphosyntactic features selected to be investigated in this study, and where necessary, the contextual clue is marked for analysis. According to Bodomo and Luke (2003: 3), ‘LFG is quite powerful in describing linguistic constructions of Chinese which are of relative sophistication as shown in the Mandarin sentence’. LFG, which helps to receive a whole picture of learners’ linguistic profiling (Sun, 1999), is then implemented to illustrate the grammatical features and functions of the collected speech data to comply with the procedural skills.

On this basis, a quantified distributional analysis is carried out to check the presence or absence of the morphosyntactic structures in each data set. This procedure is used to compare with the hypothesized developmental sequence based on PT across languages (Mansouri, 2002; Di Biase, 2002; Pienemann, 1987; Zhang, 2001 and 2008). All the participants’ language production is therefore distributed according to the PT framework for outlining the developmental process. In addition, I scored the total number of sentences the individual participant produced with respect to implicational scaling. According to Pienemann and Kessler (2007), the analyzed grammatical patterns are then located into the proposed linguistic profiling stages. Ideally, the universal processing hierarchy should coincide with the sequence found in my studies.

However, in this process, a criterion is required to validate the collected grammatical features that are acquired by the L2 speakers. The criterion to judge whether the specific aspect of language is mastered by the L2 learners is the Emergence Criterion developed by Meisel, Clahsen and Pienemann (1981). Emergence Criterion used in PT can be explained as the fact that acquisition is achieved as a certain appearance of a form in an IL production. Details will be extensively explained in the following section.

In document INFORME DE LA TERCERA REUNIÓN (PARTE I) DEL GRUPO DE TRABAJO DE COMPOSICIÓN ABIERTA SOBRE EL MARCO MUNDIAL DE LA DIVERSIDAD BIOLÓGICA POSTERIOR A 2020 (página 187-191)