Evaluation of grade repetition and dropout


The original recordings of the interactions were downloaded directly from the official websites of CCTV or from published resources and converted into WAV audio files. Once the data had been collected and converted, they were transcribed using VoiceScribe, transcription software developed by the VOICE team. The VOICE transcription conventions were used in transcribing the ACE data, which is the source of the Chinese subset used for this study. The mark-up conventions of ACE include 24 notations, which are listed below as adapted from the ACE manual (Patkin, 2011):

- Speaker IDs: Each speaker appears in sequence with the first being referred to as S1, the next S2 and so on.

the word. A full stop is used after a word with a falling tone: “.”.

- Emphasis: Capital letters are used to show emphasis.

- Pauses: A half-second pause is marked as a full stop between two brackets “(.)”. A one-second pause is expressed as “(1)”, a two-second pause as “(2)” and so on.

- Overlaps: Angled brackets are used to show more than one person speaking at the same time. Examples of overlaps can be seen below.

- Other continuation: An “=” symbolises an immediate continuation by another speaker.

- Lengthening: Lengthened utterances are symbolised with a colon.

- Repetitions: All recorded repetitions are represented in the corpus.

- Word fragments: A hyphen is used to symbolise incomplete parts of a word.

- Laughter: It is represented by the “@” symbol. Each @ is the equivalent of one syllable of laughter.

- Uncertain transcription: Uncertain utterances are transcribed within brackets.

- Pronunciation variation and coinages: Words that cannot be found in the Oxford Advanced Learner’s Dictionary fall into the pronunciation variations and coinages category, marked as PVC.

- Onomatopoeic noises: Sounds other than words that are used to describe something are represented by IPA symbols.

- Non-English speech: Utterances in other languages are written in the style of the native language, followed by an English translation if possible. For example, <L1zh> marks an utterance in the speaker’s first language, Chinese.

- Spelling out: If a word is spelt out or is pronounced using individual letters, then it is marked with “spel”.

- Speaking modes: The pace, pitch, tone, style and method of speaking are marked, for example <slow>.

- Breath: Breathing is expressed as “hh”.

- Speaker noises: The documentation of speaker noises such as <coughs> is limited to the active speaker.

- Non-verbal feedback: In cases where video is available, non-verbal feedback such as <shake head> is included.

- Anonymization: Anonymization of people, places and organizations is applied in order to ensure the privacy of participants. However, in this subset, as all recordings are publicly broadcast, no anonymization is needed.

- Contextual events: Contextual events are contained in curly brackets.

- Parallel conversation: Similar to contextual events.

- Unintelligible speech: Unintelligible dialogue is represented by an “x”. One “x” is used for every syllable.

- Transcription borders: These mark the start and the end of the transcribed dialogue.
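Several of the notations above are regular enough to be counted automatically. The following is a minimal sketch, not part of the original study, that totals pause time, laughter syllables and unintelligible syllables in one utterance. The function name `summarize_markup` and the sample utterance are hypothetical, and the sketch assumes that a bracketed number is always a timed pause and that a stand-alone run of “x” is always unintelligible speech.

```python
import re

def summarize_markup(utterance: str) -> dict:
    """Summarise a few of the listed notations in one transcribed utterance.

    Assumptions (not from the source): "(n)" with digits is always a timed
    pause, and a stand-alone run of "x" is always unintelligible speech.
    """
    # "(.)" marks a half-second pause.
    half_second_pauses = len(re.findall(r"\(\.\)", utterance))
    # "(1)", "(2)", ... mark pauses measured in whole seconds.
    timed_seconds = sum(int(n) for n in re.findall(r"\((\d+)\)", utterance))
    # Each "@" stands for one syllable of laughter.
    laughter_syllables = utterance.count("@")
    # Each "x" in a stand-alone run stands for one unintelligible syllable.
    unintelligible = sum(len(run) for run in re.findall(r"\bx+\b", utterance))
    return {
        "pause_seconds": 0.5 * half_second_pauses + timed_seconds,
        "laughter_syllables": laughter_syllables,
        "unintelligible_syllables": unintelligible,
    }

# Hypothetical utterance written for illustration only.
summary = summarize_markup("so (.) i think (2) xxx @@@")
print(summary)
```

For the hypothetical utterance, the sketch reports 2.5 seconds of pauses, three laughter syllables and three unintelligible syllables.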

The different notations are marked and shown in symbols of different colours. Deterding (2013) notes that the VOICE mark-up conventions are quite consistent with those adopted in Conversation Analysis. One of the major differences is the use of angled brackets, instead of the traditional square brackets, to show overlaps. Angled brackets make it possible to show multiple overlaps in interactions with several participants. In this research, 10 out of 18 recordings involved at least three speakers, so the use of angled brackets made it easier to identify who was speaking at the same time and with or to whom. The following example, in which six participants discuss fashion in China, contains several overlaps. Stretches of speech enclosed in tags bearing the same number overlap. As shown in the example, “happening at” uttered by S5 overlaps with “the same” by S2, and “China” by S5 overlaps with “that’s right” by S1.

Example of overlapping: (from File No. 17: Fashion design in China)2

S5: i think a fashion capital is where you have runway shows and show rooms at the same time <2> happening at </2> the same <3> place </3> and the show rooms are actually for buyers and press and the show rooms are really for buyers <4> to go </4> and buy and not like it is in <5> china </5> you have in beijing fashion <6> shows </6> that nobody actually knows for whom

S2: <2> the same </2> <3> yeah </3> <4> china yeah </4>

S1: <4> so: </4> <5> that's right </5> do they know

All: @@@=

S4: =there's a lot of media about=

S2: =beijing fashion shows or in <6> shanghai </6>

S1: <6> exactly the runway </6> shows in beijing fashion show has been there for twenty years already and there’s one in shanghai <7> right? </7> and there are few others in say <8> in dalian </8> in in qingdao i don’t know but there are so many all are happening=

S3: <7> yeah </7> <8> in guangzhou yeah </8>
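The numbered overlap tags in the example above can be paired up mechanically. The sketch below is an illustration rather than the tooling used in the study: it groups the segments that share an overlap number, using two turns abbreviated from the example. The names `OVERLAP` and `collect_overlaps` are hypothetical, and the sketch assumes that overlap tags are never nested.

```python
import re
from collections import defaultdict

# Matches "<n> ... </n>" where n is the overlap number (assumes no nesting).
OVERLAP = re.compile(r"<(\d+)>\s*(.*?)\s*</\1>")

def collect_overlaps(turns):
    """Map each overlap number to the (speaker, segment) pairs that carry it."""
    overlaps = defaultdict(list)
    for speaker, text in turns:
        for number, segment in OVERLAP.findall(text):
            overlaps[int(number)].append((speaker, segment))
    return dict(overlaps)

# Two turns abbreviated from the example transcript above.
turns = [
    ("S5", "at the same time <2> happening at </2> the same <3> place </3>"),
    ("S2", "<2> the same </2> <3> yeah </3>"),
]

for number, segments in sorted(collect_overlaps(turns).items()):
    print(number, segments)
```

Because the key is the tag number rather than the bracket position, the same lookup works when three or more speakers overlap at once.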

For the lexicogrammatical study of Asian ELF speakers in this research, further mark-ups were added as the research required, for example for the non-standard use of prepositions, lexical innovation, and omissions. The detailed classification of linguistic features is shown in Section 3.2.4. The extra mark-ups made a quantitative analysis of the lexicogrammatical phenomena of Asian ELF possible through simple frequency counts. Frequency of occurrence is crucial to any claim that a distinctive linguistic feature is typical of Asian ELF. For this purpose, the concordancer available on the ACE website and the free concordancing program Antconc 3.2.1w were used to examine the common non-standard features of ELF in the data.
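A simple frequency count of this kind can be sketched in a few lines. The example below is illustrative only. It uses the four feature tags that this section describes for the Chinese subset (<tq>, <om>, <ag>, <pp>), while the function name, the sample sentence and the assumption that each feature is opened with `<tag>` and closed with `</tag>` are hypothetical.

```python
import re
from collections import Counter

# Feature tags named in this section; the tagging layout is an assumption.
FEATURE_TAGS = {
    "tq": "tag question",
    "om": "omission",
    "ag": "grammatical disagreement",
    "pp": "non-standard use of preposition",
}

def count_feature_tags(text: str) -> Counter:
    """Count opening feature tags, ignoring overlap and other mark-up tags."""
    opening_tags = re.findall(r"<(\w+)>", text)
    return Counter(tag for tag in opening_tags if tag in FEATURE_TAGS)

# Hypothetical annotated line written for illustration only.
sample = "we discuss <pp> about </pp> it <om> </om> yesterday <pp> in </pp> monday <2> yeah </2>"
counts = count_feature_tags(sample)
for tag, n in counts.most_common():
    print(f"{FEATURE_TAGS[tag]}: {n}")
```

Counting only opening tags keeps each annotated feature from being counted twice, and the filter on `FEATURE_TAGS` leaves overlap numbers such as <2> out of the totals.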

When using the ACE web concordancer3, the researcher entered the search string and chose whether the associated word could appear anywhere in the string or to the left or right of the keyword. Data can be selected from all or from certain categories including leisure, education, business, organizational, research and science. The researcher obtained the results by clicking “search for concordances”. The tool is user-friendly, especially when searching for a certain word string throughout the ACE corpus or within a certain category. However, some functions of the online concordancer are still under construction; for example, searching by speaker nationality is not available.

Therefore, another user-friendly free concordancing program, Antconc 3.2.1w, was also employed in this research. First, extra mark-ups were added to the Chinese data subset of ACE. For example, <tq> indicates tag questions; <om> indicates omission; <ag> indicates grammatical disagreement; <pp> indicates non-standard use of prepositions. With the extra mark-ups, the specific features of the data could be displayed and counted. Once all the mark-ups had been entered, the text files were imported into Antconc 3.2.1w for further electronic corpus analysis.

Search terms can be shown in context (KWIC: key word in context), together with the frequency of their collocates. For example, when ASEAN was searched for across all the files, the result showed the highlighted key word asean in the middle with its context (see Figure 3.2), and the file names and numbers were listed on the right to correspond with each example. The total number of concordance hits, 58 in this case, was also shown. Furthermore, the collocates of ASEAN can be displayed with their frequencies by simply clicking collocation at the top. The collocation result for ASEAN (see Figure 3.3) indicated that the most frequent collocated notional word was China, which occurred 16 times together with ASEAN.

2 All the examples presented in this thesis were retrieved from the ACE data subset with original mark-up notations according to the transcription conventions.

Figure 3.2. KWIC search result of ASEAN in Antconc 3.2.1w.

Figure 3.3. Search result of ASEAN collocation in Antconc 3.2.1w.
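The two views shown in Figures 3.2 and 3.3 can be approximated in a few lines. The sketch below is not Antconc's implementation: it builds KWIC lines and counts the words found within a fixed window around the key word. The tokenizer, the window sizes, the function names and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text: str) -> list:
    """Lowercase the text and keep alphabetic tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())

def kwic(tokens: list, keyword: str, width: int = 2) -> list:
    """Return (left context, key word, right context) for every hit."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

def collocates(tokens: list, keyword: str, span: int = 2) -> Counter:
    """Count the words occurring within `span` tokens of the key word."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == keyword:
            counts.update(tokens[max(0, i - span):i])
            counts.update(tokens[i + 1:i + 1 + span])
    return counts

# Hypothetical sentence written for illustration only.
tokens = tokenize("china and asean signed the agreement and asean welcomed china")
for left, key, right in kwic(tokens, "asean"):
    print(f"{left:>15} | {key} | {right}")
print(collocates(tokens, "asean").most_common(3))
```

Raw window counts like these rank function words highly, which is why the analysis in the text singles out the most frequent notional word rather than the most frequent word overall.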

Antconc can also be used for other purposes. For example, a word list can be produced for the corpus that displays the “tokens” (individual words) according to their frequency. Multiple terms can be searched for at the same time with “|”, for example, go|went|gone|goes. Context words like “a ... of” can be obtained in advanced search. The characters “*”, “+” and “?” can be used to enable different types of searches. For example, book* will find all the words that begin with book, such as booking, books and bookshop; *book will find all the words that end with book, such as notebook; and *book* will find all the words that contain book, such as notebooks.
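The “*” and “|” searches described above map naturally onto regular expressions. The sketch below is a rough equivalent written for illustration, not Antconc's own search engine. It handles only “*” and “|”, because the text does not define the behaviour of “+” and “?”. The function names and the word list are hypothetical.

```python
import re

def wildcard_to_regex(query: str) -> re.Pattern:
    """Translate a query using "*" and "|" into a compiled regular expression."""
    alternatives = []
    for term in query.split("|"):
        # Escape the literal parts and let "*" stand for any run of word characters.
        literal_parts = [re.escape(part) for part in term.split("*")]
        alternatives.append(r"\w*".join(literal_parts))
    return re.compile("(?:" + "|".join(alternatives) + ")")

def search(query: str, tokens: list) -> list:
    """Return the tokens that match the whole query pattern."""
    pattern = wildcard_to_regex(query)
    return [token for token in tokens if pattern.fullmatch(token)]

# Hypothetical word list written for illustration only.
words = ["book", "books", "booking", "bookshop", "notebook", "notebooks", "go", "went", "going"]

print(search("book*", words))
print(search("*book", words))
print(search("*book*", words))
print(search("go|went|gone|goes", words))
```

In this sketch “*” also matches an empty string, so book* returns the bare word book as well as its longer forms.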

In document: Grade repetition and dropout among undergraduate students of the Facultad de Ciencias Psicológicas and the Facultad de Ingeniería, Escuela Ciencias, Carrera Ingeniería Informática of the Universidad Central del Ecuador, during the period 2003-2009 (pages 38-42)