
Energy

In document Marzola, María Cecilia (pages 30-98)

Chapter 2. Spiritual practices and bodily energy

2.2 Energy

In most cases of writing assessment, rating scales are used when scoring written work (Montee and Malone, 2014). A rating scale is defined as:

“A scale for the description of language proficiency consisting of a series of constructed levels against which a language learner’s performance is judged... (it) provides an operational definition of a linguistic construct such as proficiency. Typically such scales range from zero mastery through to an end-point representing the well-educated native speaker. The levels or bands are commonly characterised in terms of what subjects (test-takers) can do with language… and their mastery of linguistic features” (Davies et al., 1999, p.153).

Thus, it is an instrument that raters use to evaluate the quality of writing more objectively and consistently using a specific set of criteria as descriptors, as opposed to basic intuitive scoring (Crusan, 2014; Fulcher, 2010; Weigle, 2002). A descriptor is “a prose description of a level of performance on a scale” (Fulcher, 2010, p.320). So a rating scale is an ordered series of descriptors, usually between 3 and 9, that guides raters during the rating process (McNamara, 2000, p.40-1).

There are three main types of rating scales that are used when assessing writing: primary trait scales, holistic scales, and analytic scales (Montee and Malone, 2014; Ferris and Hedgcock, 2014; Weigle, 2002).

Primary trait scales are scales that are “used to assign a single score based on one trait of the performance” in relation to a specific task (Montee and Malone, 2014, p.5; Weigle, 2002). For example, a scale is developed for raters to award a single score to the one aspect of performance they feel is most important in the given task, such as persuasiveness in an argumentative essay. Thus, each task has its own primary trait scale (Weigle, 2002). This approach to scoring written work is extremely time-consuming (Montee and Malone, 2014; Weigle, 2002) and the scores cannot be generalized (Shaw and Weir, 2007). In addition, Weigle (2002) believes that this type of scale has not been widely used in second language (L2) writing assessment, and that there is a dearth of literature on how it may be implemented in L2 writing assessment (p.110). For these reasons, holistic and analytic scales are the more popular scales used when scoring writing (Montee and Malone, 2014).

Holistic scales, also known as global scales, allow raters to assign a single score to the written performance as a whole (Ferris and Hedgcock, 2014; McNamara, 2000). White (1985) argues that this type of scale allows “quick, economical, and reasonably reliable rankings” of a large number of written samples (p.31). He also argues that this type of scale focuses on the strengths of a written script, rather than its deficiencies. As a result, test-takers are rewarded for what they did well (Weigle, 2002). However, this type of scale also has its disadvantages. A single score: (1) does not provide adequate diagnostic feedback to students (Ferris and Hedgcock, 2014; Montee and Malone, 2014); (2) is difficult for stakeholders to interpret (Bachman and Palmer, 2010; Ferris and Hedgcock, 2014); and (3) does not take into consideration that students’ proficiency in the various sub-skills of writing varies, that is, students may have different proficiency levels in different criteria (Bachman and Palmer, 2010; McNamara, 2000; Weir, 2005). Weigle (2002) and Weir (2005) state that writers, especially L2 writers, develop different writing skills at different rates. Though the practicality of these scales makes them popular in many assessment settings (Montee and Malone, 2014), analytic scales are far better suited to writers in general and L2 writers in particular (Bachman and Palmer, 2010; Weigle, 2002; Weir, 2005).

Analytic scales allow raters to assign a separate score to each of several aspects of performance, such as coherence, cohesion, vocabulary and grammar, rather than a single overall score (Ferris and Hedgcock, 2014; Montee and Malone, 2014). This type of rating scale provides more specific feedback and is more suitable in high stakes assessment settings (Montee and Malone, 2014; Weigle, 2002). Moreover, because each writing skill is scored separately, this type of scale overcomes the limitations of holistic scales, which do not take into consideration various proficiency levels within a single performance (Ferris and Hedgcock, 2014; Hamp-Lyons, 1991; Weir, 2005). Thus, it is more suitable for the assessment of L2 writers, who demonstrate a “marked or uneven profile across different aspects of writing” (Weigle, 2002, p.120). This type of scale is also said to be more reliable than holistic scales (Ferris and Hedgcock, 2014; Hamp-Lyons, 1991; Van Moere, 2014; Weigle, 2002). Inexperienced raters, furthermore, find this scale easier to use than holistic scales (Weigle, 2002; Weir, 1990).
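The point about uneven profiles can be illustrated with a minimal sketch. The criteria names come from the paragraph above; the band values, the function names, and the use of a plain mean as a stand-in for a single overall score are all assumptions made for illustration only (a holistic score is a rater's overall judgement, not an average of sub-scores).

```python
from statistics import mean

# Criteria named in the text; every band value below is invented for illustration.
CRITERIA = ("coherence", "cohesion", "vocabulary", "grammar")


def analytic_profile(scores):
    """Return the per-criterion scores in a fixed order, checking none is missing."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    return {c: scores[c] for c in CRITERIA}


def single_score(scores):
    """Collapse a profile into one number.

    Assumption: a plain mean is used here only as a stand-in for a single
    overall score; it is not how holistic ratings are actually produced.
    """
    return mean(analytic_profile(scores).values())


def spread(scores):
    """Difference between the strongest and weakest criterion in a profile."""
    profile = analytic_profile(scores)
    return max(profile.values()) - min(profile.values())


# Two hypothetical scripts: one even, one with a marked (uneven) profile.
even_script = {"coherence": 5, "cohesion": 5, "vocabulary": 5, "grammar": 5}
uneven_script = {"coherence": 7, "cohesion": 6, "vocabulary": 4, "grammar": 3}

for name, script in (("even", even_script), ("uneven", uneven_script)):
    print(name, single_score(script), spread(script), analytic_profile(script))
```

Under these assumptions both scripts collapse to the same single number, while the analytic profile shows that the second writer is considerably stronger in coherence than in grammar, which is the kind of information a single score cannot convey.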

Analytic scales do, however, have a number of limitations. They are much more time-consuming since raters are required to attend to numerous features of writing (Montee and Malone, 2014). Moreover, it is argued that some experienced raters display a halo effect when using the analytic scale; they form an overall (holistic) impression of the written script and then score every feature of the analytic scale in accordance with that impression (Weigle, 2002; Weir, 2005). Unlike holistic scoring, it is argued that analytic scoring is not a natural process; readers do not naturally read a script while paying attention to particular features of the writing (White, 1995). Another limitation is that these scales may bias raters in favour of scripts that exhibit more clearly the features (criteria) of the analytic scale (Ferris and Hedgcock, 2014, p.212). Similarly, if a written script exhibits poor grammar, raters may show bias in scoring the other features of the analytic scale. This is related to the ‘halo effect’ that Weigle (2002) describes among raters using the analytic scale (see section 2.8.1).

A summary of the differences between holistic and analytic scoring, adapted from Weigle (2002, p.121), is presented in Table 2.2. For further discussion, see Harsch and Martin (2013) and Knoch (2009).


| Quality | Holistic scale | Analytic scale |
| --- | --- | --- |
| Reliability | Lower than analytic, but acceptable nonetheless. | Higher than holistic. |
| Construct validity | Assumes that all aspects of writing ability (idea development, coherence, cohesion, vocabulary, grammar, mechanics, etc.) develop at the same rate, and can thus be captured in a single score. | Caters for writers whose performance levels vary in terms of different criteria (Weir, 2005, p.189). |
| Practicality | Faster and cheaper than analytic scoring. | More time-consuming and expensive than holistic scoring. |
| Impact | Scores may mask uneven writing abilities (skills), and thus may not be appropriate for placement purposes. | Provides more diagnostic information for placement and/or instruction. Provides a more detailed profile of writers’ strengths and weaknesses (Weir, 2005). More useful for inexperienced raters. |
| Authenticity | Reading holistically is more natural than reading analytically (White, 1995). | Raters may read holistically and then adjust analytic scores to match their holistic impression. The rating of one criterion may have a knock-on effect on the rating of the next criterion (Weir, 2005). |
| Bias | Less bias since raters rate a written script as a whole, without any focus on the sum of its parts (Ferris and Hedgcock, 2014). | May unfairly bias raters in favour of scripts exhibiting features that are easily identified on the rating scale (Ferris and Hedgcock, 2014, p.212). |

Table 2.2 Summary of the advantages and disadvantages of holistic and analytic rating scales.

Even though rating scales, especially analytic scales, can improve scoring validity, McNamara (1996, 2000) argues that raters can still differ in their interpretation of the descriptors on the rating scale. These differences between raters result in rater variation. The next section (2.7) covers rater variation in ‘direct’ assessment and sheds light on how raters may vary despite using the same rating scale to score the same script.
