
Since the turn of the century, many papers have reported that CAA use is increasing steadily (Denton et al., 2008; Keady, FitzGerald, Gamble, & Sangwin, 2006; McKenna & Bull, 2000; Özden, Ertürk, & Sanli, 2004; Pitcher, Goldfinch, & Beevers, 2002; Thelwall, 2000), and this has been attributed to the large increase in student numbers, particularly in STEM subjects (Bull & Stephens, 1999; Davies, 2001; Gill & Greenhow, 2008; Jones, 2008; Krause, Stark, & Mandl, 2009). This claim no longer seems to be repeated in new articles and books on computer-aided assessment: recent work has focused on increasing the complexity and capability of these systems, rather than on the spread of practice.

Miller (2009, p. 183) noted that “research on formative CBAs has focused on the development and evaluation of these assessments”, largely from the lecturers’ perspectives. These studies give some details of the location and the system used, but few indicate the scale of the deployment in terms of cohort size or number of tests taken. There is no broad study of the current uptake of CAA in mathematics courses in Higher Education in the United Kingdom.

A previous study examined the use of CAA to perform mathematics diagnostic tests. In its report, “Diagnostic Testing for Mathematics”, the LTSN MathsTEAM Project (2002, p. 6) reported that 36% of the institutions polled used such testing. This gives only an indication of the uptake of CAA, since the study was performed more than ten years ago. However, the study reported that there were already a number of different systems (and systems derived from them) in use, which are now obsolete or have been replaced by newer systems.

This demonstrates some of the difficulty in presenting the nature of CAA use in higher education mathematics in the United Kingdom. From the literature it is apparent, yet not transparent, that the adoption of CAA is increasing. Confounding a unified understanding of the current situation is the confusion between different terms and a difficulty in differentiating between system types.

There are, however, some studies that report lecturers’ perspectives of implemented computer-aided assessment systems in mathematics higher education courses in the United Kingdom (Bull & McKenna, 2003; Greenhow et al., 2011; Jones, 2008; Pidcock, Palipana, & Green, 2004; Pitcher et al., 2002; Pollock, 2002; Ricketts & Wilks, 2002; Sangwin, 2005, 2006, 2007). Studies of this nature are useful in identifying issues in computer-aided assessment (whether confined to mathematics at higher education level or otherwise), but they often lack the student-user perspective (Walker, Topping, & Rodrigues, 2008). Some student feedback studies of computer-aided assessment have been conducted in response (Denton et al., 2008; Walker et al., 2008), but little progress has been made in this regard for mathematics computer-aided assessment.

A notable obstacle is that mathematics assessment can be regarded as a special case. One might take for granted that communicating mathematics verbally between humans is difficult. Written communication of mathematics may seem easier, but simple mistakes can proliferate. Communicating mathematics with a computer proves to be a hybrid affair, where a human must provide a “one-dimensional string” of the correct syntax in order to successfully convey the appropriate message (Sangwin & Ramsden, 2007, p. 921).

There are two solutions to this problem: either incorporate a computer algebra system (CAS) that interprets mathematical input and offers students the opportunity to review the rendering of their responses before submitting for marking; or adopt a system that does not require students to learn a syntax. The systems currently available are divided into these two categories.

Computer-aided assessment supported by a computer algebra system

With the support of a computer algebra system (CAS), a CAA system can compare a mathematical expression, provided by the student, with the stored solution for equivalence. That is, a student may obtain full marks for answering a question correctly without matching the stored solution verbatim (Sangwin, 2004, p. 5).

The computer algebra system determines any differences between the student’s response and the stored solution. If there is no difference, the appropriate marks are awarded. If there is a difference, the student may be awarded partial marks if it appears he or she has made a common error (Sangwin, 2004).
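As a rough illustration of marking by equivalence rather than by verbatim match, the sketch below restricts itself to polynomials in one variable, stored as ascending coefficient tuples. This representation and the function names `normalise`, `multiply` and `mark` are assumptions made for this example only. A real CAS handles far more general expressions, and none of the systems cited here is claimed to work this way.

```python
def normalise(coeffs):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return tuple(coeffs)


def multiply(p, q):
    """Multiply two polynomials given as ascending coefficient sequences."""
    if not p or not q:
        return ()
    product = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            product[i + j] += a * b
    return normalise(product)


def mark(student, stored, full_marks=1):
    """Award full marks when the two polynomials are equivalent, else zero."""
    return full_marks if normalise(student) == normalise(stored) else 0


stored = (1, 0, -1)                    # 1 - x^2, the stored solution
factored = multiply((1, 1), (1, -1))   # (1 + x)(1 - x), a differently written answer
print(mark(factored, stored))          # equivalent to the stored solution
print(mark((1, 0, 1), stored))         # 1 + x^2, not equivalent
```

Reducing both expressions to a canonical form before comparing them is only one way to decide equivalence. It works here because every polynomial has a unique coefficient list once trailing zeros are removed.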

Such tolerances are particularly useful when the solution is algebraic in nature, given that the presentation of mathematical expressions is sometimes a matter of personal style: for example, −x² + 1, 1 − x² and (1 + x)(1 − x) are equivalent but, stylistically, they are different. Additionally, there may be situations in which one or more of those solutions is not acceptable: if the question asks the student to expand (1 + x)(1 − x), then clearly the last of the three expressions would not be desired. In that case, such solutions can be excluded from the set of expressions that are marked correct (Sangwin, 2007).

Once a question has been designed, for many CAA systems it is relatively straightforward to introduce elements of randomisation. For example, “genuinely random” polynomials of degree and coefficients within predetermined ranges can be generated (Sangwin, 2004, p. 5). This has two advantages: students can practise differentiating polynomials repeatedly; and two students sitting side-by-side are unlikely to receive identical realisations of these randomisations, which offers a barrier to plagiarism. This randomisation does not require question-writers to input every possible instance manually: it can be achieved by requesting the CAS to generate the question instance on demand.
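The randomisation described above might be sketched as follows. The degree and coefficient ranges, the seeding scheme and the names `random_polynomial`, `derivative` and `question_for` are hypothetical choices for this example and are not taken from any of the systems cited.

```python
import random


def random_polynomial(rng, max_degree=4, coeff_range=(-9, 9)):
    """Return ascending coefficients of a polynomial with a nonzero leading term."""
    degree = rng.randint(1, max_degree)
    low, high = coeff_range
    coeffs = [rng.randint(low, high) for _ in range(degree)]
    leading = rng.choice([c for c in range(low, high + 1) if c != 0])
    return tuple(coeffs + [leading])


def derivative(coeffs):
    """Differentiate term by term: d/dx (a * x^n) = n * a * x^(n - 1)."""
    return tuple(n * a for n, a in enumerate(coeffs))[1:]


def question_for(student_id, question_id="diff-1"):
    """Seed from the student and question so each realisation is reproducible."""
    rng = random.Random(f"{student_id}:{question_id}")
    poly = random_polynomial(rng)
    return poly, derivative(poly)


poly, answer = question_for("student-001")
print("Differentiate the polynomial with coefficients", poly)
print("Stored solution coefficients:", answer)
```

Seeding the generator from the student and question identifiers is one possible design. It means that a student who reloads the page sees the same realisation, while different students are likely to see different ones.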

Sangwin (2004) demonstrated how this is useful for generating feedback for students. In the case of integrating a function, the computer algebra system is able to differentiate the student’s response and present this to the student. This way, the student is not given the solution, but is offered a reason why his or her answer is not correct and an indication of what corrections might be required.
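A minimal sketch of this kind of feedback is given below, again assuming polynomials stored as ascending coefficient tuples. The helper names and the wording of the feedback message are illustrative assumptions rather than the behaviour of any particular system.

```python
def normalise(coeffs):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return tuple(coeffs)


def derivative(coeffs):
    """Differentiate term by term: d/dx (a * x^n) = n * a * x^(n - 1)."""
    return tuple(n * a for n, a in enumerate(coeffs))[1:]


def to_text(coeffs):
    """Render ascending coefficients as a string, highest power first."""
    terms = []
    for power in range(len(coeffs) - 1, -1, -1):
        a = coeffs[power]
        if a == 0:
            continue
        size = abs(a)
        if power == 0:
            body = f"{size}"
        elif power == 1:
            body = f"{size}*x"
        else:
            body = f"{size}*x^{power}"
        terms.append(("-" if a < 0 else "+", body))
    if not terms:
        return "0"
    text = ("-" if terms[0][0] == "-" else "") + terms[0][1]
    for sign, body in terms[1:]:
        text += f" {sign} {body}"
    return text


def integration_feedback(integrand, student_answer):
    """Differentiate the student's answer and compare it with the integrand."""
    differentiated = normalise(derivative(student_answer))
    if differentiated == normalise(integrand):
        return "Correct."
    return (f"Differentiating your answer gives {to_text(differentiated)}, "
            f"but the function to be integrated was {to_text(normalise(integrand))}.")


integrand = (0, 0, 3)                                   # 3x^2
print(integration_feedback(integrand, (5, 0, 0, 1)))    # x^3 + 5
print(integration_feedback(integrand, (0, 0, 0, 3)))    # 3x^3
```

Because the check differentiates the response instead of comparing it with one stored antiderivative, any constant of integration is accepted, and the stored solution itself is never revealed to the student.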

At the time of his conference paper, Sangwin (2007) mentioned some CAA systems that were supported by a number of different computer algebra systems. This may be a problem since students not only need to learn the material, they also need to learn the syntax. Sangwin and Ramsden (2007) noted this problem with regard to AiM. One student had commented in feedback, “I feel the aim (sic) system is reasonably fair, however i (sic) have lost a lot of marks in quiz 3 for simple syntax errors” (Sangwin & Ramsden, 2007, p. 921).

Although the CAS can indicate where it cannot understand a human response (Sangwin, 2002), it does not necessarily follow that the CAS will interpret a student’s response correctly (Sangwin & Ramsden, 2007). Clearly, such a result is “unacceptable” (Sangwin & Ramsden, 2007, p. 922): the assessment becomes as much a measure of a student’s ability to converse fluently with the CAS as it is a measure of a student’s mathematical performance.

A further criticism of a CAS-supported CAA system is that there may be a financial outlay for the CAS software, even if the CAA itself is free. For example, AiM is free, but requires the purchase of a single Maple licence (Keady et al., 2006). Others, such as Numbas and STACK, require no financial outlay for software but may require investment in new hardware (Keady et al., 2006).

Computer-aided assessment without a computer algebra system

For CAA systems that lack an ability to interpret mathematical input, asking questions of a mathematical nature is a challenge. Many pen-and-paper assignments ask students to answer questions that result in an algebraic (or ‘mathematical’) form that has no clear analogue when performed electronically. Consequently, questions must be reconsidered before they are generated for the CAA system (Conole & Warburton, 2005).

CAA systems that are not supported by CAS cannot interpret an algebraic response. They may use other question techniques, such as multiple choice, multiple response, text or numeric input, matching, ranking and drag and drop (Conole & Warburton, 2005; Bull & McKenna, 2003). Sangwin (2004, p. 5) argued that converting questions to these alternative forms “often limit or distort questions”.

There are many CAA systems used for mathematics assessment that do not rely on a CAS (Conole & Warburton, 2005). Perception, by Question Mark Computing Ltd, is a popular choice (Dermo, 2009; Green et al., 2004; Greenhow & Gill, 2005; van der Kleij et al., 2011; Martin & Greenhow, 2004; Walker et al., 2008). This commercial choice offers more test options but requires considerable initial outlay (Conole & Warburton, 2005).

Although there are several options for question types and systems, multiple choice questioning remains the prominent choice for many CAA systems (Warburton & Conole, 2003). Whether multiple choice questioning (MCQ) is appropriate for testing knowledge and understanding is an issue that is contested considerably within the literature.

Bull and Stephens (1999) referred to an earlier study performed by Scouller and Prosser (1994) in which they found that students approached multiple choice questioning differently: students with deep learning intentions were more likely to adopt study strategies for MCQ examinations; students with surface learning approaches tended not to approach MCQ examinations with a study strategy, and furthermore they were more likely not to understand fully the difference between “understanding” and “reproducing” factual knowledge.

Butler, Karpicke, and Roediger (2007) discussed ways in which multiple choice questioning could be improved. They found no significant difference in students’ long-term retention of knowledge whether they received “standard” feedback (marking the response correct or incorrect and giving the solution) or “answer-until-correct” feedback (marking the response correct or incorrect and giving the student further attempts until he or she answers correctly). Butler et al. (2007) concluded that delaying feedback leads to better performance, indicating an improvement in long-term retention.

Bush (2001) gave further assurances that multiple choice questioning can be effective if sufficient care is used in writing questions. This effect may be enhanced with negative marking, by allowing the student to give more than one response, or by asking students to assess their confidence in their solutions. He added a note of caution that such techniques can cause confusion, particularly if they are not used universally.

Greenhow and Gill (2005) explained that the success of multiple choice questioning is aided by analysis of previous cohorts’ responses to similar questions. By establishing common errors that have been made by students previously, the lecturer is better able to determine the root of the mistakes. In doing so, multiple choice options can include “mal-rules”: answers generated “if the student applies sensible, but incorrect, rules [methods] of their own” (Greenhow & Gill, 2005, p. 2). Furthermore, by anticipating those common errors, the lecturer can write feedback explaining the error and how it can be avoided in future. For some CAA systems, this feedback can be generated as simply and as quickly as the questions themselves, so long as the system has been informed how to generate this type of response. Without awareness and input of the mal-rules, the CAA system is not likely to know where the errors have occurred, and feedback at this level of detail cannot be given.
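To make the idea of mal-rules concrete, the sketch below builds the options for a multiple choice question on differentiating a·xⁿ and attaches targeted feedback to each distractor. The three mal-rules listed are hypothetical examples invented for this illustration. They are not taken from Greenhow and Gill (2005), and in practice they would come from analysing previous cohorts’ responses.

```python
def correct_rule(a, n):
    """Correct derivative of a*x^n as a (coefficient, power) pair."""
    return (n * a, n - 1)


# Hypothetical mal-rules: each maps (a, n) to the answer a student would reach
# by applying a plausible but incorrect method.
MAL_RULES = {
    "kept the original power": lambda a, n: (n * a, n),
    "did not multiply by the power": lambda a, n: (a, n - 1),
    "raised the power instead of lowering it": lambda a, n: (n * a, n + 1),
}


def build_options(a, n):
    """Return {answer: feedback} for one multiple choice item on d/dx (a*x^n)."""
    options = {correct_rule(a, n): "Correct."}
    for description, rule in MAL_RULES.items():
        wrong = rule(a, n)
        # setdefault keeps the existing entry if a mal-rule happens to
        # coincide with the correct answer or with another distractor.
        options.setdefault(wrong, f"It looks as though you {description}.")
    return options


for (coefficient, power), feedback in build_options(4, 3).items():
    print(f"{coefficient}*x^{power}: {feedback}")
```

Keying the feedback by the answer each mal-rule produces is what allows the system to name the likely error. A response that matches no known mal-rule could only be marked incorrect, without explanation.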

Some feel that CAA systems that rely on multiple choice questions and numerical input (which some refer to as “quizzes”) cannot test deeper understanding and cannot be relied upon for summative assessment. Indeed, both lecturers and students realise this: “[It] can be argued that such CAA tests do not adequately assess the higher level skills or understanding in mathematics. To a large extent this is true. Students are well aware of this, distinguishing as they do in feedback, between credit gained for method and credit gained for a single numerical input” (Croft, Danson, Dawson, & Ward, 2001, p. 65, emphases in original). Conole and Warburton (2005, p. 21) agreed: “A major concern... is whether multiple choice questions are really suitable for assessing higher-order learning outcomes in higher education students”.

Although CAA systems, with or without a supporting CAS, offer much to both lecturers and students, it is apparent that they bring some issues. The following section highlights the persistence of these problems.
