Some of the limitations of the binary decision framework are resolved by classical probability scales, in which the expert expresses conclusions in terms of the gradient
probability of the samples containing the voice(s) of the same or different speaker(s) given the evidence. An example of such a scale is in Table 2.1. Gold and French (2011)
report that classical probability scales are the most commonly used framework for FVC evidence, accounting for 13 of the 34 (38%) practitioners surveyed. This approach is
used worldwide (including Europe, USA, Brazil, South Korea and Australia) and is typically employed by experts using auditory and acoustic analysis (§1.1.3).
Table 2.1: Example of a classical probability scale for FVC conclusions (Broeders 1999: 129; equivalent to that in Baldwin and French 1990: 10)
Positive identification                  Negative identification
sure beyond reasonable doubt             probable
there can be very little doubt           quite probable
highly likely                            likely
very probable                            highly likely
probable
quite possible
possible
. . . that they are the same person      . . . that they are different people
2.1.2.1 Issues with posterior probability
Both the binary decision and the classical probability scale frameworks have been
criticised within the field of FSS (Broeders 1999, 2001; Champod and Evett 2000; Champod and Meuwly 2000) and the wider forensic community (Robertson and Vignaux 1995b).
The primary criticism is that these frameworks are based on posterior probability, involving an assessment of the probability of the propositions given the evidence, p(H|E).
However, posterior probability is ultimately an issue for the trier-of-fact, as it is equivalent to an assessment of the probability of the innocence or guilt of the suspect based on the evidence.
The overlap between the expert and trier-of-fact when expressing p(H|E) conclusions is most evident where the offender sample constitutes the crime, meaning that propositions are formulated at the offence level (Lucy 2005: 118). Labov (1988; see also Labov and Harris 1994) reports a case in
which a baggage handler was accused of making threatening telephone calls to Los Angeles airport. Based on auditory and acoustic analysis, Labov concluded that the
voices in the samples belonged to different speakers, and the suspect was subsequently found innocent. However, such a categorical decision is directly equivalent to the
trier-of-fact’s assessment of the innocence of the accused.
Furthermore, in order to determine posterior probability the expert requires access to
information “from sources other than an objective scientific evaluation of the (suspect) and (offender) samples” (Morrison 2009c: 4). That is, to assess the probability of
the suspect and offender being the same or different individual(s), it is necessary to have access to all of the evidence presented to the court, such as whether the
suspect was in the country at the time or whether they had an alibi. Such information should theoretically only be available to and assessed by the trier-of-fact. Even if such
knowledge is available to the expert, it is not the expert’s role to evaluate it. It is also essential that the other evidence in the case does not influence the expert’s conclusion,
even subconsciously or inadvertently.
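The dependence of the posterior on information outside the expert's analysis can be made explicit with the odds form of Bayes' theorem. This is a standard identity rather than a formula taken from the sources cited above, and the labels H_p and H_d (for the prosecution and defence propositions) are notation assumed here for illustration only.

```latex
\[
\underbrace{\frac{p(H_p \mid E)}{p(H_d \mid E)}}_{\text{posterior odds}}
=
\underbrace{\frac{p(E \mid H_p)}{p(E \mid H_d)}}_{\text{likelihood ratio}}
\times
\underbrace{\frac{p(H_p)}{p(H_d)}}_{\text{prior odds}}
\]
```

The prior odds reflect the other evidence in the case (such as an alibi), so a conclusion of the form p(H|E) cannot be reached from the suspect and offender samples alone.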
Finally, conclusions expressed as a binary decision or using a classical probability
scale only account for the probability of one proposition (usually the prosecution proposition). However, only with an assessment of the likelihood of the evidence under
both the prosecution and defence propositions is the trier-of-fact able to evaluate its strength with regard to innocence and guilt. To consider only one proposition is also
inconsistent with the objective responsibility of the expert to aid the court. Therefore, it is preferable to use a framework which considers the strength of the evidence under
the competing propositions rather than the probability of the propositions themselves. This is emphasised by the ruling in R v Doheny and Adams [1996], which states that
“[...] who left the crime stain” (Rose 2007b).
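The need to assess the evidence under both propositions can be illustrated with a minimal sketch. The function name and all numerical values below are hypothetical and chosen only for illustration; they are not drawn from any of the studies cited.

```python
def likelihood_ratio(p_e_given_hp: float, p_e_given_hd: float) -> float:
    """Return p(E|Hp) / p(E|Hd), the strength of the evidence.

    Hp and Hd stand for the prosecution and defence propositions
    (labels assumed here for illustration).
    """
    if p_e_given_hp < 0 or p_e_given_hd <= 0:
        raise ValueError("p(E|Hp) must be >= 0 and p(E|Hd) must be > 0")
    return p_e_given_hp / p_e_given_hd


# Hypothetical case 1: the shared features are just as probable
# if the samples come from different speakers.
lr_common = likelihood_ratio(0.8, 0.8)

# Hypothetical case 2: same p(E|Hp), but the features are rare
# in the relevant population.
lr_rare = likelihood_ratio(0.8, 0.008)

print(f"common features: LR = {lr_common:.1f}")
print(f"rare features:   LR = {lr_rare:.1f}")
```

In both cases the probability of the evidence under the prosecution proposition is identical, yet the evidence is uninformative in the first case and offers considerable support for the prosecution proposition in the second. A conclusion based on one proposition alone cannot distinguish between them.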
2.1.3 UK Position Statement (UKPS)
To address these issues, French and Harrison (2007) present an alternative model for evaluating FVC evidence, now often referred to as the UK Position Statement (UKPS).
UKPS is the result of debate within a sub-section of the FSS community (French 2005; French and Harrison 2006) regarding the appropriateness of classical probability scales,
which until 2007 had been the dominant framework for expressing conclusions in UK casework.
Figure 2.1: Flow chart of the UK Position Statement framework for FVC evidence (from Rose and Morrison 2009: 143)
UKPS consists of a two-stage evaluation (Figure 2.1). The first stage requires an
assessment of the similarity between the suspect and offender samples, termed the consistency judgement. It allows experts to reach one of three mutually exclusive
conclusions: consistent, not consistent or no decision. According to French and Harrison (2007), a not consistent verdict should be preferred unless the differences
between the samples can be explained by “established models of acoustic, phonetic or linguistic variation” (p. 141). If the two samples are judged to be consistent, the
expert moves to the second stage, termed the distinctiveness judgement. This is an assessment of the typicality of the shared features across the samples within the wider
population since, as Nolan (2001) states, strength of evidence is dependent on “whether the values found matching . . . are vanishingly rare, or sporadic, or near universal in the
general (relevant) population” (p. 16). Distinctiveness is classified using the following five-point scale:
5. Exceptionally distinctive - the possibility of this combination of features being shared by other speakers is considered to be remote
4. Highly distinctive
3. Moderately distinctive
2. Distinctive
1. Not distinctive
from French and Harrison (2007: 141)
Distinctiveness is, for the majority of variables, assessed qualitatively. That is, while the
analysis of the samples may involve quantification of acoustic variables, their typicality is assessed based on the expert’s knowledge and professional experience, or with
reference to published studies of sociolinguistic variation. When applying the UKPS, the “general (relevant) population” (Nolan 2001: 16) used to assess distinctiveness is
defined according to the regional and social groups to which the expert believes the offender belongs.
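The serial ordering of the two stages described above can be sketched as follows. This is only an illustration of the decision flow: the names `Consistency`, `DISTINCTIVENESS` and `ukps_conclusion`, and the wording of the returned string, are hypothetical, and both judgements are treated as inputs supplied by the expert rather than computed.

```python
from enum import Enum
from typing import Optional


class Consistency(Enum):
    CONSISTENT = "consistent"
    NOT_CONSISTENT = "not consistent"
    NO_DECISION = "no decision"


# Five-point distinctiveness scale (French and Harrison 2007: 141).
DISTINCTIVENESS = {
    5: "exceptionally distinctive",
    4: "highly distinctive",
    3: "moderately distinctive",
    2: "distinctive",
    1: "not distinctive",
}


def ukps_conclusion(consistency: Consistency,
                    distinctiveness: Optional[int] = None) -> str:
    # Stage 1: consistency judgement. Any outcome other than
    # "consistent" ends the evaluation here.
    if consistency is not Consistency.CONSISTENT:
        return consistency.value
    # Stage 2: distinctiveness is only assessed for consistent samples.
    if distinctiveness not in DISTINCTIVENESS:
        raise ValueError("a consistent judgement needs a rating from 1 to 5")
    return f"consistent, {DISTINCTIVENESS[distinctiveness]}"


print(ukps_conclusion(Consistency.CONSISTENT, 4))
print(ukps_conclusion(Consistency.NOT_CONSISTENT))
```

Note that any distinctiveness rating passed alongside a not consistent or no decision judgement is simply ignored, which reflects the fact that the second stage is never reached in those cases.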
UKPS has been signed by 25 forensic practitioners and interested academics. According to Gold and French (2011), UKPS is currently employed by 11 (32%) of the 34
practitioners surveyed and has largely replaced classical probability scales in the UK. With the exception of one expert, the combined auditory and acoustic approach is the
2.1.3.1 Limitations of UKPS
Although UKPS represents a shift away from posterior probability, there remain logical
shortcomings with the approach, as raised in Rose and Morrison’s (2009) response to French and Harrison (2007). Firstly, consistency and distinctiveness are analysed
on different scales, meaning that it is difficult to interpret the relative similarity and typicality of the suspect and offender samples. Secondly, the scales are categorical with
a finite number of potential outcomes and are serially ordered such that distinctiveness is only assessed if the samples are judged to be consistent with each other (issues with
similar two-stage approaches are discussed in Evett 1991: 10-11). This is problematic since it prohibits the gradient assessment of the strength of the evidence under the two
competing propositions in all cases. Thirdly, the categorical, binary outcome of the consistency judgement introduces cliff-edge effects into the analysis. A not consistent
judgement is also equivalent to an assessment of the propositions given the evidence (i.e. the samples contain the voices of different speakers). Finally, Rose and Morrison (2009)
state that it is not clear how the analysis of multiple variables should be combined using UKPS.
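The cliff-edge effect can be illustrated by analogy with a hard threshold applied to a continuous score. This is a deliberately simplified sketch: under UKPS the consistency judgement is a qualitative expert decision, so the numerical similarity score, the threshold of 0.5 and the function name are all hypothetical.

```python
def categorical_consistency(similarity: float, threshold: float = 0.5) -> str:
    """Map a hypothetical similarity score in [0, 1] to a categorical outcome."""
    return "consistent" if similarity >= threshold else "not consistent"


# Two hypothetical comparisons that differ only marginally in similarity.
just_above = categorical_consistency(0.501)
just_below = categorical_consistency(0.499)

print(just_above)
print(just_below)
```

Two almost identical comparisons fall on opposite sides of the boundary and receive opposite conclusions, whereas a gradient measure of strength of evidence would change only slightly between them.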
However, the overarching criticism of UKPS in Rose and Morrison (2009) is that it falls short of either a conceptual or numerical implementation of the Bayesian LR
(discussion of French et al.’s 2010 rejoinder to Rose and Morrison 2009 is at §2.2.4).