recursos didácticos - PROGRAMAS DE ESTUDIO 2011 GUÍA PARA EL MAESTRO

Firstly, LR-based analyses typically focus on a small number of acoustic variables

which yield continuous, quantitative data compared across the same word or phrase. The benefit of using the same word or phrase is that the material is directly comparable

across the evidential samples and within-speaker variability is minimised by normalising for phonological context and syntactic function. Underlying this approach is also an

assumption that a small subset of variables is able to accurately represent the properties of the suspect and offender voices. This sampling approach is claimed to be akin to the

proportion of the genome analysed in forensic DNA analysis (Morrison p.c.). However, this assumption is potentially insufficient given the number of variables available to the

analyst in componential linguistic-phonetic FVC (§1.1.3) which may have a substantial effect on the resulting strength of evidence.

Gold and Hughes (2014) state that the primary reason for the focus on continuous, acoustic data is that this is the only type of data which can be handled by current

LR formulae. Speech is complex for LR modelling since it consists of numerous variables which may be continuous or discrete, normally or non-normally distributed,

display different distributions within and between speakers, and contain multiple, highly correlated features. At present, there are no means of empirically computing LRs for

discrete data such as allophonic consonantal variation (although see Schwartz et al. 2011), frequencies of lexical items and the analysis of VQ and vocal settings. As

outlined in Gold and Hughes (2014), for the numerical LR to become a more realistic proposition in FVC, it is necessary for new LR models to be developed. Only a small

amount of work has considered these issues (Aitken and Gold 2013; Foulkes et al. 2013-2015; Nair et al. 2014; Neocleous et al. 2014).

Secondly, speech variables display complex patterns of structured between-speaker

variation (Rose 2002; Foulkes and Docherty 2006; French et al. 2010). Such variation is found as a function of factors such as regional background, socio-economic class,

age, and ethnicity, as well as the social networks and communities of practice in which a speaker participates. Within a single regional variety, different variables are

often stratified in different ways. For example, /uː/-fronting is a widespread change in progress in English, and is generally correlated with age. By contrast, another ongoing

change, /əʊ/-fronting, correlates with both age and sex, being led by young females (Haddican et al. 2013; Williams and Kerswill 1999). Across regional varieties, the social

stratification of variables may also differ. For instance, /əʊ/ and /eɪ/ carry considerable social conditioning in the north east of England, but much less in the south east

(Watt 2000, 2002).

In collecting development, test and reference data, it is important that samples match

the (relevant) facts of the case at trial; otherwise, the resulting strength of evidence may be misrepresentative. This is because a speaker will never produce the same

word or sentence in exactly the same way, even consecutively, meaning that p(E|Hp)

for speech evidence will never be one (unlike forensic DNA analysis). In current

LR-based research and casework, within-speaker variability is captured using non-contemporaneous samples from each speaker separated by some undefined period of

time. Non-contemporaneity encompasses multiple sources of structured and random within-speaker variability across samples. Results from Enzinger and Morrison (2012)

show that estimates of system validity and reliability are overoptimistic when using contemporaneous, compared with non-contemporaneous, samples. However, very little research

has empirically tested these issues with different variables commonly analysed in FVC (with the exception of Coe 2012).
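
The dependence of the numerator on within-speaker variability can be illustrated with a minimal sketch. It assumes a single continuous variable, a normal model for the suspect under Hp and a normal model for the relevant population under Hd. The function names and all numerical values (given in Hz) are hypothetical and are not drawn from any of the studies cited above; a casework system would estimate these distributions from reference data rather than fix them by hand.

```python
import math


def gaussian_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(offender_value: float, suspect_mean: float,
                     within_sd: float, population_mean: float,
                     population_sd: float) -> float:
    """Toy LR = p(E|Hp) / p(E|Hd) for one continuous variable.

    Simplifying assumption: Hp is a normal centred on the suspect mean with
    the within-speaker sd; Hd is a normal describing the whole population.
    """
    numerator = gaussian_pdf(offender_value, suspect_mean, within_sd)
    denominator = gaussian_pdf(offender_value, population_mean, population_sd)
    return numerator / denominator


# Hypothetical values in Hz, chosen only for illustration.
shared = dict(offender_value=505.0, suspect_mean=500.0,
              population_mean=550.0, population_sd=60.0)

# Narrow spread: stands in for contemporaneous samples.
lr_narrow = likelihood_ratio(within_sd=10.0, **shared)
# Wide spread: stands in for non-contemporaneous samples.
lr_wide = likelihood_ratio(within_sd=30.0, **shared)

print(f"LR with narrow within-speaker spread: {lr_narrow:.2f}")
print(f"LR with wide within-speaker spread:   {lr_wide:.2f}")
```

In this toy configuration the wider within-speaker spread, of the kind expected from non-contemporaneous samples, yields the smaller LR for the same offender value.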

Further, evidence from sociolinguistics and sociophonetics indicates that there are

numerous, complex sources of within-speaker variability which affect speech production. These include interlocutor, conversational topic and function, level of formality,

self-consciousness, physical setting, time of day, illness, fatigue and intoxication. In a

given FVC case, a large number of these factors are likely to be relevant: suspect and offender samples are typically recorded with different interlocutors who have a different

level of familiarity with the speaker, talking about different topics in different degrees of formality at different times of day. At present, no empirical research has

investigated the extent to which such factors affect LR output, or whether such factors have a greater effect on the resulting LRs than the use of non-contemporaneous

samples.

Finally, speech variables form highly correlated sub-systems due to a range of factors. The biological structure of the vocal apparatus means that variables such as vowel

formants are inherently interrelated within and between phonemes. Correlations are also determined by the linguistic system, such that in cases of (vowel) change there are

push-pull effects which are thought to ensure that sounds remain acoustically distinct. There is also evidence of linguistically arbitrary correlations due to social factors, for

example, speakers in Derby with TH-fronting (/θ ð/ → [f v]) also typically produce labial-r (/r/ → [ʋ w]) (Milroy 1996). Although methods for empirically combining LRs have been developed in ASR (logistic regression fusion), Gold and Hughes (2014) argue that it is an empirical question as to whether such methods capture the linguistic-phonetic

complexity of the correlations in the raw data.
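
Why correlation matters when combining LRs can be shown with a minimal sketch. It assumes two continuous variables that follow a bivariate normal distribution with the same standard deviation and the same correlation under both hypotheses. All names and values are hypothetical. The sketch compares the naive product of per-variable LRs, which treats the variables as independent, with the LR from the joint model; it does not implement logistic regression fusion.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bivariate_normal_pdf(x, mean, sd: float, rho: float) -> float:
    """Density of a bivariate normal with equal sds and correlation rho."""
    z1 = (x[0] - mean[0]) / sd
    z2 = (x[1] - mean[1]) / sd
    one_minus_r2 = 1.0 - rho * rho
    quad = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / one_minus_r2
    norm = 2.0 * math.pi * sd * sd * math.sqrt(one_minus_r2)
    return math.exp(-0.5 * quad) / norm


def naive_lr(evidence, suspect_mean, population_mean, sd: float) -> float:
    """Product of per-variable LRs, i.e. assuming independence."""
    lr = 1.0
    for e, s, p in zip(evidence, suspect_mean, population_mean):
        lr *= normal_pdf(e, s, sd) / normal_pdf(e, p, sd)
    return lr


def joint_lr(evidence, suspect_mean, population_mean,
             sd: float, rho: float) -> float:
    """LR from the joint model, which accounts for the correlation."""
    return (bivariate_normal_pdf(evidence, suspect_mean, sd, rho)
            / bivariate_normal_pdf(evidence, population_mean, sd, rho))


# Hypothetical, standardised values chosen only for illustration.
evidence = (0.0, 0.0)
suspect_mean = (0.0, 0.0)
population_mean = (2.0, 2.0)
sd, rho = 1.0, 0.8

naive = naive_lr(evidence, suspect_mean, population_mean, sd)
joint = joint_lr(evidence, suspect_mean, population_mean, sd, rho)
print(f"naive product of LRs: {naive:.2f}")
print(f"joint-model LR:       {joint:.2f}")
```

In this configuration the two variables carry partly redundant information, so the naive product is larger than the joint-model LR. With rho set to zero the two values coincide. The direction and size of the discrepancy depend on the configuration chosen.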

This thesis considers the implications of the complexity of speech evidence for two

specific issues in LR-based FVC: the definition of the relevant population and the collection of development, test and reference data.
