2.2.1 Hearing aids
To answer the question of how well hearing-impaired listeners perceive speech, the primary tool available is known as speech audiometry. Broadly speaking, this consists of measurement of an individual’s ability to recognise and understand excerpts of speech, as a means of evaluating their hearing. For at least seventy years, variants of speech-based audiometry have been incorporated within screening and fitting processes for HAs, beginning with simple aided vs. unaided word-recognition tests (Carhart, 1946). In 2005, Kirkwood found that 92% of audiologists reported using some form of speech audiometry. In short, accurate perception of speech has long been both a primary goal and an important method of assessment for HAs.
The benefit of HAs for speech recognition accuracy, relative to unaided listening, is highly variable, depending on the duration and extent of hearing loss, individuals’ cognitive abilities, and the specifics of the HA used. When listening in a quiet setting, it is possible for HA users, like NH listeners, to achieve speech recognition performance of 90-100% (Metselaar et al., 2008). For each individual, however, the overall improvement in speech recognition, aided vs. unaided, may vary from approximately 3% to 20%, with poorer unaided performance typically predicting greater improvement (Metselaar et al., 2008).
Despite this, many HA recipients report imperfect speech understanding (especially in noisy environments), and in fact cite this as a reason for rarely using their devices (Kochkin, 2000; Lupsakko, Kautiainen, and Sulkava, 2005). A more recent survey has suggested that end-user satisfaction is increasing slowly over time, although listening in noisy environments continues to pose a significant challenge (Kochkin, 2010).
However, as modern HA technology continually improves, it is conceivable that speech recognition with HAs could achieve parity with normal hearing, even in more challenging listening scenarios. Indeed, in two recent independent trials, Powers and Fröhlich (2014) reported that HA users with mild-to-moderate loss, equipped with a novel binaural beamforming (spatial filtering) technique, were able to outperform NH control subjects in a speech-recognition-in-noise task, by 2.1-2.9 dB SNR. Similarly, the emerging use of deep neural networks (DNNs) may offer a solution to the so-called ‘cocktail party problem’, by allowing HAs to effectively isolate relevant speech sounds from various kinds of competing signals and/or background noise (Wang, 2017). Because the HA ‘learns’ how to accomplish this feat, rather than implementing a specific, predefined signal processing algorithm, it is able to perform well even when faced with background noises that it has never encountered before, thereby achieving a much more ‘human-like’ level of performance (Wang, 2017).
Lastly, consideration should be given to the non-linguistic elements present in speech, often collectively referred to as ‘prosody’. Prosodic elements of speech may carry important information about an individual’s tone of voice, or emotional state, but are characterised by relatively subtle variations in the acoustic signal, and can therefore present difficulty for HA users (Most & Aviner, 2009). Of consequence here, for example, is that HA amplification occurs over a limited range of frequencies (at most approximately 100 Hz to 10 kHz), relative to the range that is audible to the human ear (Jessen, 2014). Moreover, hearing loss is rarely uniform across the frequency spectrum: individuals often have relatively good residual hearing at lower frequencies and very little at higher frequencies, i.e. ‘useable’ hearing is restricted to a reduced bandwidth (Davis, 2004). For this reason, frequency compression is often used, so that a greater proportion of the sound spectrum falls within the useable bandwidth. However, since frequency information is not preserved verbatim, the process risks introducing distortion, particularly in the higher-frequency harmonics of a signal, which may be perceived as dissonance (Uys et al., 2012).
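The effect of frequency compression on harmonic structure can be illustrated with a minimal sketch. It assumes one simple formulation of non-linear frequency compression, in which frequencies below a cutoff are left unchanged and frequencies above it are compressed on a logarithmic scale. The function names, the 2 kHz cutoff and the 2:1 compression ratio are illustrative assumptions, not parameters taken from any of the cited studies or from a particular device.

```python
def compress_frequency(f_in_hz, cutoff_hz=2000.0, ratio=2.0):
    """Map an input frequency (Hz) to its output frequency (Hz).

    Assumed scheme: frequencies at or below the cutoff pass through
    unchanged; above the cutoff, the log-frequency distance from the
    cutoff is divided by the compression ratio.
    """
    if f_in_hz <= cutoff_hz:
        return f_in_hz
    return cutoff_hz * (f_in_hz / cutoff_hz) ** (1.0 / ratio)


def harmonic_ratios(f0_hz, n_harmonics, cutoff_hz=2000.0, ratio=2.0):
    """Return each compressed harmonic as a multiple of the compressed fundamental."""
    out_f0 = compress_frequency(f0_hz, cutoff_hz, ratio)
    return [
        compress_frequency(k * f0_hz, cutoff_hz, ratio) / out_f0
        for k in range(1, n_harmonics + 1)
    ]


if __name__ == "__main__":
    # Harmonics of a 500 Hz tone: those above the cutoff stop being
    # integer multiples of the fundamental after compression.
    for k, r in enumerate(harmonic_ratios(500.0, 8), start=1):
        print(f"harmonic {k}: {r:.3f} x fundamental")
```

Under these assumed settings, the first four harmonics of a 500 Hz tone are unchanged, whereas the fifth (2500 Hz) is mapped to roughly 2236 Hz, about 4.47 times the fundamental. The upper partials are therefore no longer harmonically related to the fundamental, which is one way in which the distortion described above could arise.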
Therefore, even if an HA user is able to recognise speech with one hundred per cent accuracy, they might still miss out on or misinterpret essential non-linguistic information, which could alter the meaning of a given utterance (e.g. in the case of irony). For example, research has documented impairment for HA users, relative to NH participants, in tasks involving the detection of a talker’s emotional state (Most & Aviner, 2009), or whether their intent is sarcastic or sincere (Stiles, 2013).
In summary, HA users are typically impaired to some extent in speech recognition, especially when listening in a noisy environment. With this said, HA technology is constantly improving, meaning that the best performance attainable is becoming closer to that of a normal-hearing listener, even when listening to speech in noise.
However, while the content of speech is usually its most important attribute, there are also relevant, non-linguistic aspects of speech that have received relatively little attention, and might not be restored as adequately by the HA (Schmidt, Herzog, Scharenborg, & Janse, 2016). Therefore, even for those HA users scoring within the top percentiles for speech recognition with or without noise, it should not be assumed that perception of speech has been restored to a ‘normal-hearing’ level.
2.2.2 Cochlear implants
As with HAs, in recent years there has been tremendous progress made to improve the perception of speech by CI users. In 1977, Bilger, Black, and Hopkinson, investigating the efficacy of the first generation of single-channel CIs to be fitted in the United States, reported that these prostheses were insufficient to facilitate speech understanding, but that they did appear to convey some supplementary auditory information that led to significantly improved lip-reading scores. By 1995, a National Institutes of Health (NIH) consensus statement on CIs proclaimed that most postlingually-deafened recipients of modern CIs would be expected to score 80% or above when tested for understanding of high-context sentences, presented in quiet and with no accompanying visual information (Wilson, 2004). In fact, speech perception with the CI is now sufficient for many users to understand and conduct telephone conversations (Cray et al., 2004; Helms et al., 2001), that is, to recognise speech using only auditory cues, without lip-reading.
Thirteen years after the 1995 NIH consensus, Gifford et al. (2008) presented data showing that over 25% of users achieved perfect scores on standard sentence batteries measuring speech recognition in quiet. The authors further stated that more difficult speech testing materials must be developed, because the prevalence of patients scoring 90-100% made it difficult to meaningfully track individuals’ progress with the CI. This report was considered a major milestone in the development of CI technology to facilitate speech perception (Wilson & Dorman, 2008).
With this being said, just as with HAs, performance in speech recognition with the CI can be highly heterogeneous, varying according to myriad factors, including: age at implantation (Blamey et al., 1996; Geier, Barker, Fisher, and Opie, 1999; Shipp and Nedzelski, 1995), duration of deafness prior to implantation (Blamey et al., 1996; Geier et al., 1999; Gantz, Woodworth, Knutson, Abbas, and Tyler, 1993; Rubinstein, Parkinson, Tyler, and Gantz, 1999), degree of residual hearing (Gantz et al., 1993; Rubinstein et al., 1999), duration of implant use (Blamey et al., 1996), and cognitive abilities (Gantz et al., 1993).
However, speech perception is much more difficult for CI users when it takes place in a noisy environment, for example in a restaurant or on a busy street. Several studies have documented drastically poorer speech reception thresholds for CI users listening in noise, as opposed to quiet. For example, Tobey, Shin, Prashant, and Geers (2011) found that presenting sentences with accompanying multi-talker babble (i.e. competing, usually unintelligible, speech from multiple voices) significantly reduced the speech intelligibility scores of adolescent CI users.
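As a generic illustration of how a speech-in-noise stimulus can be constructed, the sketch below scales a masker so that the speech-to-noise level difference equals a target signal-to-noise ratio (SNR) in dB, and then adds the two signals. This is not the procedure of any study cited here: the function names are hypothetical, a sine wave stands in for the speech signal, and Gaussian noise stands in for multi-talker babble.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that the speech-to-noise RMS ratio equals `snr_db`,
    then add it to `speech`. Returns (mixture, scaled_noise)."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        raise ValueError("noise must not be silent")
    # SNR (dB) = 20 * log10(rms_speech / rms_noise), solved for the noise gain.
    gain = rms(speech) / (noise_rms * 10.0 ** (snr_db / 20.0))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the example is repeatable
    speech = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]  # placeholder "speech"
    noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # placeholder "babble"
    mixture, scaled = mix_at_snr(speech, noise, snr_db=5.0)
    achieved = 20 * math.log10(rms(speech) / rms(scaled))
    print(f"achieved SNR: {achieved:.2f} dB")
```

Lower (or negative) values of `snr_db` give a more intense masker relative to the speech, corresponding to the more adverse listening conditions described above.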
CI users also tend to be somewhat impaired in the perception of prosodic elements of speech: an impairment that appears to be relatively independent of performance in speech recognition (Chin, Bergeson, & Phan, 2012). Nakata, Trehub, and Kanda (2012) found that children who used CIs performed significantly worse than NH controls in both perception and production tasks related to speech prosody. Similarly, in a study that included CI and HA users, Kalathottukaren, Purdy, and Ballard (2017) found that both groups achieved significantly lower scores, compared to their NH peers, on two standardised batteries designed to assess perception of speech prosody (Nowicki and Duke, 1994; Peppé and McCann, 2003). In another study involving both HA and CI users, Most and Peled (2007) observed that the latter group performed significantly worse on tasks involving the identification of stress and intonation patterns in spoken sentences. Lastly, at least two studies have shown that CI users have some level of difficulty in classifying utterances as statements or questions. Peng, Tomblin, and Turner (2008) found that children with CIs scored an average of 70% correct in a question/statement discrimination paradigm, compared to 97% for NH controls. Likewise, in a similar experiment with adults, Green, Faulkner, Rosen, and Macherey (2005) reported an average of 69% correct for CI users.
To summarise, in recent years, enormous progress has been made towards improving the quality of speech perception of CI users. Because of this, it is now possible for some users to perform relatively well in speech recognition tests, even without lip-reading, and even in the presence of background noise. Unfortunately, in the majority of cases, speech perception performance appears to be significantly worse with the CI than with the HA. This appears to be compounded further by a greater difficulty in perceiving the paralinguistic auditory attributes of speech that facilitate recognition of the talker’s tone of voice or emotional state, for example. Just as with HAs, emerging technological advances paint an optimistic picture of the CI landscape for the near future, with automatic auditory scene classification being implemented in order to improve the perception of speech in noise (Mauger, Warren, Knight, Goorevich, & Nel, 2014), and researchers exploring the feasibility of light-based stimulation techniques, intended to circumvent the problems of limited frequency resolution and electrical current spread, which are inherent in current CI technology (Johannsmeier et al., 2017; Parveen, 2017).