3.2.1 Hearing aids
Although hearing aid digital signal processing (DSP) is generally very successful in facilitating accurate speech recognition, it may be less helpful for making judgements of emotion from speech. Several studies have documented poorer emotion recognition by individuals with hearing loss listening with hearing aids, compared to age-matched normal-hearing controls (Most, Weisel, and Zaychik, 1993; Most and Aviner, 2009; Rigo and Lieberman, 1989). In a recent study addressing this phenomenon, Goy, Pichora-Fuller, Singh, and Russo (2016) found that listeners were, unsurprisingly, significantly better at word recognition when listening with their hearing aids than when unaided; however, there was no such benefit for the recognition of emotions. That is, hearing aids were neither detrimental nor conducive to accurate emotion perception. Additionally, past research has shown that, in contrast to normal-hearing subjects, hearing aid users do not show improvement in emotion recognition when presented with audio and visual information, as opposed to purely visual information (Most et al., 1993; Rigo and Lieberman, 1989). This implies that either: A) the HA does not provide any helpful auditory cues to emotion once visual cues have been considered (or perhaps provides unhelpful or conflicting cues), or B) HA users assumed that the auditory cues to emotion would not be useful, and therefore focussed solely on the visual information.
Recently, however, Schmidt et al. (2016) pointed out that several of the above studies specifically investigated emotion perception by children with HAs, and that the results might be less applicable to adults. The principal reason for this is that adults may have acquired hearing impairment later in life, and therefore have had the benefit of normal hearing when learning about the relationships between different emotional states and configurations of acoustic features (Schmidt et al., 2016). Of the studies carried out with adults, results have been less consistent than for children.
For example, Rigo and Lieberman (1989) found that low-frequency hearing losses in particular were associated with greater difficulty in emotion perception tasks.
Conversely, Orbelo, Grim, Talbott, and Ross (2005) found that, for elderly subjects, extent of age-related hearing loss was not a significant predictor of performance in the perception of emotional speech. Additionally, with respect to perception of arousal and valence in emotional speech, Schmidt et al. (2016) found that HA users did not differ significantly from NH listeners, although HA users’ arousal ratings reflected a slightly greater sensitivity to small differences in intensity.
Specifically considering musical emotion, the difference between HA users and NH listeners appears to be somewhat stronger, although the number of studies is too small to be certain. In a pre-validated five-alternative forced-choice paradigm (anger, fear, happiness, sadness, tenderness), Russo and Fanelli (2016) documented significantly worse emotion recognition accuracy for HA users, relative to an NH control group. Interestingly, however, performance did not differ significantly between HA users and non-aided HI listeners (Russo & Fanelli, 2016). Therefore, with respect to this paradigm, there is clearly improvement to be made for HI listeners, and this is not currently being achieved by the use of HAs. The authors note that this need not necessarily denote a failure of the HA per se, but might instead be a consequence of DSP geared primarily towards speech perception.
In summary, emotion perception by HA users appears to be impaired relative to NH listeners, but the exact extent of this deficit is unclear, and is complicated by the fact that studies have included both children and adults, leading to inconsistent results.
Since at least a handful of studies have shown no (or very little) significant difference between HA and NH listeners, it is probable that any differences are relatively small.
There has been less research into emotion perception in music, but the available evidence suggests a more apparent deficit here. This is most likely because HAs are not always well-optimised to deal with music as an input.
3.2.2 Cochlear implants
CI users typically perform relatively poorly on tasks requiring sensitivity to pitch-dominant features of speech (Chatterjee et al., 2015) and music (Kong, Mullangi, Marozeau, and Epstein, 2011; Tao et al., 2015), including vocal and musical expression of emotion (Luo, Fu, and Galvin, 2007; Nakata et al., 2012; Volkova et al., 2013). As with HA users, emotion perception via other modalities is preserved (Hopyan-Misakyan, Gordon, Dennis, & Papsin, 2009), but appears not to be enhanced by simultaneous auditory input (Most & Aviner, 2009). In most cases, however, CI users are able to identify basic emotions in speech at a level significantly greater than chance, particularly where exaggerated acoustic cues are used (Chatterjee et al., 2015) or response options are limited, e.g. to a binary happy or sad judgement (Volkova et al., 2013). In addition, the ability of both CI users and NH participants listening with CI simulation to discriminate between different talkers suggests some residual sensitivity to speaker-specific phonetic detail (van Heugten, Volkova, Trehub, & Schellenberg, 2014). In music also, CI users are able to perceive emotion at a level greater than chance. In fact, research has reported recognition accuracy as high as 87.5% correct when using binary happy/sad judgement paradigms (Hopyan, Gordon, and Papsin, 2011; Hopyan, Manno III, Papsin, and Gordon, 2015); for comparison, emotion recognition accuracy has been estimated at 84% (House, 1994) to 90% correct (Luo et al., 2007) in NH listeners. In another recent experiment, Ambert-Dahan, Giraud, Sterkers, and Samson (2015) demonstrated CI users' above-chance performance in emotion discrimination of short musical excerpts, using a forced choice of fear, happiness, peacefulness or sadness, in addition to generic arousal and valence ratings.
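Because "above chance" means different things in a binary, a four-alternative and a five-alternative paradigm, the comparison can be made concrete with a short calculation. The following is a minimal sketch, not the analysis used in any of the cited studies: it assumes equally likely response options and independent trials, and the trial counts and scores are hypothetical values chosen purely for illustration.

```python
from math import comb


def chance_level(n_alternatives: int) -> float:
    """Expected proportion correct under pure guessing, assuming
    all response options are equally likely."""
    return 1.0 / n_alternatives


def p_at_least(k: int, n: int, p: float) -> float:
    """One-sided exact binomial tail: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical listener (illustrative numbers only): 28 correct out of 40 trials.
n_trials, n_correct = 40, 28

for label, n_alt in [("binary happy/sad", 2), ("five-alternative", 5)]:
    guess_rate = chance_level(n_alt)
    p_value = p_at_least(n_correct, n_trials, guess_rate)
    print(f"{label}: chance = {guess_rate:.2f}, "
          f"P(>= {n_correct}/{n_trials} by guessing) = {p_value:.3g}")
```

Under these assumptions, the same raw score is far stronger evidence of genuine emotion recognition in a five-alternative task than in a binary one, which is worth bearing in mind when comparing accuracy figures across paradigms.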
It is possible that this performance might arise as a result of compensatory attention to relatively preserved psychoacoustic features. For example, Shannon, Zeng, Kamath, Wygonski, and Ekelid (1995) demonstrated that, when spectral information is artificially attenuated, normally-hearing listeners can decode speech accurately by attending primarily to its temporal features. Recently, Tao et al. (2015) suggested that CI users may utilise similar compensatory strategies to achieve adequate performance in lexical tone perception, and Meng, Zheng, and Li (2016) showed that tone recognition in Mandarin might be improved by the use of an algorithm to artificially represent f0 contour information as loudness variation. Additionally, research suggests that both adult CI users and NH controls listening to CI-simulated speech and music similarly shift their attention away from pitch-based features and towards relatively preserved acoustic features, such as intensity and timing/rate-based cues (Peng, Lu, and Chatterjee, 2009; Peng, Chatterjee, and Lu, 2012). Giannantonio, Polonenko, Papsin, Paludetti, and Gordon (2015) showed that children using CIs tend to rely more heavily on temporal rather than mode-based cues when decoding emotion in music, though the tendency to focus on mode was more prevalent in participants to whom residual acoustic information was available (i.e. via a contralateral hearing aid). NH participants listening via NBV-based CI simulation displayed a similar reliance on temporal cues, regardless of inter-individual differences in musical training. Likewise, Caldwell, Rankin, Jiradejvong, Carver, and Limb (2015) found that adult CI users tended to base judgements of musical emotion on tempo rather than mode, significantly more so than a group of NH controls.
The results described above accord well with the observations that, in both speech and music, temporal and intensity-related acoustic cues are relatively well preserved in CI users (Volkova, Trehub, Schellenberg, Papsin, and Gordon, 2014; Hopyan, Peretz, Chan, Papsin, and Gordon, 2012; Shannon, 1989; Shannon, 1992), since these features are relatively well delivered by electrical stimulation (Cooper, Tobey, and Loizou, 2008; Drennan and Rubinstein, 2008). Encouragingly, these results suggest that accurate auditory perception of emotion (or rather, accuracy comparable to NH listeners) may be an achievable target for CI users. To this end, recent research has suggested that children implanted at a very early age and receiving intensive rehabilitation may be able to reach levels of emotion detection performance equivalent to those observed in NH controls (Mildner & Koska, 2014). Once again, the researchers speculated that the two groups employed different recognition strategies, because their confusion matrices differed, indicating diverging patterns of errors. Thus, by adopting divergent listening strategies, CI users may have the potential to perceive emotion auditorily at a level comparable to the NH population.
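The reasoning from confusion matrices to listening strategies can be illustrated with a toy example. The sketch below is not the analysis of Mildner and Koska (2014); the stimulus labels and both response sets are invented solely to show how two listeners can reach the same overall accuracy while making different kinds of errors.

```python
from collections import Counter


def accuracy(targets, responses):
    """Proportion of trials on which the response matches the target emotion."""
    return sum(t == r for t, r in zip(targets, responses)) / len(targets)


def error_pattern(targets, responses):
    """Off-diagonal cells of the confusion matrix:
    counts of (target, response) pairs where the response was wrong."""
    counts = Counter(zip(targets, responses))
    return {pair: n for pair, n in counts.items() if pair[0] != pair[1]}


# Invented data for illustration only: eight trials, four emotion categories.
targets = ["happy", "happy", "sad", "sad", "angry", "angry", "fear", "fear"]
# Hypothetical listener A confuses happy and angry (similar arousal).
listener_a = ["happy", "angry", "sad", "sad", "angry", "happy", "fear", "fear"]
# Hypothetical listener B confuses sad and fear (similar valence).
listener_b = ["happy", "happy", "sad", "fear", "angry", "angry", "fear", "sad"]

for name, responses in [("A", listener_a), ("B", listener_b)]:
    print(name, accuracy(targets, responses), error_pattern(targets, responses))
```

Both hypothetical listeners score 75% correct, yet their errors fall in different cells of the confusion matrix. It is this kind of divergence, rather than overall accuracy, that would point towards different underlying cue use.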
In summary, emotion perception is, on the whole, noticeably impaired in CI users.
Relative to both NH listeners and HA users, CI users tend to decode emotional expression in both speech and music with reduced accuracy. The level of performance that is obtained by CI users, which is typically above chance level, is likely to be achieved by a different underlying listening 'strategy', in which relatively preserved components of the auditory signal are preferentially attended to. This is encouraging in terms of clinical rehabilitation, and suggests that a level of performance comparable to NH listeners might be achievable. However, it remains to be seen exactly how this performance might be realised, and whether this could apply in more challenging emotion recognition paradigms.
Addressing some of the gaps in the literature reviewed thus far, Chapter 4 introduces the first two empirical experiments: behavioural listening studies examining the perception of emotion in both speech and music by CI-simulated listeners. Specifically, these experiments aimed to elucidate the listening strategies underpinning above-chance performance by CI users in emotion perception tasks. In the next chapter, the rationale for these studies is outlined in more depth, the experimental paradigm (including the use of CI simulation) is described in detail, and the results from both studies are discussed.