

Fortunately, research concerning the transmission of emotion via the auditory domain has typically placed primary emphasis on speech prosody and music, due to their prevalence in social situations and ability to convey emotion effectively and quickly (Coutinho and Dibben, 2013; Bhatara, Laukka, and Levitin, 2014). Communication via this modality is an important aspect of social interaction (Ekman, 1992) and, in most individuals, develops at a young age. Indeed, five-month-old infants demonstrate rudimentary sensitivity to vocal expression of emotion (Fernald, 1993).

In normally hearing children, vocal emotion recognition is typically well developed by five years old (Sauter, Panattoni, & Happé, 2013), while musical emotion recognition tends to emerge slightly later, by around six to seven years, as children develop awareness of culture-specific musical structures, e.g. Western major and minor tonality (Trainor & Corrigall, 2010).

In both speech and music, emotion is communicated through subtle fluctuations in the auditory signal, characterised as variations in different psychoacoustic attributes, that are associated with the affective state of the speaker or performer. In speech and vocal music, for example, emotion-specific acoustic variance is produced by physiological changes in respiration (the flow of pulmonic air), phonation (modification of the air stream by the larynx) and articulation (modification of the air stream by the lips, teeth, tongue etc.) that accompany one’s emotional state (Scherer, 1986; Scherer, 2003). Indeed, using electroglottography (EGG), Johnstone and Scherer (1999) showed that distinct emotional states are associated with different patterns of vibratory vocal fold activity. Therefore, at least to some extent, one’s emotional state may be disclosed by so-called ‘gestural primitives’ associated with the production of sound, i.e. the articulatory mechanisms underlying variations in the acoustic signal (Rosenblum, 2004). In a Gibsonian sense, this means that characteristic acoustic feature configurations associated with different emotional states may arise directly from the physiological bases of these states (Gibson, 1966). As evidence that listeners are sensitive to such information, Ladefoged and McKinney (1963) showed that loudness judgements were predicted more accurately by using an index of vocal effort (subglottal pressure multiplied by air velocity across the glottis) than by using the actual intensity of the acoustic signal. In the case of instrumental music, a performer’s affective state may instead be disclosed via their interaction with their instrument. For example, Gabrielsson and Juslin (1996) found that listeners were sensitive to small modulations in several musical parameters (e.g. tempo, dynamics), which were used to express emotion for a range of different instruments.

It should of course be noted that emotions can also be expressed in the absence of genuine underlying affective states (and their associated physiological indices). This is evident in the case of acted speech (which, in fact, makes up many databases of emotional speech stimuli, e.g. Burkhardt, Paeschke, Rolfes, Sendlmeier, and Weiss (2005)), and is especially important in the performance of music (Lindström et al., 2003). In both scenarios, one can express an emotional state simply by knowing how this state would usually affect an individual’s speech or musical performance, and mimicking these effects. However, there may be individual differences regarding the extent to which emotion in music is ‘embodied’ by the musician, as opposed to merely ‘performed’ (Van Zijl & Sloboda, 2011). Of course, unlike speech, music can also express emotion via higher-level compositional cues (e.g. tonality), which are essentially unrelated to the physical manifestations of emotional states. To maintain comparison with speech, however, this thesis is primarily concerned with performance-based cues.

It is argued that associations between patterns of modulations in these acoustic parameters and distinct emotional states may constitute a universal in human communication (Scherer, Banse, and Wallbott, 2001; Thompson and Balkwill, 2006). Indeed, evidence has shown that different acoustic configurations are employed to communicate specific emotions during speech, irrespective of syntactic or semantic context, and that these configurations are very similar across different individuals (Cosmides, 1983). However, more recent research has acknowledged that, at least in the case of music, there may also be cross-cultural differences in terms of the ‘weighting’ given to specific acoustic parameters when identifying an emotion (Lennie, 2017).

Importantly, in any case, research shows that the acoustic profiles characterising distinct emotions in speech and music reliably lead to perceptual bias towards the relevant emotion in recognition tasks (e.g. Mozziconacci and Hermes, 1999). In other words, distinctive sets of acoustic attributes used by talkers or musicians to convey specific emotions are ‘recognised’ by listeners, and are predictably associated with perception of the corresponding emotions. As such, it is possible for computer models to mimic human performance relatively closely in such emotion discrimination tasks, by utilising statistical pattern recognition to extract relevant prosodic features from the acoustic signal, and learning to associate these with emotional states (Dellaert, Polzin, and Waibel, 1996; Petrushin, 1999).
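The general idea of learning to associate prosodic features with emotional states can be illustrated with a minimal sketch. It is not the method of any study cited here: it assumes a simple nearest-centroid rule, two hypothetical features (mean pitch and speech rate), and invented toy values that serve only to make the example runnable.

```python
from math import dist
from statistics import mean, pstdev

# Hypothetical toy data: (mean pitch in Hz, speech rate in syllables/s) -> label.
# The numbers are invented for illustration and are not taken from any study.
TRAINING = [
    ((260.0, 5.5), "fear"),
    ((250.0, 5.8), "fear"),
    ((150.0, 3.0), "sadness"),
    ((160.0, 3.2), "sadness"),
]


def fit(examples):
    """Return a feature-scaling function and one centroid per emotion label."""
    columns = list(zip(*(features for features, _ in examples)))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 when a feature has zero spread, to avoid dividing by zero.
    spreads = [pstdev(col) or 1.0 for col in columns]

    def scale(features):
        return tuple((x - m) / s for x, m, s in zip(features, means, spreads))

    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(scale(features))
    centroids = {
        label: tuple(mean(col) for col in zip(*rows))
        for label, rows in grouped.items()
    }
    return scale, centroids


def predict(features, scale, centroids):
    """Assign the label whose centroid is closest in scaled feature space."""
    point = scale(features)
    return min(centroids, key=lambda label: dist(point, centroids[label]))


scale, centroids = fit(TRAINING)
print(predict((255.0, 5.6), scale, centroids))
print(predict((155.0, 3.1), scale, centroids))
```

Features are standardised before distances are computed so that pitch, which spans a much larger numeric range than speech rate, does not dominate the decision. Published systems use richer feature sets and other classifiers; the sketch only shows the mapping from acoustic profile to emotion label.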

In speech, with respect to specific cues, Petrushin (1999, p. 4) noted that ‘All studies in the field point to the pitch (fundamental frequency) as the main vocal cue for emotion recognition’. This being said, in addition to frequency, emotional expression in speech is typically thought to comprise paralinguistic variance in intensity, temporal pattern and spectral content (Juslin and Laukka, 2001; Scherer, Banse, Wallbott, and Goldbeck, 1991). The last of these may be further subdivided taxonomically into timbre- and dissonance-based cues (Coutinho and Dibben, 2013; Gabrielsson and Lindström, 2010). Indeed, such variables as sharpness (the proportion of high frequency, relative to low frequency, content in a signal) and spectral centroid (the weighted mean of a signal’s constituent frequencies) demonstrably influence ratings of stimulus arousal level and valence (Coutinho & Dibben, 2013).
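The two spectral descriptors defined above can be made concrete with a short sketch. It assumes the signal is already described as a list of (frequency, amplitude) components. The function `high_frequency_share` and its `cutoff_hz` parameter are hypothetical simplifications introduced here as a crude stand-in for sharpness; they are not the psychoacoustic sharpness model used in the cited work.

```python
def spectral_centroid(components):
    """Amplitude-weighted mean frequency of (frequency_hz, amplitude) pairs."""
    total = sum(amplitude for _, amplitude in components)
    if total == 0:
        raise ValueError("signal has no energy")
    return sum(freq * amplitude for freq, amplitude in components) / total


def high_frequency_share(components, cutoff_hz):
    """Share of total amplitude above cutoff_hz (a crude proxy for sharpness)."""
    total = sum(amplitude for _, amplitude in components)
    if total == 0:
        raise ValueError("signal has no energy")
    high = sum(amplitude for freq, amplitude in components if freq > cutoff_hz)
    return high / total


# Two illustrative spectra with the same partials but opposite amplitude slopes.
dull = [(200.0, 1.0), (400.0, 0.5), (800.0, 0.25)]
bright = [(200.0, 0.25), (400.0, 0.5), (800.0, 1.0)]

print(spectral_centroid(dull), spectral_centroid(bright))
print(high_frequency_share(dull, 500.0), high_frequency_share(bright, 500.0))
```

Shifting energy towards the upper partials raises both measures, which is the sense in which a ‘brighter’ signal has a higher spectral centroid and greater sharpness.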

Importantly, each of the aforementioned cues may be differentially important for the decoding of emotional expression, depending upon the emotion to be identified (Banse & Scherer, 1996). For example, the expression of fear is most often associated with a higher-than-average mean pitch, whereas the expression of sadness may be characterised by a lower mean pitch and also a reduced speech rate.

As in speech, musical emotion is conveyed by variations in structural acoustic features (Juslin & Sloboda, 2010), which may be recognisable universally (Balkwill, Thompson, and Matsunaga, 2004; Fritz et al., 2009). Research has suggested that common acoustic features are implicated in the processing of prosodic structure in both speech and music (Fedorenko, Patel, Casasanto, Winawer, and Gibson, 2009; Falk, Rathcke, and Dalla Bella, 2014). Furthermore, specific emotional states, for example sadness, may sometimes be communicated by highly similar patterns of acoustic variation in both speech and music; i.e. not only are similar features utilised for the communication of emotion in the two domains, these features also appear to be utilised in similar ways (Curtis & Bharucha, 2010). As an example of this, anger is typically characterised by increased tempo or speech rate, greater intensity and greater high-frequency content in both speech and music (Juslin & Laukka, 2003). Similarly, increased intensity in both speech and music generally results in greater perceived valence and tension (Ilie & Thompson, 2006). However, Ilie and Thompson also documented some contradictory examples: increased pitch height appeared to be associated differentially with greater valence ratings in speech and lower valence ratings in music, while slow rate was associated with positive valence for speech but had no clear effect for music. To some extent, discrepancies such as these may reflect the fact that music (typically more so than speech) is considered a creative art form. Therefore, it may be that variation in emotional expression is a part of this art. That is, emotional expression may interact with and be influenced by more general artistic expression on the part of the performer, in a way that does not usually occur in speech. In fact, since this aspect of music tends to be ignored in experimental contexts, it is possible that the overlap between speech and music, in terms of the ways that acoustic cues are utilised for emotional expression, has been somewhat exaggerated.

In any case, as in speech, emotion in music appears to be principally communicated by variance in psychoacoustic cues that fall into four broad categories: frequency, intensity, tempo/rate and spectral composition (with higher-order musical features such as articulation being explainable in terms of these more basic features). Reflecting this, a recent computational modelling study investigating the acoustic features associated with emotion communication used overlapping input vectors for predicting emotion in both speech and music (Coutinho & Dibben, 2013). For both, information about loudness, tempo/rate, pitch contour, spectral centroid and sharpness were deemed to be important; in fact, the speech and music models only differed with respect to the contribution of cues related to spectral content, with roughness (associated with narrow harmonic intervals), for example, being relatively more important in speech than in music (Coutinho & Dibben, 2013).

In addition to the above, there is also evidence that training in music may benefit individuals’ ability to decode emotion in speech (Strait, Kraus, Skoe, and Ashley, 2009; Thompson, Schellenberg, and Husain, 2004), an effect that is demonstrable even in hearing-impaired populations (Good et al., 2017). That is, the acoustic commonalities shared by speech and music in the expression of emotion (i.e. Bregman’s (1994) ‘primitive’ processes) are substantial enough that training in emotion-decoding with one medium confers a benefit when tested with the other.