2.4.4. The semiotics of voice
Peirce (1931) believed there are three kinds of signs that help us interpret “reality” around us: icons, indices and symbols. These three signs are understood and defined by their relation to the objects they refer to. Here are Peirce’s definitions of the three terms as explained by Pharies (1985):
Icon: a sign that stands for something merely by resemblance. The qualities of the sign excite analogous sensations in the mind for which it is a likeness.
Index: a sign which refers to the object that it denotes by virtue of being really affected by that object. There is therefore generally a cause-and-effect relationship between the index sign and its object.
Symbol: a sign which stands for the object by virtue of a law or convention and thus exists only because there is an interpreter.
These definitions can perhaps be visualized better with the following table (Hirage, 2005, p. 30):
Sign      Defining conditions               Signification by virtue of
Icon      Similarity                        Its intrinsic characteristics
Index     Contiguity (causal or spatial)    Its existential relation to the object
Symbol    Conventionality                   Its relation to the interpretant

Table 5: Icon, index and symbol
According to these definitions, the linguistic sign is said to be a symbol. However, there are aspects of language where other possibilities appear, such as an iconic quality in onomatopoeias or an indexical quality in personal pronouns. As Hirage mentions (2005, p. 31): “In the real sign-object situations, a sign may involve all three types of Peircean signs-icon, index and symbol- with predominance of one over the others.”
The human voice, by its physical (embodied) quality, can be conceptualised more as an icon or an index of who we are or of what we are transmitting than as a symbol. If we apply this to the interpretation of emotions in voice, we find that if the voice is seen as an index of a given emotion, we interpret that the person is actually feeling that emotion, whereas if the voice is seen as an icon, we interpret that the person is acting as if he/she were emotional (but not really feeling it).
Another way of reinterpreting the idea of conceptual metaphors applied to voice is to see cases where spoken forms resemble the things they stand for as having an iconic function. In fact, Peirce classified metaphors as a subtype of icons. The term iconicity will be used here as defined by Mannheim (in Duranti, 2001, p. 102): “Iconicity is a relationship between a sign and its object in which the form of the sign recapitulates the object in some way.”
We can thus speak of an iconicity code for the voice. As Vaissière (2005, p. 251) defines it: “Iconicity involves ethology, and the development of more or less elaborate intonational “signifiés” from instinctive “signifiants” or signs that originally expressed uncontrolled primary emotion.”
Of course, Vaissière refers here to the iconicity of intonation. So does Bolinger (1983, p. 99) when he says “Intonation is iconic and grammatical uses are consonant with that iconism”. However, bearing in mind that intonation, as was previously explained, is a combination of pitch, loudness and tempo, we can infer that it is possible to speak of a prosodic iconicity code. Where does this fit in with Embodiment theories? We saw earlier that our body, by means of image schemata, guides our understanding and processing of the information around us, and that this applies to voice too. The iconicity found, for example, between the feeling of tension and its vocal expression through a tightening of the vocal folds is therefore also motivated by sensorimotor experience. We can thus claim that the prosodic iconicity code is iconic precisely because it is embodied. Iconicity and Embodiment go hand in hand. However, the iconic relationship is not always present in prosody. As Vaissière states (2005, p. 252): “Motivation dominates in the expression of emotion, while the expression of attitude is more conventionalized.”
This could be a reason why, when we hear a foreign language, we recognise emotions more easily than attitudes. Some universals found in the prosodic iconicity code (and more precisely in intonation) for emotions are (Vaissière, 2005, p. 251):
For excitement and/or tension: tension of vocal folds
For passive emotions or detachment: low F0 and slow rate
For agreeable emotions: melodicity
For disagreeable emotions: lack of melodicity
Similarly, Van Leeuwen (1997, p. 103) refers to intonation as “pitch movement” and states the following correlations with emotions:
Falling pitch can relax and soothe listeners, make them turn inwards and focus on their thoughts and feelings.
Wide pitch range expresses excitement, surprise and anger.
Narrow pitch range expresses boredom and misery.
The common thread for the iconic vocal expression of emotions would be that pitch range seems to be proportional to involvement. As Van Leeuwen comments (1997, p. 111): “The more the pitch increases, the more there is room for the expression of feelings and attitudes, the more it decreases, the more the expression of feelings and attitudes will be confined.”
Pitch can thus at times be an icon of our emotions, since there is a relation of similarity between the range of pitch and the degree of involvement. This can also be interpreted in terms of the attitudinal dimension of activity:
                                  +
low activity                      high activity
narrow pitch range                wide pitch range
lack of involvement               involvement

Figure 7: Pitch iconicity for involvement
The figure above refers to what Van Leeuwen (1999, p. 111) calls a question of “activation”:
The more the pitch rises, the more active and interactive the participants
involved in its production and perception will be. The more the pitch falls, the
more the participants will be deactivated, brought into some state of
non-activity, relaxation, contemplation.
The use we make of intonation is not only an iconic code for the attitudes or emotions of the speaker but also iconic for the involvement expected from the speaker.
Therefore it is possible to recognise emotions across languages by using a common iconicity code. However, Bolinger (1983, p. 100) states that the iconicity of intonation is not always implemented in the same way across languages. We will return to this point when interpreting emotions and attitudes in my perception test.
According to Peirce’s definition, it is easy to find a link between icons and indices: if an index is characterised by a cause-and-effect relation with the object, this often implies a resemblance with its object. As explained in Pharies (1985, p. 40), if wet streets are an indexical sign of rain because they are affected by the rain, we can at the same time say that wetness is an icon because it resembles the rain by its wet quality. It is precisely because we find a resemblance between what rain is and the wetness in the street that we can conclude that it has rained. The same happens with voice: in many cases where we speak of an indexical quality of voice features, we find an iconic basis between the sign and the object denoted.
We already saw in 2.2. that Abercrombie took up Peirce’s term index and applied it to voice, especially to those aspects of voice that reveal personal characteristics. Given the direct cause-and-effect relationship between age or sex, for example, and pitch, it is normal to say that pitch, when considered as a (permanent) voice quality element, is an index of age and gender, which are among the essential characteristics of our identity. Furthermore, indices in voice can also be found in the more temporary features, the voice dynamics aspects. Let’s imagine a day when we are stressed because we have too many things to do. It is quite likely that, when we speak to somebody during a moment of stress, we increase our speech rate in a conscious or even unconscious attempt to save time, and the listener will probably interpret this temporarily increased speech rate as an index of our stress. The previously mentioned example of tension being transmitted by a tightening of the vocal folds also involves an indexical relationship: on the one hand, it can be regarded as an icon, because the tension heard in the tightened vocal folds resembles the bodily feeling of tension (the contraction of our muscles); on the other hand, there is a cause-and-effect relationship between feeling tense and tensing the vocal folds more than necessary.
Therefore we see that although linguistic signs are generally symbolic, at the paralinguistic level of language there are more indexical and iconic relationships than purely arbitrary symbolic ones. We can therefore claim that there is a strong possibility that prosodic features based on iconic and indexical relationships will be universally interpreted, whereas symbolic uses of vocal features, being more arbitrary by nature, will be more language-dependent. This can perhaps also shed some light on why the perception of basic emotions is universal whereas the perception of many attitudes remains more culture-dependent.
Another way in which semiotics can help explain how information about our identity is transmitted is the one proposed by Wilson and Wharton (2006) in their account of how Sperber and Wilson’s (1986) Relevance Theory (RT) can be applied to prosody. Although RT is founded on a modular conception of language with respect to cognition (contrary to the holistic approach to language that CL and my dissertation support), some of its aspects are compatible with CL, and these are precisely the ones taken up here for my own objectives. RT argues that we constantly react to encoded messages by interpreting only the information that is relevant to that particular message in a given context, and that we tend to encode relevant information in our messages, that is, the most new information for the least effort. This is in a way an echo of the teleological action described by Habermas (1987, p. 126).49
Just as a reminder, the founders of RT (2006, p. 1565) claim that: “Relevance is characterised in cost–benefit terms, as a property of inputs to cognitive processes, the benefits being positive cognitive effects, and the cost the processing effort needed to achieve these effects.”
In other words, the lower the cost and the higher the benefits, the more relevant the input is said to be. According to them, this theory has implications for the information transmitted by prosody. As was already mentioned in 2.2., prosody is said to fulfil two main functions, a linguistic one and a paralinguistic one, called “natural” by Wilson and Wharton. These two functions are studied by two different approaches: the grammatical one for the former and the attitudinal one for the latter. What is interesting for my study is their semiotic interpretation of prosody: they claim there are three main kinds of prosodic input: linguistic inputs, natural signals and natural signs. Wilson and Wharton distinguish the last two as follows (2006, p. 1561): “Natural signals, like linguistic signals, are genuinely coded and inherently communicative; natural signs, by contrast, are interpreted by inference rather than decoding, and are not inherently communicative at all.”
A prosodic natural signal would be, for example, the use of an affective tone of voice to communicate an emotion, whereas a prosodic natural sign would be any prosodic feature that, for example, gives details about the mental health of the speaker. The former type codes the information by meaning something, whereas the latter conveys the information by showing it. Both types of prosodic input can convey information intentionally (being thus communicative according to Lyons, 1977) or unintentionally (being thus informative, because it conveys meaningful new information for the hearer although this was not the speaker’s intention). We can perhaps relate Wilson and Wharton’s terminology to Abercrombie’s phonetic distinction: voice quality aspects (physically conditioned) can be associated with natural signs, since our general tension, phonation type, nasality, average pitch and average intensity are not controlled elements of our speech and do not exist to communicate but rather give information about our identity by inference. Voice dynamics features can be associated with natural signals, since they are learnable and are used in different ways according to the context and to what exactly we want to communicate. The following figure is an adaptation of one of Wilson and Wharton’s schemas (2006, p. 1563) and sums up their approach:

49 Habermas (1987, p. 126): “Der Aktor verwirklicht einen Zweck bzw. bewirkt das Eintreten eines erwünschten Zustandes, indem er die in der gegebenen Situation erfolgversprechenden Mittel wählt und in geeigneter Weise anwendet.” (“The actor attains an end or brings about a desired state by choosing the means that promise success in the given situation and applying them in a suitable manner.”)
Prosodic inputs
  Natural
    Signs: voice quality (showing)
    Signals: voice dynamics (meaning)
  Linguistic signals: voice dynamics
Convey information unintentionally: informative; intentionally: communicative

Figure 8: Types of prosodic input

We have therefore seen how the study of voice can benefit from semiotic approaches, with Peircean terminology on the one hand and Wilson and Wharton’s (2006) application of Relevance Theory to prosody on the other. Last but not least, it is also possible to take up Saussure’s (1955) concepts of signifiant/signifié and apply them to voice: the vocal features or voice markers can be considered as signifiants, and the different functions or kinds of information conveyed by the voice markers as the signifiés. What we find then is a very complex situation because, as stated by Van Leeuwen (1999, p. 129): “A same signifier can be used at different levels. It may realize a speech act, a sound act, constitute a habitual or prescribe characteristic of an individual or group.”
An example of this phenomenon can be found in almost every vocal feature. Let’s take high pitch: it is an indicator (amongst many possibilities) of young age and of female gender, and it may be used as a politeness strategy to indicate diminution. So here we have one signifiant for three different signifiés (amongst many others not mentioned). The opposite also happens constantly: for one signifié, for example the expression of dominance, we have different vocal strategies or signifiants, such as low pitch and high intensity. What is also complex, as already mentioned, is the fact that with voice there is never a one-to-one correlation, that is, for one signifié several signifiants are used at the same time, which recalls the idea mentioned before that equivocal speech markers are more frequent than unique speech markers.
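The many-to-many relation described in this paragraph can be made concrete with a small sketch. It is only an illustration: the pairs are limited to the examples given above (high pitch, low pitch, high intensity), and the names MARKER_FUNCTIONS, build_indexes and is_equivocal are hypothetical labels introduced here, not part of any established inventory of voice markers.

```python
from collections import defaultdict

# Illustrative (signifiant, signifié) pairs, taken only from the examples
# in the text; this is an assumption-laden toy list, not a full inventory.
MARKER_FUNCTIONS = [
    ("high pitch", "young age"),
    ("high pitch", "female gender"),
    ("high pitch", "politeness (diminution)"),
    ("low pitch", "dominance"),
    ("high intensity", "dominance"),
]


def build_indexes(pairs):
    """Index the pairs in both directions: marker -> functions, function -> markers."""
    by_signifiant = defaultdict(set)
    by_signifie = defaultdict(set)
    for signifiant, signifie in pairs:
        by_signifiant[signifiant].add(signifie)
        by_signifie[signifie].add(signifiant)
    return dict(by_signifiant), dict(by_signifie)


def is_equivocal(signifiant, by_signifiant):
    """A marker is treated as equivocal here if it maps to more than one function."""
    return len(by_signifiant.get(signifiant, ())) > 1


by_signifiant, by_signifie = build_indexes(MARKER_FUNCTIONS)
print(sorted(by_signifiant["high pitch"]))  # functions listed for one signifiant
print(sorted(by_signifie["dominance"]))     # signifiants listed for one signifié
```

Read in one direction, the index shows one signifiant carrying several signifiés; read in the other, it shows one signifié realised by several signifiants. Within this toy list, a marker such as low pitch appears unique only because the list is so short, which is itself a reminder of how rare truly unique markers are likely to be.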