
3 Distributed Architecture Setup

The setup proposed for multimodal emotion identification is composed of a MAS that provides constant monitoring of physical activity, facial expressions, and control of physiologic parameters (see Fig. 3). The experimentation area (testing environment) is equipped with an overhead and a frontal camera. The first camera is mounted to cover the movements of a person, whilst the second camera captures his/her facial expression.

The user is wearing sensors which transmit data in order to detect changes in the main physiologic indicators related to emotions.

3.1 Facial Emotion Recognition and Behaviour Analysis

The human face is the richest source of emotional cues and provides almost all the information needed for affect recognition. As many studies have shown, the visual channel is dominant in the perception of affective stimuli as well as in “reading” emotional states through visually detectable indicators (e.g. [15], [16]). Thus, the interpretation of facial emotions and their correct classification accounts for a significant part of affective status identification [17].

Our proposal starts from the Facial Action Coding System (FACS), which offers a detailed decomposition of facial muscle movements and classifies them into so-called action units [18]. Action units are related to the movements of specific sets of facial muscles that produce the expression. The emotion is recognized by a classifier trained with support vector machines (SVM). The classifier is trained and tested on the standard databases JAFFE (footnote 1), MMI (footnote 2), and Cohn-Kanade (footnote 3), and shows high performance results. The facial emotion detection node executes in the following way. A person is continuously monitored while carrying out his/her daily activity. Micro-movements of facial muscles are captured, evaluated, and classified.

1 http://www.kasrl.org/jaffe.html

2 http://mmifacedb.eu/

3 http://www.pitt.edu/~emotion/ck-spread.htm

Architecture for Multimodal Emotion Identification 129

Behaviour is also closely related to emotions, as has been previously reported (see, for instance, [19]). Indeed, different emotional patterns provoke arousal or relaxation of specific groups of muscles, as well as changes in the velocity, acceleration and trajectory of movements. Depending on the emotion being experienced, a person becomes excited or ceases activities, and his/her posture also serves as an indicator of his/her mood. Behaviour identification and tracking are achieved through constant monitoring. The obtained data are preprocessed with image preprocessing algorithms from the OpenCV library (footnote 4) for real-time computer vision applications.
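As an illustration of the movement descriptors mentioned above (velocity, acceleration and trajectory), the following minimal sketch derives them from a tracked position sequence. It assumes that the overhead camera pipeline already yields one (x, y) position per frame; the function name `motion_features`, the units and the `still_speed` threshold are hypothetical and are not taken from the described system.

```python
import math

def motion_features(track, dt, still_speed=0.05):
    """Compute simple motion descriptors from a tracked trajectory.

    track: list of (x, y) positions in metres, one per frame (assumed input).
    dt: time between frames in seconds.
    still_speed: speed in m/s below which the person counts as immobile
                 (an assumed threshold, chosen only for illustration).
    """
    # Speed between consecutive positions.
    speeds = [math.dist(p, q) / dt for p, q in zip(track, track[1:])]
    # Acceleration as the change in speed between consecutive steps.
    accels = [(v2 - v1) / dt for v1, v2 in zip(speeds, speeds[1:])]
    return {
        "mean_speed": sum(speeds) / len(speeds),
        "max_abs_accel": max((abs(a) for a in accels), default=0.0),
        "path_length": sum(v * dt for v in speeds),
        "immobility_time": sum(dt for v in speeds if v < still_speed),
    }

# Made-up trajectory: one second standing still, then two steps of 0.5 m.
track = [(0.0, 0.0), (0.0, 0.0), (0.3, 0.4), (0.6, 0.8)]
features = motion_features(track, dt=1.0)
```

Descriptors of this kind could feed the linguistic variables on the amount of movement and the immobility time used in the fusion stage.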

3.2 Physiological Sensor Data Interpretation

The data set consists of measurements of four physiological signals: electro-dermal activity (EDA), superficial electromyogram (EMG), heart rate variability (HRV) and skin temperature (SKT). These markers are chosen for their suitability for reporting the user's arousal level.

1. EDA is a measure of skin conductivity. Its value rises with the activity of the sweat glands, since moist skin conducts electricity better. Sweat secretion is controlled by the sympathetic nervous system, which reacts to stress situations, pain or mental illness.

2. EMG represents the electrical activation of the muscles, which is controlled by the nervous system. Motor activation has been reported to be helpful in measuring emotions, since evidence indicates that voluntary and involuntary movements are modulated by the emotional context [20], and that individuals have an innate disposition to move more rapidly in stressful contexts. This makes EMG signals helpful in revealing the person's degree of excitation [21].

3. HRV captures irregularities in heart activity by computing the temporal distance between consecutive heart cycles. Given that heart activity is controlled by the autonomic nervous system, HRV measurements are appropriate for quantifying the individual's arousal level. Indeed, HRV has been identified as directly linked to the autonomic nervous system and is affected by its sympathetic branch, such that an increase in HRV can be observed when the person is exposed to stress stimuli.

4. SKT is another indicator of the degree of arousal. Its decrease and increase have been shown to be related to specific emotions [22]. Indeed, when the user is under stress, body muscles tense, causing the blood vessels to contract and, consequently, the skin temperature to decrease.

Data are obtained by the “Wearable Physiological Data Acquisition System” (WPDAS). WPDAS is controlled by an ultra-low-power, 32-bit ARM Cortex-M3 microcontroller (UC). The device architecture was chosen for its low power consumption and its scalability to more powerful UCs.

4 http://opencv.org/

130 M.V. Sokolova et al.

Fig. 4. Fuzzy inference system for emotion detection

Furthermore, different signal acquisition systems are used to condition the physiological variables before they are sampled by the UC.

3.3 Fusion for Emotion Recognition

Fuzzy logic is one of the areas of artificial intelligence used to solve decision-making problems under highly uncertain conditions. It enables the implementation of “human-like” reasoning, which is well suited to multiagent systems such as the distributed architecture proposed in this paper. Thus, a fuzzy-based fusion mechanism is proposed to enhance the overall system performance. Fuzzy logic eases the problem of emotional state identification, on the one hand, and facilitates the recognition of emotional patterns, on the other hand. The proposed fuzzy inference system is presented in Fig. 4.

The input variables, which consist of crisp data from the three independent sources (the overhead camera, the frontal camera and the WPDAS), are transformed into linguistic variables within the “Fuzzification block”. In the next step, the “Fuzzy inference engine” simulates the reasoning process using fuzzy “IF-THEN” rules and data from the “Knowledge base”. The fuzzy rules are generated from expert knowledge and are expressed with linguistic variables. Lastly, the resulting fuzzy set is converted into a crisp value for the output variable “Emotion”. This variable takes an integer value from 1 to 7, which corresponds to one of the basic emotions described by Ekman.

To fuse the information received at the input of the “Fuzzification block”, the following linguistic variables are defined for the behaviour and physiological data analysis, respectively:

“The amount of moving”, which has the terms Active, Reserved, and Low.

“The immobility time”, with the terms High, Medium, and Low.

“EDA”, with the terms High, Medium, and Low.

“EMG”, represented with the terms Tensed and Relaxed.

“HRV”, which includes the terms Regular and Irregular.

“SKT”, with the terms High, Normal, and Low.
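To make the fuzzification of such a variable concrete, the sketch below maps a crisp skin temperature to membership degrees in the terms High, Normal, and Low. The membership function shapes and every breakpoint (30, 32, 33, 34 and 36 degrees Celsius) are assumptions chosen only for illustration; the text does not specify them.

```python
def left_shoulder(x, a, b):
    """Membership 1 up to a, falling linearly to 0 at b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)

def right_shoulder(x, a, b):
    """Membership 0 up to a, rising linearly to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzify_skt(temp_c):
    """Map a crisp skin temperature to degrees for the three SKT terms.

    All breakpoints are hypothetical values, not taken from the paper.
    """
    return {
        "Low": left_shoulder(temp_c, 30.0, 32.0),
        "Normal": triangular(temp_c, 30.0, 33.0, 36.0),
        "High": right_shoulder(temp_c, 34.0, 36.0),
    }

skt_terms = fuzzify_skt(31.0)
```

A reading of 31 degrees thus belongs partly to Low and partly to Normal, which is the kind of graded input the inference engine works with.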

The Face Expression Analysis Node outputs a variable which contains information on the emotional state, together with the probability of the given emotion. These data are also supplied to the input of the “Fuzzification block”.
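The fusion step described in this section can be sketched as a small rule evaluator. Everything specific in it is an assumption made for illustration: the three rules, the coding of the integers 1 to 7, and the treatment of the face node probability as a membership degree. The final step simply selects the emotion with the highest rule strength, a simplification that stands in for the conversion of the output fuzzy set into a crisp value.

```python
# Assumed integer coding of the output variable "Emotion".
EMOTIONS = {1: "anger", 2: "disgust", 3: "fear", 4: "happiness",
            5: "sadness", 6: "surprise", 7: "neutral"}

# Illustrative IF-THEN rules: (antecedent variable -> term, emotion code).
RULES = [
    ({"EDA": "High", "EMG": "Tensed", "SKT": "Low"}, 3),
    ({"EDA": "Low", "EMG": "Relaxed", "HRV": "Regular"}, 7),
    ({"EDA": "Medium", "EMG": "Relaxed", "SKT": "Normal"}, 4),
]

def infer_emotion(degrees, face=None):
    """Evaluate the rules and return (emotion code, strength per emotion).

    degrees: {variable: {term: membership degree}} from the fuzzification step.
    face: optional (emotion code, probability) from the face analysis node.
    """
    strength = {code: 0.0 for code in EMOTIONS}
    for antecedent, code in RULES:
        # AND is modelled with min; a missing term counts as degree 0.
        fire = min(degrees.get(var, {}).get(term, 0.0)
                   for var, term in antecedent.items())
        # Rules sharing a consequent are aggregated with max.
        strength[code] = max(strength[code], fire)
    if face is not None:
        code, probability = face
        # Assumption: the face probability acts as one more piece of evidence.
        strength[code] = max(strength[code], probability)
    return max(strength, key=strength.get), strength

degrees = {
    "EDA": {"High": 0.8, "Medium": 0.2, "Low": 0.0},
    "EMG": {"Tensed": 0.7, "Relaxed": 0.3},
    "HRV": {"Regular": 0.4, "Irregular": 0.6},
    "SKT": {"Low": 0.6, "Normal": 0.4, "High": 0.0},
}
emotion, strength = infer_emotion(degrees, face=(3, 0.5))
```

With these made-up inputs the first rule fires most strongly, so the sketch returns the code assumed here for fear.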
