
[Figure 4.18, panels (a)-(f): pinna pictures in (a) and (d); original and synthetic plots in (b), (c), (e), (f), with axes Elevation (deg) and Frequency (kHz) and a Gain (dB) scale.]

Figure 4.18: (a) Contour extraction on Subject 048’s pinna. (b) Original and (c) synthetic HRTF magnitude, |Ha|, plots for Subject 048. (d) Contour extraction on Subject 020’s pinna. (e) Original and (f) synthetic HRTF magnitude, |Ha|, plots for Subject 020.

the measured patterns, attesting fitness of the contour extraction and mapping procedure;

3. Gains, even in the intermediate frequency areas between notches and resonances, are overall preserved.

On closer inspection, it can be noted that Subject 020 originally exhibits a wide dip around φ = 40° in the highest frequency range which is not correctly reproduced; this may be due to the superposition of two or more notches that cannot be detected when tracing the pinna contours. As for Subject 048, comparison of his pinna picture with the original HRTF plots suggests a relationship between the shorter antihelix and concha wall reflection surfaces and two distinct notch tracks, the first located around 8 kHz at negative elevation and the second around 10 kHz at positive elevation. Since three contours are modeled, these two notches are collapsed in one continuous track, see Fig. 4.18(f). A further notch appears around 15 kHz, yet it is likely associated with a mild pinna contour.

4.4.5 Discussion

As a conclusion to the presented results, if one assumes that the aforementioned mismatches are in most cases not perceptually relevant, one can then consider the mean SD of 4 dB in H_tots as a satisfactory result, being comparable to SD values found in similar works that deal with HRTF resynthesis by means of HRIR decomposition (Faller II et al., 2010) or anthropometric parametrization through multiple regression analysis on HRTF decomposition (Nishino et al., 2007). Furthermore, the proposed model is composed of first- and second-order filters only: given that many responses exhibit sharp notches whose shape cannot be reproduced by a second-order filter, increasing the order of notch filters in particular would further improve the SD score. However, low-order filters allow cheap and fast real-time simulation, which is a valuable merit of the model.
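To make the SD figure concrete, the following minimal sketch computes a spectral distortion value between an original and a synthetic magnitude response. It assumes the common root-mean-square log-magnitude definition, SD = sqrt((1/N) Σ (20 log10(|H(f_i)| / |H̃(f_i)|))²), taken over N frequency bins; the function name and the sample magnitudes are hypothetical and are not taken from this work.

```python
import math


def spectral_distortion(h_orig, h_synth):
    """RMS log-magnitude error, in dB, between two magnitude responses.

    Both arguments are sequences of positive linear magnitudes sampled
    at the same frequency bins (assumed definition, see lead-in).
    """
    if len(h_orig) != len(h_synth) or len(h_orig) == 0:
        raise ValueError("responses must be non-empty and equally long")
    squared_db_errors = [
        (20.0 * math.log10(a / b)) ** 2 for a, b in zip(h_orig, h_synth)
    ]
    return math.sqrt(sum(squared_db_errors) / len(squared_db_errors))


# Hypothetical linear magnitudes at four frequency bins: the synthetic
# response misses the depth of one notch by a factor of two (about 6 dB).
original = [1.0, 0.5, 0.1, 0.8]
synthetic = [1.0, 0.5, 0.2, 0.8]
sd = spectral_distortion(original, synthetic)
```

With this definition a single badly matched notch dominates the score, which is consistent with the remark that sharper (higher-order) notch filters would mainly improve SD.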

In the synthesized models of Subjects 020 and 048, the second resonance is clearly overestimated and its shape does not find a strong visual correspondence with its original counterpart. Such mismatch highlights a complex spectrum evolution due to the presence of two or more resonances interacting in the higher frequency range for elevations in proximity of the horizontal plane (Shaw, 1997). However, following the choice of limiting the number of resonances to two, and assuming the first resonance to be omnipresent, the second synthetic resonance has to cover multiple contributions.

Further analysis is required toward a detailed model that takes into account the individual differences among subjects and their psychoacoustical relevance besides the observed objective dissimilarities. Synthetic notches bear a smoother magnitude and bandwidth evolution compared to the original ones; in particular, magnitude irregularities in the original notches could arise from superposition of multiple reflections and, in addition, from a strong sensitivity to the subject’s spatial position during the HRTF recording session. Furthermore, the CIPIC HRTF database used in this study does not include elevation data below −45°. Alternative HRTF data sets or BEM simulations should be considered in order to extend the ray tracing procedure to the range −90° < φ < −45°.

Psychoacoustical evaluations in the context of virtual environments are needed to assess the effectiveness of this approach in improving the user’s sense of presence and immersion, together with the perceptual relevance of using such homogeneous notch and peak shapes.

4.5 Future perspectives

In this chapter, a mixed structural approach for estimating, modeling and selecting the pinna pHRTF was presented. An algorithm that separates the resonant and reflective parts of the PRTF spectrum was first implemented, and this decomposition was then used to resynthesize the original PRTF through a low-order filter model. Results showed an overall suitable approximation to the original PRTFs.

Ongoing and future work in order to extend the structural decomposition algorithm includes:

• improvements in the analysis algorithm: in particular through the use of a better multi-

Chapter 4. Mixed structural modeling approach: the case of the pinna 97

• enhancing the tracking of frequency notches through the McAulay-Quatieri partial tracking algorithm, in order to obtain a robust and continuous representation of frequency notches along elevation;

• performing regression of PRTF data over anthropometrical measurements towards functional representation of resonances and notches.
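The idea of following notches continuously along elevation can be illustrated with a much-simplified greedy nearest-neighbour matcher. This is only a sketch in the spirit of partial tracking, not the McAulay-Quatieri algorithm itself (which also resolves conflicts between competing tracks); the function name, the `max_jump_hz` threshold and the sample frequencies are hypothetical.

```python
def track_notches(frames, max_jump_hz=1000.0):
    """Link notch frequencies across consecutive elevations.

    frames: one list of notch frequencies (Hz) per elevation step.
    Returns a list of tracks; each track is a list of
    (frame_index, frequency) pairs.
    """
    tracks = []   # every track ever started
    active = []   # tracks that were extended at the previous frame
    for k, freqs in enumerate(frames):
        unmatched = sorted(freqs)
        next_active = []
        for track in active:
            if not unmatched:
                continue  # no candidate left: this track ends here
            last = track[-1][1]
            nearest = min(unmatched, key=lambda f: abs(f - last))
            if abs(nearest - last) <= max_jump_hz:
                track.append((k, nearest))
                unmatched.remove(nearest)
                next_active.append(track)
        # Any notch not claimed by an existing track starts a new one.
        for f in unmatched:
            new_track = [(k, f)]
            tracks.append(new_track)
            next_active.append(new_track)
        active = next_active
    return tracks


# Hypothetical notch frequencies at four consecutive elevations.
frames = [[6000.0, 9000.0], [6200.0, 9300.0], [6500.0], [6600.0, 12000.0]]
tracks = track_notches(frames)
```

In this example the lowest notch is followed across all four elevations, the second track ends after two frames, and the 12 kHz notch starts a new track because it lies too far from any active one.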

An analysis of real HRTF data in order to study the relation between HRTF features and anthropometry in the frontal median plane supports the hypothesis that reflections occurring on pinna surfaces can be reduced for the sake of design to three main contributions, each carrying a negative reflection coefficient. Based on this observation an approach to HRTF customization, mainly based on structural modeling of the pinna contribution, was proposed. Spectral distortion and notch frequency mismatch measures indicate that this approximation is objectively satisfactory.
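The link between a reflection with negative coefficient and the resulting notch frequencies can be sketched as follows. The sketch assumes a single reflection that travels an extra path of 2d and is summed with the direct sound, H(f) = 1 + ρ·exp(−j2πf·t_d) with ρ < 0 and t_d = 2d/c; cancellation then occurs where the delay equals a whole number of periods, i.e. at f_n = (n + 1)·c / (2d). The speed of sound, the function names and the 2 cm distance are illustrative assumptions.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def notch_frequencies(distance_m, count=3, c=SPEED_OF_SOUND):
    """Notch frequencies (Hz) for one reflection with negative coefficient.

    distance_m is the assumed distance between the reflection point and
    the ear canal entrance, so the extra path length is 2 * distance_m.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    delay = 2.0 * distance_m / c
    return [(n + 1) / delay for n in range(count)]


def reflection_distance(first_notch_hz, c=SPEED_OF_SOUND):
    """Inverse mapping: distance (m) implied by the first notch frequency."""
    if first_notch_hz <= 0:
        raise ValueError("frequency must be positive")
    return c / (2.0 * first_notch_hz)


# A hypothetical reflection point 2 cm from the ear canal entrance.
first_three = notch_frequencies(0.02)
```

Under these assumptions a 2 cm distance places the first notch near 8.6 kHz, which is in the same range as the notch tracks discussed for the measured HRTFs.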

The pinna model as it was integrated in the structural model of Sec. 3.3.2 represents a notable extension of the one in (Satarzadeh et al., 2007) as it includes a large portion of the frontal hemispace, and could thus be suitable for real-time control of virtual sources in a number of applications involving frontal auditory displays, such as a sonified screen (Walker and Brewster, 2000). Further extensions of the model, such as to include source positions behind, above, and below the listener and also in sagittal planes, may be obtained in different ways.

Furthermore, the exploitation of the pinna reflection model for HRTF selection is promising and the reported experiment confirms these expectations. Compared to the use of a generic HRTF with average anthropometric data, the pinna reflection approach increases the average elevation performance by 17%, significantly enhancing both the externalization and the up/down confusion rates. The selection criterion assigning the whole weight to contour C1 gives the best results. Indeed, pinna contours may have different weights and could play different roles in the selection. As future work, one can exploit the three contours in a tuning process: while C1 will be used to prune the candidate HRTF sets, the remaining contours will select the “best” HRTF set among those that remain.
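The two-stage tuning process outlined above could be sketched as follows. This is only an illustration under stated assumptions: each candidate HRTF set is taken to come with one precomputed mismatch value per contour (lower is better), and the function name, the `keep` and `weights` parameters, the candidate labels and all numbers are hypothetical.

```python
def select_hrtf(mismatch, keep=3, weights=(0.5, 0.5)):
    """Pick an HRTF set by pruning on C1, then ranking on C2 and C3.

    mismatch: dict mapping a candidate label to a tuple (m1, m2, m3)
    of mismatch values for contours C1, C2 and C3.
    """
    if not mismatch:
        raise ValueError("no candidates given")
    # Stage 1: keep only the candidates with the lowest C1 mismatch.
    shortlist = sorted(mismatch, key=lambda s: mismatch[s][0])[:keep]
    # Stage 2: among the survivors, minimise a weighted C2/C3 mismatch.
    w2, w3 = weights
    return min(
        shortlist,
        key=lambda s: w2 * mismatch[s][1] + w3 * mismatch[s][2],
    )


# Hypothetical mismatch values for four candidate HRTF sets.
candidates = {
    "A": (0.05, 0.30, 0.20),
    "B": (0.08, 0.10, 0.12),
    "C": (0.20, 0.02, 0.03),
    "D": (0.06, 0.25, 0.10),
}
best = select_hrtf(candidates, keep=3)
```

Here candidate C has the lowest C2 and C3 mismatch but is pruned in the first stage because of its poor C1 match; setting `keep=1` reduces the procedure to the criterion that assigns the whole weight to C1.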

Subjective evaluations that take into account both the structural model and the selection criteria will make it possible to understand the influence of notch gain and bandwidth in elevation perception, as well as the relation between the resonant component of the PRTF and the shape of pinna cavities. All this information is essential in order to have a complete anthropometric parametrization of the pinna model. It will be necessary to perform listening tests on subjects for which individual recorded HRTFs are available, in order to have a “ground truth” for the evaluation of structural models obtained with the MSM approach.

Finally, it is worthwhile to mention that the listening setup comes close to a feasible scenario for practical applications (e.g. no individual HRTFs for comparison, non-individual headphone compensation); in light of this, the next chapter presents a tool that automatically extracts pinna contours from a set of 2D images (Spagnol et al., 2013c). An extension of the reflection model to three dimensions, e.g. applied to 3D meshes of human pinna, would greatly improve the accuracy of the extraction, modeling and selection processes, provided that handiness of the system is not reduced too drastically.

Chapter 5
