THREE TYPES OF ANTICAUSATIVES
3.1 Existing classifications
3.1.2 Labelle (1992) for French
A modified version of the SoundScape Renderer (SSR; Geier et al., 2008) was implemented and verified in cooperation with BBC R&D. This system builds upon the low-latency and highly configurable SSR platform, which performs block-wise FFT convolution for dynamic binaural synthesis. BBC R&D extended it to support variable early-to-late binaural mixing times, which improve the efficiency of the renderer.
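As a rough illustration of what block-wise FFT convolution involves, the following is a generic overlap-add sketch in NumPy. It is not the SSR code, which is not reproduced here; the function name `overlap_add_convolve` and the default `block_size` are assumptions made for this example only.

```python
import numpy as np

def overlap_add_convolve(signal, ir, block_size=256):
    """Convolve `signal` with `ir` block by block using FFT overlap-add.

    Illustrative sketch only; names and defaults are hypothetical.
    """
    signal = np.asarray(signal, dtype=float)
    ir = np.asarray(ir, dtype=float)
    # The FFT must hold one block plus the filter tail without wrap-around.
    n_fft = 1
    while n_fft < block_size + len(ir) - 1:
        n_fft *= 2
    ir_spectrum = np.fft.rfft(ir, n_fft)  # filter spectrum is computed once
    out = np.zeros(len(signal) + len(ir) - 1)
    for start in range(0, len(signal), block_size):
        block = signal[start:start + block_size]
        # Multiply the spectra, then return to the time domain.
        segment = np.fft.irfft(np.fft.rfft(block, n_fft) * ir_spectrum, n_fft)
        n_valid = len(block) + len(ir) - 1
        # Each block's tail overlaps the start of the next block's output.
        out[start:start + n_valid] += segment[:n_valid]
    return out
```

In a dynamic binaural renderer the filter spectrum would be swapped between blocks as the tracked head azimuth changes; that step is omitted here.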
4.5.1 BRIR Perceptual Mixing Time
Because the SBSBRIR dataset contains long, independent BRIRs for each of the measured head-azimuths, real-time rendering of many loudspeakers and listening positions can become expensive. Lindau et al. (2012) have shown that, after a specified time following the BRIR onset, a cross-fade to a single BRIR tail can be perceptually equivalent and greatly reduces computation and memory usage. An example BRIR for the left ear, head-azimuth 90°, loudspeaker 0° and listening position (X = 0.5 m, Y = 0.5 m) is shown in Figure 4.6, with the half-cosine windows illustrated and the early and late regions plotted with different line styles. The amplitude is shown as 20log10|BRIR|.
When the listener moves their head, only the initial region of the BRIR changes dynamically, whereas the late part is loaded into memory once and reused. The late region is still maintained as a binaural signal and therefore retains natural decorrelation between the left and right ears. The direct region of the BRIR and the early reflections are maintained dynamically, but as the reverberation becomes more diffuse, less accuracy is needed.
[Figure 4.6 plot: x-axis Time (ms); y-axis 20log10|BRIR| (dB); traces: 90° BRIR (left), 0° BRIR (left)]
Figure 4.6: Example of BRIR split into static and dynamic regions. This is an example BRIR for the left ear where the initial region is dynamic and shows the BRIR where θ = 90°. The static region is shown in the dashed line and is taken from the θ = 0° head azimuth. The y-axis is magnitude on a log scale to emphasise the early reflections in the dynamic region. Mixing time is 50 ms after the BRIR start.
The effect is seen more clearly by plotting the summed BRIR regions across both time and head azimuth. Figures 4.7a and 4.7b show the result without and with mixing, respectively.
(a) No mixing (b) 50 ms mixing time
Figure 4.7: 20log10|BRIR| plotted against time and head-azimuth with and without dynamic mixing. (a) shows the raw BRIRs without any dynamic mixing and (b) shows the same BRIRs but with static tails from the 0° BRIR used after a 50 ms cross-over window.
A perceptual mixing time of 50 ms is supported by the mixing times reported for small rooms, and by the relevant model-based predictors, in Lindau et al. (2012). Smaller rooms have less distance between discrete reflections, so their impulse responses take less time to reach a ‘diffuse’ state.
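The early/late split described in this section can be sketched as a cross-fade between a head-azimuth-specific BRIR and a shared late tail. This is a minimal sketch under stated assumptions, not the renderer's implementation: the sample rate, the fade length, the raised-cosine shape chosen for the "half cosine" ramps, and the function name `mix_early_late` are all assumptions, and the BRIR onset is taken to be at `onset_idx`.

```python
import numpy as np

def mix_early_late(dynamic_brir, static_brir, fs=48000,
                   mixing_time_ms=50.0, fade_ms=10.0, onset_idx=0):
    """Keep the dynamic BRIR up to the mixing time, then fade to a static tail.

    Illustrative sketch only; fs, fade_ms and the ramp shape are assumed.
    """
    dynamic_brir = np.asarray(dynamic_brir, dtype=float)
    static_brir = np.asarray(static_brir, dtype=float)
    n = min(len(dynamic_brir), len(static_brir))
    fade_len = int(round(fade_ms * 1e-3 * fs))
    fade_start = onset_idx + int(round(mixing_time_ms * 1e-3 * fs))

    # Weight of the dynamic (early) part: 1 before the fade, 0 after it.
    early_weight = np.zeros(n)
    early_weight[:fade_start] = 1.0
    stop = min(fade_start + fade_len, n)
    if stop > fade_start:
        t = (np.arange(stop - fade_start) + 0.5) / fade_len
        early_weight[fade_start:stop] = 0.5 * (1.0 + np.cos(np.pi * t))

    # The two weights sum to one at every sample (amplitude-complementary).
    late_weight = 1.0 - early_weight
    return early_weight * dynamic_brir[:n] + late_weight * static_brir[:n]
```

With this split, only the first `fade_start + fade_len` samples need to be stored and convolved per head azimuth; the remainder can be taken from one BRIR (the 0° head azimuth in Figure 4.6).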
4.5.2 Approximating Anechoic Simulations
For certain experiments within the project it was necessary to have anechoic versions of the SBSBRIR dataset, i.e. the same artificial head, loudspeakers and listening positions located in an anechoic environment. This allows the importance of room reverberation on perception across the listening area to be investigated, and also serves as a learning dataset for the localisation modelling presented in Chapter 7. A method to achieve anechoic versions of the SBSBRIR dataset was implemented which still maintained the artefacts caused by changing listening position.
Two-band windowing was performed using onset detection to isolate only the direct part of the BRIR. This region includes the head, pinna and torso reflections from the full BRIR and also retains loudspeaker effects. Truncation of the early regions of BRIRs has been applied when comparing real and synthesised reverberant tails (Menzer and Faller, 2009), and frequency-dependent windowing has been applied to impulse responses of acoustic systems (Karjalainen and Paatero, 2001). A second-order Linkwitz-Riley filter (LR2, LR-2), as shown in Appendix A, was implemented to separate the high- and low-frequency regions of an input BRIR. Each frequency band was then windowed independently, using a longer window for low-frequency components to avoid truncating the loudspeaker low-frequency response. The filter had a cross-over frequency of 400 Hz. Windowing was performed using onset detection to ensure that interaural and inter-channel delays were not affected by the windowing.
Following the detected direct-path onset time, windowing started after 45 samples for the high-frequency region and 230 samples for the low-frequency region. The onset of the first room reflections varied slightly depending on the loudspeaker and listening position, but visual inspection of the impulse responses indicated that they started around 150 samples after the onset of the direct path. The window lengths were optimised by iteratively lengthening the times for each frequency band until the best trade-off was found between maximising the length of the direct path and attenuating the first reflection. Figure 4.8 shows the magnitude response of an original BRIR against the anechoic version.
[Figure 4.8 plots: two panels (left ear, right ear) for loudspeaker 0°, head azimuth 45°, listening position x0y0; x-axis Frequency (Hz), 100 to 16000; y-axis Magnitude (dB), -80 to 0; traces: LR2 cross-over, BRIR, Anechoic]
Figure 4.8: Power spectrum analysis of the reverberant and approximated anechoic BRIRs using dual-band windowing. Results are for the left and right ears for the BRIR head azimuth of 45°, loudspeaker 0° at the central listening position.
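The dual-band windowing of Section 4.5.2 can be sketched for a single BRIR channel as follows. The 400 Hz cross-over and the 45- and 230-sample offsets come from the text; everything else is an assumption made for illustration: the sample rate, the threshold-based onset detector, the raised-cosine fade shape and fade lengths, the choice to invert the low band when recombining, and all function names. It should be read as one plausible realisation rather than the method actually used.

```python
import numpy as np

def detect_onset(ir, threshold_db=-20.0):
    """First sample whose magnitude is within threshold_db of the peak (assumed rule)."""
    mag = np.abs(ir)
    return int(np.argmax(mag >= mag.max() * 10.0 ** (threshold_db / 20.0)))

def first_order(x, b0, b1, a1):
    """Direct-form first-order IIR: y[n] = b0*x[n] + b1*x[n-1] - a1*y[n-1]."""
    y = np.zeros(len(x))
    x_prev = y_prev = 0.0
    for i, xi in enumerate(x):
        y_prev = b0 * xi + b1 * x_prev - a1 * y_prev
        x_prev = xi
        y[i] = y_prev
    return y

def lr2_split(x, fc, fs):
    """LR2 cross-over: two cascaded first-order Butterworth sections per band."""
    k = np.tan(np.pi * fc / fs)  # bilinear transform with frequency pre-warping
    a1 = (k - 1.0) / (k + 1.0)
    low, high = x, x
    for _ in range(2):
        low = first_order(low, k / (1 + k), k / (1 + k), a1)
        high = first_order(high, 1 / (1 + k), -1 / (1 + k), a1)
    # One LR2 band must be polarity-inverted for the sum to be all-pass;
    # the low band is inverted here (an assumption) to keep the onset polarity.
    return -low, high

def fade_out_window(n, start, fade_len):
    """Unity up to `start`, raised-cosine fade over `fade_len` samples, then zero."""
    w = np.zeros(n)
    w[:start] = 1.0
    stop = min(start + fade_len, n)
    if stop > start:
        t = (np.arange(stop - start) + 0.5) / fade_len
        w[start:stop] = 0.5 * (1.0 + np.cos(np.pi * t))
    return w

def approximate_anechoic(brir, fs=48000, fc=400.0, hf_hold=45, lf_hold=230,
                         hf_fade=32, lf_fade=128):
    """Window each band after the detected onset, then recombine the bands."""
    brir = np.asarray(brir, dtype=float)
    onset = detect_onset(brir)  # anchoring to the onset leaves delays untouched
    low, high = lr2_split(brir, fc, fs)
    n = len(brir)
    return (low * fade_out_window(n, onset + lf_hold, lf_fade)
            + high * fade_out_window(n, onset + hf_hold, hf_fade))
```

Because each channel is windowed relative to its own detected onset and no samples are shifted, the interaural and inter-channel delays of the original BRIRs are preserved, which is the property the text calls for.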