
The name Ambisonics has become a general term for any audio processing in which a sound field is processed using an intermediate spherical harmonic representation. The first practical Ambisonic systems were created by Gerzon (1972), using 0th- and 1st-order spherical harmonics to decompose and reconstruct a sound field. In its current state, the term Ambisonics can describe anything from a microphone processing theory through to virtual sound source panning functions and, at the most complex end of the spectrum, sound field synthesis systems using near-field compensated Ambisonics with significantly higher numbers of spherical harmonic coefficients.

In this section, the mathematical formulation is first derived from the wave equation. It is then shown that, under some basic assumptions, the Ambisonic derivation reduces to simple amplitude weightings that can be used in a practical loudspeaker-based application. Recent developments of these weightings to improve perception are also introduced.

Derivation

The sound field inside a source-free sphere can be recreated exactly by controlling a continuous source distribution across the surface of the sphere. First, the time-domain homogeneous wave equation for a linear, lossless medium is given in Equation 2.3.

$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right) p = 0 \tag{2.3}$$

where $\nabla^2$ is the Laplacian operator, $p$ is the acoustic pressure and $c$ is the speed of sound propagation.

From this, the homogeneous Helmholtz equation (Williams, 1999) can be found by applying a Fourier transform with respect to time to Equation 2.3,

$$(\nabla^2 + k^2)\, p = 0, \tag{2.4}$$

where $k$ is the wavenumber.
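The step from Equation 2.3 to Equation 2.4 can be written out explicitly. The following is a sketch under an assumed Fourier convention; the angular frequency $\omega$, the frequency-domain pressure $P$ and the relation $k = \omega/c$ are not defined in the text above and are introduced here only for illustration (Equation 2.4 reuses the symbol $p$ for the frequency-domain pressure).

```latex
% Assumed convention: p(\mathbf{r}, t) = \frac{1}{2\pi} \int P(\mathbf{r}, \omega)\, e^{i\omega t}\, d\omega
\frac{\partial^2}{\partial t^2}\, e^{i\omega t} = -\omega^2 e^{i\omega t}
\quad\Longrightarrow\quad
\left(\nabla^2 + \frac{\omega^2}{c^2}\right) P(\mathbf{r}, \omega) = 0,
\qquad k = \frac{\omega}{c}.
```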

The pressure at any point inside a source-free sphere can be found using a Fourier-Bessel decomposition, as shown in Equation 2.5.

$$p(kr, \theta, \phi) = \sum_{m=0}^{+\infty} i^m j_m(kr) \sum_{n=0}^{m} \sum_{\sigma=\pm 1} B_{mn}^{\sigma}\, Y_{mn}^{\sigma}(\theta, \phi) \tag{2.5}$$

Here $\theta$ and $\phi$ are the spherical coordinate angles and $r$ is the distance from the origin. $j_m(kr)$ are the spherical Bessel functions, which describe the radial term between the origin and the measurement point. The 'outgoing' field can also be described by an additional divergent spherical Hankel function term, not shown in Equation 2.5. Because Ambisonic reproduction assumes that there are no sources inside the region enclosed by the loudspeakers, the weighting coefficients for the Hankel functions are set to zero and this term is often removed (Daniel et al., 2003). The real-valued spherical harmonic functions $Y_{mn}^{\sigma}(\theta, \phi)$ are described by Equation 2.6.

$$Y_{mn}^{\sigma}(\theta, \phi) = \sqrt{(2m+1)\,\epsilon_n \frac{(m-n)!}{(m+n)!}}\; P_{mn}(\sin\phi) \times \begin{cases} \cos(n\theta) & \text{if } \sigma = +1 \\ \sin(n\theta) & \text{if } \sigma = -1 \end{cases} \tag{2.6}$$

$P_{mn}(\sin\phi)$ are the associated Legendre functions, with $\epsilon_n = 1$ when $n = 0$ and $\epsilon_n = 2$ when $n > 0$ (Nicol, 2010).
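Equation 2.6 can be evaluated numerically with a short routine. The following is a minimal sketch, not the implementation used in this work: the function names `assoc_legendre` and `real_sh` are hypothetical, and it assumes that $P_{mn}$ is defined without the Condon-Shortley phase, which the text does not state.

```python
import math


def assoc_legendre(m: int, n: int, x: float) -> float:
    """Associated Legendre function P_mn(x), degree m and order n (0 <= n <= m).

    Assumption: no Condon-Shortley phase factor (-1)^n is applied.
    """
    # Seed value: P_nn(x) = (2n - 1)!! * (1 - x^2)^(n / 2)
    root = math.sqrt(max(0.0, 1.0 - x * x))
    p_nn = 1.0
    for k in range(1, n + 1):
        p_nn *= (2 * k - 1) * root
    if m == n:
        return p_nn
    # Next degree: P_(n+1),n(x) = (2n + 1) * x * P_nn(x)
    p_prev, p_curr = p_nn, (2 * n + 1) * x * p_nn
    # Upward recurrence in degree l:
    # (l - n) P_l,n = (2l - 1) x P_(l-1),n - (l + n - 1) P_(l-2),n
    for l in range(n + 2, m + 1):
        p_next = ((2 * l - 1) * x * p_curr - (l + n - 1) * p_prev) / (l - n)
        p_prev, p_curr = p_curr, p_next
    return p_curr


def real_sh(m: int, n: int, sigma: int, theta: float, phi: float) -> float:
    """Real-valued spherical harmonic of Equation 2.6 (angles in radians)."""
    eps = 1.0 if n == 0 else 2.0
    norm = math.sqrt(
        (2 * m + 1) * eps * math.factorial(m - n) / math.factorial(m + n)
    )
    angular = math.cos(n * theta) if sigma == 1 else math.sin(n * theta)
    return norm * assoc_legendre(m, n, math.sin(phi)) * angular
```

With this normalisation the zeroth-order function is constant and equal to 1, and the first-order functions reduce to $\sqrt{3}\sin\phi$, $\sqrt{3}\cos\phi\cos\theta$ and $\sqrt{3}\cos\phi\sin\theta$.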

Due to the nature of spherical harmonics, it is possible to separate the spatial encoding and decoding processes. Encoding represents the conversion of a physical or theoretical sound field into the spherical harmonic domain for storage or transmission. Decoding represents the conversion back to the spatial domain, where the spherical harmonic representation is put back into a format for acoustic reproduction. When encoding and decoding stages are separated, the normalisation of $Y_{mn}^{\sigma}(\theta, \phi)$ must be carefully maintained (Daniel, 2001, p. 156).

For the practical use of higher-order Ambisonics, the Furse-Malham (Malham, 2005) normalisation scheme has become a popular standard. However, alternatives exist that can be extended to arbitrary orders (SN3D, N3D). Here $m$ is the Ambisonic order, and the spherical harmonic representation is truncated at a maximum order $M$. For each order $m$, there are $(2m + 1)$ different spherical harmonic functions.
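The count of $(2m + 1)$ functions per order follows from the index ranges in Equation 2.5 once the identically zero terms are discarded ($\sigma = -1$ with $n = 0$ gives $\sin(0) = 0$). The sketch below enumerates the indices; the function name `harmonic_indices` is hypothetical, and the resulting total of $(M + 1)^2$ channels is a standard result rather than a figure quoted in the text.

```python
def harmonic_indices(M: int) -> list[tuple[int, int, int]]:
    """List the (m, n, sigma) index triples of Equation 2.5 up to order M."""
    indices = []
    for m in range(M + 1):
        for n in range(m + 1):
            for sigma in (1, -1):
                # sin(n * theta) vanishes for n = 0, so that term carries no signal
                if n == 0 and sigma == -1:
                    continue
                indices.append((m, n, sigma))
    return indices


# Number of functions in each order m, for a representation truncated at M = 3
per_order = [sum(1 for (m, _, _) in harmonic_indices(3) if m == order)
             for order in range(4)]
```

For a third-order representation this gives 1, 3, 5 and 7 functions for orders 0 to 3, or 16 channels in total.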

Real-valued spherical harmonic functions for the 1st and 2nd orders ($m = 1$ and $m = 2$) are shown in Figure 2.7 and Figure 2.8 respectively.

[Figure 2.7: panels (a) to (c), the first-order functions]

Figure 2.8: Real-valued spherical harmonic functions for second order ($m = 2$), panels (a) to (e)

Encoding and Decoding

Following the derivation of Ambisonic theory, it is important to consider the practical implementation of the technology. A plane-wave source can be encoded into a spherical harmonic representation $B_{mn}^{\sigma}$ defined by

$$B_{mn}^{\sigma} = S \cdot Y_{mn}^{\sigma}(\theta, \phi) \tag{2.7}$$

This describes how a plane wave of amplitude $S$ arriving from the direction $(\theta, \phi)$ is represented. In Equation 2.5 the summation extends to $+\infty$. Therefore, for a perfect reconstruction of the plane wave, $B_{mn}^{\sigma}$ also extends to infinite order, which is practically unrealisable. The spherical harmonic representation of the sound field is therefore often truncated at an order $M$, chosen according to the spatial resolution needed for the original sound field or to other constraints, such as the number of coefficients $B_{mn}^{\sigma}$ required.
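As an illustration of Equation 2.7 truncated at $M = 1$, the sketch below encodes a plane wave into four first-order coefficients using the closed forms that Equation 2.6 takes for $m \le 1$. The function name `encode_first_order` and the channel ordering are assumptions made for this example only.

```python
import math


def encode_first_order(S: float, theta: float, phi: float) -> list[float]:
    """Encode a plane wave of amplitude S from direction (theta, phi), in radians.

    Returns [B_00^{+1}, B_11^{+1}, B_11^{-1}, B_10^{+1}] with the normalisation
    of Equation 2.6. The channel order is an arbitrary choice for illustration.
    """
    root3 = math.sqrt(3.0)
    return [
        S,                                            # Y_00^{+1} = 1
        S * root3 * math.cos(phi) * math.cos(theta),  # Y_11^{+1}
        S * root3 * math.cos(phi) * math.sin(theta),  # Y_11^{-1}
        S * root3 * math.sin(phi),                    # Y_10^{+1}
    ]


# A unit-amplitude source straight ahead in the horizontal plane
front = encode_first_order(1.0, 0.0, 0.0)
```

Only the first two coefficients are non-zero for a frontal horizontal source, which shows how the direction of arrival is carried by the relative amplitudes of the channels.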

Once the spherical harmonic representation has been obtained, loudspeaker driving signals must be derived. This is the decoding stage, and the outcome depends on the geometry of the loudspeaker array. Perceptually motivated modifications to the derivation of the loudspeaker signals have also been proposed (Gerzon, 1992a; Malham, 1992).

The decoding method can be defined using a similar premise to the encoding method. Each loudspeaker is considered as a plane-wave source, and the task is to find the plane-wave amplitudes needed to reproduce the encoded spherical harmonic signal, and therefore the original sound field (Hollerweger, 2006).

$$B_{mn}^{\sigma} = \sum_{j=1}^{L} Y_{mn}^{\sigma}(\theta_j, \phi_j) \cdot p_j \tag{2.8}$$

Here the summation runs over the loudspeakers $j = 1, \dots, L$, where loudspeaker $j$ lies in the direction $(\theta_j, \phi_j)$ and $L$ is the total number of loudspeakers. $p_j$ is the loudspeaker signal (plane-wave amplitude).

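In matrix form, the relationship is defined by the re-encoding matrix $C$ (Daniel, 2001), where

$$C = \begin{bmatrix}
Y_{00}^{1}(\theta_1, \phi_1) & Y_{00}^{1}(\theta_2, \phi_2) & \cdots & Y_{00}^{1}(\theta_j, \phi_j) & \cdots & Y_{00}^{1}(\theta_L, \phi_L) \\
Y_{11}^{1}(\theta_1, \phi_1) & Y_{11}^{1}(\theta_2, \phi_2) & \cdots & Y_{11}^{1}(\theta_j, \phi_j) & \cdots & Y_{11}^{1}(\theta_L, \phi_L) \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
Y_{M0}^{1}(\theta_1, \phi_1) & Y_{M0}^{1}(\theta_2, \phi_2) & \cdots & Y_{M0}^{1}(\theta_j, \phi_j) & \cdots & Y_{M0}^{1}(\theta_L, \phi_L)
\end{bmatrix} \tag{2.9}$$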

The decoding matrix D is then calculated by finding the inverse of C.

$$p_j = D \cdot B_{mn}^{\sigma} \tag{2.10}$$

If $C$ is not square (i.e. the number of loudspeakers and the number of Ambisonic channels are not equal) or does not have full rank, $C$ cannot be inverted directly. In this situation a pseudo-inverse can be used, most commonly the Moore-Penrose pseudo-inverse. The decoding matrix can also be modified by multiplication with a diagonal weighting matrix $\Gamma$, so that the decoding matrix becomes
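The pseudo-inverse decoding step can be sketched with NumPy. This is an illustrative example under stated assumptions, not the toolbox used in this work: it uses a horizontal octagon, first-order harmonics evaluated at $\phi = 0$ with the normalisation of Equation 2.6 (the height channel is omitted because it is zero for every loudspeaker in the plane), and the hypothetical helper name `encode_horizontal`.

```python
import numpy as np


def encode_horizontal(theta: float) -> np.ndarray:
    """First-order harmonics [Y_00^{+1}, Y_11^{+1}, Y_11^{-1}] at elevation 0."""
    return np.array([1.0,
                     np.sqrt(3.0) * np.cos(theta),
                     np.sqrt(3.0) * np.sin(theta)])


# Octagonal layout: loudspeakers every 45 degrees in the horizontal plane
speaker_azimuths = np.radians(np.arange(0, 360, 45))

# Re-encoding matrix C (Equation 2.9): one row per harmonic, one column per loudspeaker
C = np.column_stack([encode_horizontal(t) for t in speaker_azimuths])

# C is 3 x 8 and therefore not square, so use the Moore-Penrose pseudo-inverse
D = np.linalg.pinv(C)

# Encode a unit plane wave from 30 degrees (Equation 2.7), then decode (Equation 2.10)
B = 1.0 * encode_horizontal(np.radians(30.0))
p = D @ B

# Re-encoding the loudspeaker signals (Equation 2.8) should recover B
B_check = C @ p
```

Because this layout has more loudspeakers than Ambisonic channels, many sets of loudspeaker signals satisfy Equation 2.8; the pseudo-inverse selects the one with the smallest total energy.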

$$D = D \cdot \Gamma \tag{2.11}$$

Daniel (2001) describes basic, max $r_E$ and in-phase decoder types, each driving the loudspeakers with different priorities. While basic decoders optimise pressure and velocity, max $r_E$ decoders optimise the reconstructed energy in the direction of the encoded plane wave. In-phase decoding controls the loudspeaker gains to avoid phase differences between loudspeakers in opposing directions. Each decoding 'flavour' is chosen depending on the priorities of the reproduction system; Daniel (2001, p. 160) discusses these in detail. In-phase decoding was originally proposed for Ambisonic systems by Malham (1992).

$$\Gamma_{\text{basic}}(m) = 1 \tag{2.12}$$

$$\Gamma_{\max r_E}(m) = \cos\left(\frac{m\pi}{2M + 2}\right) \tag{2.13}$$

$$\Gamma_{\text{in-phase}}(m) = \frac{M!^2}{(M + m)!\,(M - m)!} \tag{2.14}$$

Although tools were developed for this research project to enable the encoding, decoding and processing of Ambisonic panning methods, the literature highlighted a general lack of documentation regarding the Ambisonic tools used in subjective experiments, which makes direct comparisons difficult. Therefore, an open-source toolbox was used to create Ambisonic decoders for all applications of Ambisonic panning methods used in this thesis. The authors of this toolbox include decoder settings specifically for the SBSBRIR dataset, where subsets of the loudspeakers can be chosen. Although beneficial for 3-dimensional loudspeaker layouts, only loudspeaker arrays along the horizontal plane are considered in this thesis. This is often categorised as 2.5D, where loudspeakers lie in the 2-dimensional plane but inherently reproduce sound in 3 dimensions.
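The order-dependent weights of Equations 2.12 to 2.14 are simple to tabulate. The sketch below evaluates them for a third-order system; the function names are hypothetical and this is not the toolbox implementation.

```python
import math


def gamma_basic(m: int, M: int) -> float:
    """Basic decoding weight (Equation 2.12): every order is left unweighted."""
    return 1.0


def gamma_max_re(m: int, M: int) -> float:
    """Max r_E decoding weight (Equation 2.13)."""
    return math.cos(m * math.pi / (2 * M + 2))


def gamma_in_phase(m: int, M: int) -> float:
    """In-phase decoding weight (Equation 2.14)."""
    return math.factorial(M) ** 2 / (math.factorial(M + m) * math.factorial(M - m))


M = 3
weights = {
    "basic": [gamma_basic(m, M) for m in range(M + 1)],
    "max_rE": [gamma_max_re(m, M) for m in range(M + 1)],
    "in_phase": [gamma_in_phase(m, M) for m in range(M + 1)],
}
```

Both the max $r_E$ and in-phase weights decrease with order, so the higher-order components contribute progressively less to the loudspeaker signals; the in-phase weights fall off fastest.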

Figure 2.9: Panning functions for VBAP and Ambisonic reproduction methods. Lines show the gain coefficient for the loudspeaker at θ = 0° as the virtual/phantom sound source is panned across the full azimuth range. The loudspeaker layout is an octagon (0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°). Ambisonic panning is 3rd order with the max rV decoding philosophy. [Plot axes: virtual source (panning) direction in degrees, from 180° through 0° to 180°; gain for the loudspeaker at θ = 0°, from -0.4 to 1. Curves: 3rd-order Ambisonics (max rV) and VBAP.]

To illustrate the practical differences between VBAP and Ambisonic reproduction methods, Figure 2.9 shows the panning function for a single loudspeaker in an octagonal loudspeaker layout. Whilst VBAP concentrates energy in the direction of the phantom source, so that only the loudspeakers adjacent to the source are active, the Ambisonic panning function often has non-zero gain coefficients for every source direction, with opposite phase for sources in directions opposing the loudspeaker. Although beneficial to the reproduction system at low frequencies, this feature can become problematic at high frequencies.
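The two panning functions can be approximated with a few lines of code. The sketch below is an assumption-laden illustration rather than the method used to produce the figure: the Ambisonic gain uses the closed-form result for a regular horizontal array decoded with unit (basic) weights and circular harmonics, the VBAP gain uses pairwise amplitude panning with unit-energy normalisation, and both function names are hypothetical.

```python
import math


def ambisonic_gain(source_deg: float, speaker_deg: float = 0.0,
                   M: int = 3, L: int = 8) -> float:
    """Gain of one loudspeaker in a regular L-speaker ring, order M, basic weights.

    Assumes 2D (circular) harmonics and enough loudspeakers (L > 2M).
    """
    delta = math.radians(source_deg - speaker_deg)
    return (1.0 + 2.0 * sum(math.cos(m * delta) for m in range(1, M + 1))) / L


def vbap_gain(source_deg: float, spacing_deg: float = 45.0) -> float:
    """Pairwise VBAP gain of the loudspeaker at 0 degrees in a regular ring."""
    # Angular offset of the source from the loudspeaker, wrapped to [0, 180]
    offset = abs((source_deg + 180.0) % 360.0 - 180.0)
    if offset >= spacing_deg:
        return 0.0  # the source lies outside both pairs that include this loudspeaker
    a, s = math.radians(offset), math.radians(spacing_deg)
    # Solve g_self * (1, 0) + g_other * (cos s, sin s) = (cos a, sin a)
    g_other = math.sin(a) / math.sin(s)
    g_self = math.cos(a) - g_other * math.cos(s)
    # Normalise the pair to unit energy
    return g_self / math.hypot(g_self, g_other)


# Sample both panning functions around the full azimuth range
angles = list(range(0, 360, 15))
ambi_curve = [ambisonic_gain(a) for a in angles]
vbap_curve = [vbap_gain(a) for a in angles]
```

Under these assumptions the VBAP gain is zero whenever the source is 45° or more away from the loudspeaker, while the Ambisonic gain stays non-zero and becomes negative for a source directly opposite the loudspeaker.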
