
the research institute CNET from France.

After a careful study of these systems and a number of tests and experiments, the MPEG group combined tools and solutions from the two systems to create three distinct levels or layers, optimized for different areas of application.

MPEG Layer 1 is a sort of simplified MUSICAM compression method delivering mild compression ratios (4:1) at low cost. It accepts as its input all three sampling rates (32, 44.1, and 48 kHz) and processes the 20 kHz stereo signals in 32 subbands using a limited number of compression tools.
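To give a feel for what splitting the signal into 32 subbands means at each of the three sampling rates, the short sketch below computes the width of one subband. It assumes that the subbands are of equal width and together span the range from 0 Hz to half the sampling rate; the function name is purely illustrative.

```python
# Minimal sketch: width of one subband for each accepted sampling rate.
# Assumption: the 32 subbands are equally wide and cover 0 Hz .. fs/2.
NUM_SUBBANDS = 32
SAMPLING_RATES_HZ = (32_000, 44_100, 48_000)


def subband_width_hz(sampling_rate_hz, num_subbands=NUM_SUBBANDS):
    """Return the width in Hz of one equal-width subband."""
    return (sampling_rate_hz / 2) / num_subbands


for fs in SAMPLING_RATES_HZ:
    print(f"{fs / 1000:g} kHz -> {subband_width_hz(fs):.1f} Hz per subband")
```

Under these assumptions each subband is 500 Hz wide at 32 kHz, about 689 Hz wide at 44.1 kHz, and 750 Hz wide at 48 kHz.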

7.4 Audio Compression Methods and Standards

The MPEG Layer 2 audio coding method is practically identical to the MUSICAM compression method and was originally developed for DAB (Digital Audio Broadcasting, a new standard allowing digital radio broadcasting over existing FM channels). Later on, the same MPEG Layer 2 was adopted in the framework of the DVB standard for the compression of television sound signals. MPEG Layer 2 also accepts all three sampling rates and processes audio signals in 32 subbands, but it uses more complex compression tools to achieve higher compression ratios.

MPEG Layer 2 is used for the encoding of five channels of a surround-sound system (right, left, center, left surround, and right surround) in a single backward-compatible stereo mix. Namely, one bitstream is coded as a backward-compatible signal carrying all necessary information for the representation of the stereo downmix of a five-channel system. The same bitstream will also contain some additional information, which the simple stereo decoders will not interpret but will simply reject. However, that additional information, called multichannel extension, is aimed at special decoders that can interpret it as the necessary data for the reconstruction of a full multichannel sound image from a stereo downmix.

The main fields of application of the MPEG Layer 2 coding method are

• audio source coding for DAB (Digital Audio Broadcasting, digital radio developed in Europe by a consortium of broadcasters and manufacturers)

• coding method for the exchange of program material in broadcasting operations

• audio source coding for DVB (Digital Video Broadcasting)

• audio coding method for DVDs aimed at the European market (although it seems that the AC 3 compression scheme is becoming dominant in the worldwide DVD market)
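The backward-compatible multichannel scheme described above (a stereo downmix that every decoder understands, plus a multichannel extension that only special decoders use) can be illustrated with a toy model. This is only a sketch of the idea, not the actual bitstream syntax: the `Frame` class, the function names, the choice of which channels travel in the extension, and the downmix gain `MIX_GAIN` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical downmix gain for the center and surround channels
# (an assumption for illustration, not a value taken from the text).
MIX_GAIN = 0.7071


@dataclass
class Frame:
    stereo_downmix: Dict[str, List[float]]  # understood by every decoder
    multichannel_extension: Optional[Dict[str, List[float]]] = None


def encode(ch):
    """Build one frame from five channels: L, R, C, LS, RS."""
    lo = [l + MIX_GAIN * c + MIX_GAIN * s
          for l, c, s in zip(ch["L"], ch["C"], ch["LS"])]
    ro = [r + MIX_GAIN * c + MIX_GAIN * s
          for r, c, s in zip(ch["R"], ch["C"], ch["RS"])]
    extension = {"C": ch["C"], "LS": ch["LS"], "RS": ch["RS"]}
    return Frame({"Lo": lo, "Ro": ro}, extension)


def decode_stereo(frame):
    """A simple stereo decoder: it ignores the extension entirely."""
    return frame.stereo_downmix


def decode_multichannel(frame):
    """A multichannel decoder: it uses the extension to undo the downmix."""
    ext = frame.multichannel_extension
    if ext is None:
        raise ValueError("frame carries no multichannel extension")
    lo, ro = frame.stereo_downmix["Lo"], frame.stereo_downmix["Ro"]
    left = [x - MIX_GAIN * c - MIX_GAIN * s
            for x, c, s in zip(lo, ext["C"], ext["LS"])]
    right = [x - MIX_GAIN * c - MIX_GAIN * s
             for x, c, s in zip(ro, ext["C"], ext["RS"])]
    return {"L": left, "R": right,
            "C": ext["C"], "LS": ext["LS"], "RS": ext["RS"]}
```

In this model both decoders read the same frame: `decode_stereo` returns only the two-channel downmix, while `decode_multichannel` recovers all five channels (up to rounding error) from the downmix and the extension.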

MPEG Layer 3, probably the most widely known under its “MP3” name and for its controversial application for downloading music from the Internet, represents a complex combination of MUSICAM and ASPEC compression tools and allows the achievement of the highest compression ratios. It was specifically developed for all applications where the bandwidth is the most critical element.

In the process of defining the three MPEG layers, one of the starting assumptions was the need to ensure a certain backward compatibility among the three layers. Although interesting in terms of practical applications, such a requirement restricted the use of all available compression tools and the achievement of high-quality audio reproduction at higher compression ratios. That was the reason that pushed the MPEG audio group to continue its work and develop an advanced coding system, AAC (Advanced Audio Coding), which is not compatible with the previously described layers but offers superior quality at lower bit rates. At the same time, AAC supports not only the compression of mono, stereo, and surround-sound systems but also of multichannel ones up to 48 channels.

All compression tools, including those used in layers 1–3 and the newly developed ones, are grouped as modules in three profiles of differing complexity and efficiency:

• the main profile that uses all the tools from the AAC repertoire and consequently has the most complex coder

• the low-complexity (LC) profile that has more modest requirements in terms of memory and processing complexity and is composed in such a way that LC-coded signals can be decoded by main-profile decoders

• the scalable sampling rate (SSR) profile that offers a specific solution based on the splitting of the input audio into four frequency bands and a separate processing of each of these bands (resulting in a separate bitstream). At the decoding side it is possible to select one or more of these bitstreams and obtain reconstructed signals of a reduced bandwidth.
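The bandwidth reduction offered by the last of these profiles can be made concrete with a small calculation. The sketch below assumes that the four bands are of equal width, that together they span 0 Hz to half the sampling rate, and that the decoder keeps the lowest bands first; the function name is illustrative only.

```python
# Minimal sketch: audio bandwidth obtained when only some of the four
# band bitstreams are decoded.
# Assumptions: equal-width bands covering 0 Hz .. fs/2, lowest bands kept first.
NUM_BANDS = 4


def reconstructed_bandwidth_hz(sampling_rate_hz, bands_decoded):
    """Return the reconstructed bandwidth in Hz for 1..4 decoded bands."""
    if not 1 <= bands_decoded <= NUM_BANDS:
        raise ValueError("bands_decoded must be between 1 and 4")
    band_width = (sampling_rate_hz / 2) / NUM_BANDS
    return bands_decoded * band_width


for n in range(1, NUM_BANDS + 1):
    print(n, "band(s):", reconstructed_bandwidth_hz(48_000, n), "Hz")
```

For a 48 kHz input and under these assumptions, decoding one, two, three, or four bands would yield bandwidths of 6, 12, 18, and 24 kHz respectively.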

The advanced characteristics of the AAC compression method and its incompatibility with earlier MPEG coding schemes recommended it for some new applications:

• audio source coding for the latest digital audio broadcasting standard DRM (Digital Radio Mondiale—digital broadcasting standard for AM radio broadcasting)

• audio source coding for ISDB (Integrated Services Digital Broadcasting) standard

• in a somewhat expanded form (featuring some additional tools), the audio compression method selected to accompany MPEG-4 video coding

• possible compression format for a new generation of personal solid-state audio players

The AC 3 coding scheme was developed at the beginning of the 1990s by Dolby Labs as a multichannel coding system for the cinema industry. Since its main aim was to offer a coding method for a multichannel sound display in movie theaters, it was conceived without any restriction imposed by the need to be compatible with some other, previously developed method. Although it uses somewhat different tools, or the same tools in a different manner, the AC 3 coding method is, like all MPEG layers, essentially based on the use of perceptual coding. The AC 3 coding method is considered to be slightly more efficient than MPEG Layer 2, and its main areas of application are

• compression format for the theatrical display of surround-sound films

• audio source coding for the ATSC (Advanced Television Systems Committee) broadcasting standard


As was the case with compressed video signals, it is difficult to issue decisive statements about the respective qualities and efficiencies of the described audio compression methods. Such a task is made even more difficult by the fact that the appraisal of the quality offered by different coding schemes is based on subjective assessment methods. Nevertheless, a number of assessments were made around the world, and some statistical values could be taken as sufficiently representative. These tests, conducted with the necessary rigor and in accordance with internationally recommended methods, showed the following results:

• The MPEG Layer 2 compression method offers near-transparent quality at 256 kbps for a stereo pair, although in practice a 192 kbps bit rate for a stereo pair is frequently used for broadcasting purposes.

• Stereo audio signals compressed in accordance with the AAC coding method are assessed as near transparent for bit rates ranging from 96 to 128 kbps, while a comparable quality for multichannel transmission is achieved at rates ranging from 256 to 320 kbps.

• The same near-transparent quality for stereo signals is achievable by the AC 3 coding scheme at bit rates on the order of 192 kbps, and for multichannel transmissions at 384 to 448 kbps.

• Finally, some credible comparative tests show that for stereophonic signals, the output quality is assessed as equal for MPEG Layer 2, AC 3, and AAC coders at 192, 160, and 96 kbps respectively.
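To put the equal-quality bit rates from the last comparison into perspective, they can be converted into approximate compression ratios relative to uncompressed stereo PCM. The sketch below assumes a 48 kHz, 16-bit stereo source (1536 kbps); that reference is an assumption made for the example and not a figure taken from the tests.

```python
# Minimal sketch: compression ratios implied by the equal-quality bit rates.
# Assumption: the uncompressed reference is 48 kHz, 16-bit, two-channel PCM.
SAMPLING_RATE_HZ = 48_000
BITS_PER_SAMPLE = 16
CHANNELS = 2
PCM_KBPS = SAMPLING_RATE_HZ * BITS_PER_SAMPLE * CHANNELS / 1000  # 1536.0

# Stereo bit rates (kbps) assessed as giving equal output quality.
EQUAL_QUALITY_KBPS = {"MPEG Layer 2": 192, "AC 3": 160, "AAC": 96}


def compression_ratio(coded_kbps, pcm_kbps=PCM_KBPS):
    """Return how many times smaller the coded stream is than the PCM source."""
    return pcm_kbps / coded_kbps


for name, kbps in EQUAL_QUALITY_KBPS.items():
    print(f"{name}: {kbps} kbps, about {compression_ratio(kbps):.1f}:1")
```

With this reference, the three coders correspond to ratios of roughly 8:1, 9.6:1, and 16:1; a different source format (for example 44.1 kHz) would shift these figures slightly.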

At this point it seems necessary to refer to some specifics of the human auditory system (HAS), since in a way these specifics limit the area of possible applications of audio compression methods. The masking effect is the key feature of the psychoacoustic models used in compression systems. However, the results of masking prove to be quite different in the case of mono and stereo reproductions. The effectiveness of masking is based on the presumption that the masking and the masked sound sources are co-sited, that is, share the same physical location. Although it is not the case in real life, in mono reproduction all sounds come from a single source, and the masking effect can be exploited to its full extent. However, in the case of stereo reproduction, the situation is radically different. It is important to recognize that the HAS is capable of discriminating one sound source in the midst of a multitude of other sources and of enhancing its subjective audibility. A stereo sound system creates an artificial audio space where the ear can recognize different locations of different sound sources and therefore, if the HAS decides to use the auditory selectivity to pick one or another, the effect of masking will be considerably reduced. At the same time the HAS is capable of separating main source sounds from sound resulting from multiple reflections (reverberation) and constituting what is usually called ambience. In a real-world environment (say, when listening to live music in a decent acoustical space) the HAS will accept the main sources as the “lead” and the ambience will add fullness, thus enhancing the overall effect. However, in the process of bit-rate reduction, the coder will most frequently decide that the major part of the ambience is under the masking level and will thus simply eliminate it, creating a much “drier” sound reproduction.
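The way a coder can end up discarding ambience may be pictured with a deliberately crude toy model. Real psychoacoustic models work per frequency band and over time; here masking is reduced to a single threshold a fixed number of decibels below the loudest component. The offset value, the sound levels, and the function name are all invented for illustration.

```python
# Toy illustration only: real masking models are frequency and time dependent.
# Hypothetical rule: a component is treated as masked when it lies more than
# MASKING_OFFSET_DB below the loudest component in the scene.
MASKING_OFFSET_DB = 20.0


def keep_audible(components):
    """Return only the components at or above the toy masking threshold.

    components maps a source name to its level in dB.
    """
    threshold = max(components.values()) - MASKING_OFFSET_DB
    return {name: level for name, level in components.items()
            if level >= threshold}


# Invented levels for a lead source, a second source, and the ambience.
scene = {"lead vocal": 80.0, "guitar": 72.0, "hall reverberation": 55.0}
print(keep_audible(scene))
```

In this made-up scene the threshold sits at 60 dB, so the two main sources survive while the reverberation is dropped, which corresponds to the “drier” reproduction described above.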

It is important to stress that the effects described above are not drastic and that they perhaps limit but do not prevent the use of audio compression. Obviously audio compression remains the preferred approach in a number of applications, from digital radio and television transmission systems through some recording applications to the whole span of MP3. However, it explains why in the domain of production of high-quality recording, where the goal is to reproduce as faithfully as possible the real-life listening experience, the use of full EBU/AES digital audio signals is still mandatory. Fortunately, as mentioned earlier, the overall bit rate of such signals is well inside the handling capabilities of standard present-day equipment.

