Software-only implementations of the μ-law and ADPCM algorithms can easily run in real time. A single table lookup can perform μ-law compression or decompression. A software-only implementation of the IMA ADPCM algorithm can process stereo, 44.1-kHz-sampled audio in real time on a 20-MHz 386-class computer. The challenge lies in developing a real-time software implementation of the MPEG/audio algorithm. The MPEG standards document does not offer many clues in this respect. There are much more efficient ways to perform the calculations required by the encoding and decoding processes than the procedures outlined by the standard. As an example, the following section details how the number of multiplies and additions used in a certain calculation can be reduced by a factor of 12.
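To illustrate the table-lookup remark about μ-law, here is a minimal sketch of decompression through a 256-entry table. The article itself gives no code. The sketch assumes the common G.711-style 8-bit layout (bit-inverted code word, one sign bit, a 3-bit exponent, and a 4-bit mantissa), and the names `ulaw_to_linear`, `ULAW_TABLE`, and `decompress` are hypothetical.

```python
# Sketch only: mu-law expansion via a precomputed 256-entry table.
# Assumes the G.711-style code layout; all names here are illustrative.

BIAS = 0x84  # 132, the offset used by the segment encoding


def ulaw_to_linear(code):
    """Expand one 8-bit mu-law code to a signed linear sample."""
    code = ~code & 0xFF            # code words are stored bit-inverted
    sign = code & 0x80             # set for negative samples
    exponent = (code >> 4) & 0x07  # segment number, 0..7
    mantissa = code & 0x0F         # position within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


# Built once; afterwards each sample costs a single table lookup.
ULAW_TABLE = [ulaw_to_linear(code) for code in range(256)]


def decompress(data):
    """Decompress a bytes object of mu-law codes to linear samples."""
    return [ULAW_TABLE[b] for b in data]
```

Compression can be handled the same way with a table indexed by the linear sample value, at the cost of a larger table.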
Figure 9 shows a flow chart for the analysis subband filter used by the MPEG/audio encoder. Most of the computational load is due to the second-from-last block. This block contains the following matrix multiply:
$$S(i) = \sum_{k=0}^{63} Y(k) \times \cos\left[\frac{(2i+1)(k-16)\pi}{64}\right] \quad \text{for } i = 0 \ldots 31.$$
Using the above equation, each of the 32 values of S(i) requires 63 adds and 64 multiplies. To optimize this calculation, note that the M(i,k) coefficients, the cosine terms in the equation, are similar to the coefficients used by a 32-point, unnormalized inverse discrete cosine transform (DCT) given by

$$f(i) = \sum_{k=0}^{31} F(k) \times \cos\left[\frac{(2i+1)k\pi}{64}\right] \quad \text{for } i = 0 \ldots 31.$$

Indeed, S(i) is identical to f(i) if F(k) is computed as follows:

$$F(k) = \begin{cases} Y(16) & \text{for } k = 0; \\ Y(k+16) + Y(16-k) & \text{for } k = 1 \ldots 16; \\ Y(k+16) - Y(80-k) & \text{for } k = 17 \ldots 31. \end{cases}$$
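The equivalence between S(i) and f(i) can be checked numerically. The following is a minimal sketch, not the author's implementation: it evaluates the 64-term sum directly, then evaluates the 32-term inverse-DCT form after folding Y(k) into F(k), and compares the two. The function names are hypothetical, and both sums are written as plain loops rather than a fast DCT.

```python
import math
import random


def matrix_direct(Y):
    """S(i) from the original 64-term sum, for i = 0..31."""
    return [
        sum(Y[k] * math.cos((2 * i + 1) * (k - 16) * math.pi / 64)
            for k in range(64))
        for i in range(32)
    ]


def fold(Y):
    """Fold the 64 values Y(k) into the 32 values F(k)."""
    F = [0.0] * 32
    F[0] = Y[16]
    for k in range(1, 17):
        F[k] = Y[k + 16] + Y[16 - k]
    for k in range(17, 32):
        F[k] = Y[k + 16] - Y[80 - k]
    return F


def matrix_folded(Y):
    """f(i) from the 32-term unnormalized inverse-DCT sum."""
    F = fold(Y)
    return [
        sum(F[k] * math.cos((2 * i + 1) * k * math.pi / 64)
            for k in range(32))
        for i in range(32)
    ]


def max_difference(seed=0):
    """Largest |S(i) - f(i)| for one random 64-point input."""
    rng = random.Random(seed)
    Y = [rng.uniform(-1.0, 1.0) for _ in range(64)]
    return max(abs(s - f)
               for s, f in zip(matrix_direct(Y), matrix_folded(Y)))
```

Note that Y(48) does not appear in F(k): its cosine factor, cos[(2i+1)π/2], is zero for every i, so it contributes nothing to the direct sum either.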
Flow, from top to bottom:
Shift in 32 new samples into the 512-point FIFO buffer, $X_i$.
Window samples: for i = 0 to 511, do $Z_i = C_i \times X_i$.
Partial calculation: for i = 0 to 63, do $Y_i = \sum_{j=0}^{7} Z_{i+64j}$.
Calculate 32 samples by matrixing: $S_i = \sum_{k=0}^{63} Y_k \times M_{i,k}$.
Output 32 subband samples.

Figure 9  Flow Diagram of the MPEG/Audio Encoder Filter Bank

Thus with the almost negligible overhead of computing the F(k) values, a twofold reduction in multiplies and additions comes from halving the range over which k varies. Another reduction in multiplies and additions of more than sixfold comes from using one of many possible fast algorithms for the computation of the inverse DCT [20, 21, 22]. There is a similar optimization applicable to the 64 by 32 matrix multiply found within the decoder's subband filter bank.
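As a worked count of where the factor of 12 comes from, the operation totals for one block of 32 subband samples can be tallied as follows. The direct and folded counts follow from the sums themselves; the final line only restates the "more than sixfold" figure from the text, since the exact count depends on which fast inverse-DCT algorithm is chosen.

$$
\begin{aligned}
\text{direct:} \quad & 32 \times 64 = 2048 \text{ multiplies}, \qquad 32 \times 63 = 2016 \text{ adds} \\
\text{folding } Y(k) \to F(k)\text{:} \quad & 16 + 15 = 31 \text{ adds} \\
\text{32-term sum:} \quad & 32 \times 32 = 1024 \text{ multiplies}, \qquad 32 \times 31 + 31 = 1023 \text{ adds} \\
\text{fast inverse DCT:} \quad & \text{fewer than } 1024 / 6 \approx 171 \text{ multiplies}, \qquad 2 \times 6 = 12\text{-fold overall}
\end{aligned}
$$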
Many other optimizations are possible for both the MPEG/audio encoder and decoder. Such optimizations enable a software-only version of the MPEG/audio Layer I or Layer II decoder (written in the C programming language) to obtain real-time performance for the decoding of high-fidelity monophonic audio data on a DECstation 5000 Model 200. This workstation uses a 25-MHz R3000 MIPS CPU and has 128 kilobytes of external instruction and data cache. With this optimized software, the MPEG/audio Layer II algorithm requires an average of 13.7 seconds of CPU time (12.8 seconds of user time and 0.9 seconds of system time) to decode 7.47 seconds of a stereo audio signal sampled at 48 kHz with 16 bits per sample.
Although real-time MPEG/audio decoding of stereo audio is not possible on the DECstation 5000, such decoding is possible on Digital's workstations equipped with the 150-MHz DECchip 21064 CPU (Alpha AXP architecture) and 512 kilobytes of external instruction and data cache. Indeed, when this same code (i.e., without CPU-specific optimization) is compiled and run on a DEC 3000 AXP Model 500 workstation, the MPEG/audio Layer II algorithm requires an average of 4.2 seconds (3.9 seconds of user time and 0.3 seconds of system time) to decode the same 7.47-second audio sequence.
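The timings above can be restated as ratios of CPU time to audio duration, where a ratio below 1 means the decoder keeps up with real time. The monophonic line is only an estimate: it assumes that decoding one channel costs roughly half as much as decoding two, which the article does not state.

$$
\begin{aligned}
\text{DECstation 5000, stereo:} \quad & 13.7 / 7.47 \approx 1.83 \\
\text{DECstation 5000, mono (assumed half the stereo cost):} \quad & 6.85 / 7.47 \approx 0.92 \\
\text{DEC 3000 AXP Model 500, stereo:} \quad & 4.2 / 7.47 \approx 0.56 \\
\text{speedup between the two machines:} \quad & 13.7 / 4.2 \approx 3.3
\end{aligned}
$$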
Summary
Techniques to compress general digital audio signals include μ-law and adaptive differential pulse code modulation. These simple approaches apply low-complexity, low-compression, and medium audio quality algorithms to audio signals. A third technique, the MPEG/audio compression algorithm, is an ISO standard for high-fidelity audio compression. The MPEG/audio standard has three layers of successive complexity for improved compression performance.
References
1. A. Oppenheim and R. Schafer, Discrete Time Signal Processing (Englewood Cliffs, NJ: Prentice-Hall, 1989): 80-87.
2. K. Pohlman, Principles of Digital Audio (Indianapolis, IN: Howard W. Sams and Co., 1989).
3. J. Flanagan, Speech Analysis Synthesis and Perception (New York: Springer-Verlag, 1972).
4. B. Atal, "Predictive Coding of Speech at Low Rates," IEEE Transactions on Communications, vol. COM-30, no. 4 (April 1982).
5. CCITT Recommendation G.711: Pulse Code Modulation (PCM) of Voice Frequencies (Geneva: International Telecommunications Union, 1972).
6. L. Rabiner and R. Schafer, Digital Processing of Speech Signals (Englewood Cliffs, NJ: Prentice-Hall, 1978).
Digital Technical Journal Vol. 5 No. 2 Spring 1993
Digital Audio Compression
7. M. Nishiguchi, K. Akagiri, and T. Suzuki, "A New Audio Bit Rate Reduction System for the CD-I Format," Preprint 2375, 81st Audio Engineering Society Convention, Los Angeles (1986).
8. Y. Takahashi, H. Yazawa, K. Yamamoto, and T. Anazawa, "Study and Evaluation of a New Method of ADPCM Encoding," Preprint 2813, 86th Audio Engineering Society Convention, Hamburg (1989).
9. C. Grewin and T. Ryden, "Subjective Assessments on Low Bit-rate Audio Codecs," Proceedings of the Tenth International Audio Engineering Society Conference, London (1991): 91-102.
10. J. Tobias, Foundations of Modern Auditory Theory (New York and London: Academic Press, 1970): 159-202.
11. K. Brandenburg and G. Stoll, "The ISO/MPEG Audio Codec: A Generic Standard for Coding of High Quality Digital Audio," Preprint 3336, 92nd Audio Engineering Society Convention, Vienna (1992).
12. K. Brandenburg and J. Herre, "Digital Audio Compression for Professional Applications," Preprint 3330, 92nd Audio Engineering Society Convention, Vienna (1992).
13. K. Brandenburg and J. D. Johnston, "Second Generation Perceptual Audio Coding: The Hybrid Coder," Preprint 2937, 88th Audio Engineering Society Convention, Montreux (1990).
14. K. Brandenburg, J. Herre, J. D. Johnston, Y. Mahieux, and E. Schroeder, "ASPEC: Adaptive Spectral Perceptual Entropy Coding of High Quality Music Signals," Preprint 3011, 90th Audio Engineering Society Convention, Paris (1991).
15. D. Huffman, "A Method for the Construction of Minimum Redundancy Codes," Proceedings of the IRE, vol. 40 (1962): 1098-1101.
16. J. D. Johnston, "Estimation of Perceptual Entropy Using Noise Masking Criteria," Proceedings of the 1988 IEEE International Conference on Acoustics, Speech, and Signal Processing (1988): 2524-2527.
17. J. D. Johnston, "Transform Coding of Audio Signals Using Perceptual Noise Criteria," IEEE Journal on Selected Areas in Communications, vol. 6 (February 1988): 314-323.
18. K. Brandenburg, "OCF - A New Coding Algorithm for High Quality Sound Signals," Proceedings of the 1987 IEEE ICASSP (1987): 141-144.
19. D. Wiese and G. Stoll, "Bitrate Reduction of High Quality Audio Signals by Modeling the Ear's Masking Thresholds," Preprint 2970, 89th Audio Engineering Society Convention, Los Angeles (1990).
20. J. Ward and B. Stanier, "Fast Discrete Cosine Transform Algorithm for Systolic Arrays," Electronics Letters, vol. 19, no. 2 (January 1983).
21. J. Makhoul, "A Fast Cosine Transform in One and Two Dimensions," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-28, no. 1 (February 1980).
22. W.-H. Chen, C. H. Smith, and S. Fralick, "A Fast Computational Algorithm for the Discrete Cosine Transform," IEEE Transactions on Communications, vol. COM-25, no. 9 (September 1977).

Jan B. te Kiefte  Robert Hasenaar  Joop W. Mevius  Theo H. van Hunnik