DOCUMENT COMPILATION
6.6. DESIGN OF DATA COLLECTION, PROCESSING, AND ANALYSIS:
At this point you should be asking yourself: what happens if I use a sampling rate that is lower than twice the maximum frequency component of the audio signal, that is, when fs/2 < fmax? The answer is that you will experience a particular digital artifact called aliasing. The artifact is quite interesting and presents itself as a distorted frequency component that is not part of the original signal. This artifact, caused by under-sampling, may sometimes be intriguing and perhaps useful in the context of musical composition, but when the objective is to acquire an accurate representation of the analog audio signal in the digital domain via sampling, it is undesirable. We will see in more detail how aliasing occurs in Chap. 8 after learning about the frequency domain and the Fourier transform. For now, let’s try to grasp the idea from a more intuitive time-domain approach.
Let’s consider a 1 Hz sine wave sampled at fs = 100 Hz (100 samples per second) as shown in Fig. 4.5. A 1 Hz sine wave sampled at fs = 100 Hz meets the sampling theorem criterion: the Nyquist limit is at 50 Hz, and any sine wave below 50 Hz can be unambiguously represented in the digital domain. So there should be no aliasing artifacts in this particular case (f = 1 Hz). Looking at the plot we clearly see that a 1 Hz sine wave makes one full cycle in 1 second, and that this cycle is represented by 100 samples spaced at the sampling period T = 1/100 sec.
Let’s now increase the frequency of the sine wave to 5 Hz and keep everything else unaltered — the results are shown in Fig. 4.6. Once again we note that T and the number of samples stay unchanged at 1/100th of a second and 100 samples, but since it is a 5 Hz sine wave we will get 5 full cycles in one second rather than just one. Note also that previously we had the full 100 samples to represent one entire cycle of the 1 Hz sine wave and now we only have 100/5 = 20 samples to represent one full 5 Hz cycle.
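The sample counts above follow from dividing the sampling rate by the signal frequency. The following is a minimal Python sketch of that arithmetic, not the MATLAB code the text refers to; the function names `sample_sine` and `samples_per_cycle` are illustrative and do not come from the original.

```python
import math

def sample_sine(f, fs, duration=1.0):
    """Return samples of sin(2*pi*f*t) taken every T = 1/fs seconds."""
    num_samples = int(round(fs * duration))
    return [math.sin(2 * math.pi * f * n / fs) for n in range(num_samples)]

def samples_per_cycle(f, fs):
    """Number of samples available to represent one full cycle of the sine."""
    return fs / f

fs = 100  # sampling rate in Hz, so the sampling period is T = 1/100 s
for f in (1, 5):
    x = sample_sine(f, fs)
    print(f"f = {f} Hz: {len(x)} samples in 1 s, "
          f"{samples_per_cycle(f, fs):g} samples per cycle")
```

Changing `f` while leaving `fs` fixed keeps the total number of samples per second the same and only changes how many of them fall within one cycle.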
Fig. 4.5. Sine wave with f = 1 Hz and fs = 100 Hz.
September 25, 2009 13:32 spi-b673 9in x 6in b673-ch01
22 Introduction to Digital Signal Processing
Fig. 4.7. Sine wave with f = 20 Hz and fs = 100 Hz.
Let’s further increase the frequency to 20 Hz. The results, shown in Fig. 4.7, are as expected. One full cycle of the 20 Hz sine wave is deprived of samples even further: we have a mere 5 samples per cycle for the 20 Hz sine tone. Although it still kind of looks like a sine wave and certainly sounds like one (try it with the MATLAB® code by changing the sampling rate and frequency), it is quickly losing its characteristic sine tone appearance.
Now let’s push the envelope a bit further and bring the frequency up to the edge of the Nyquist limit of 50 Hz (in actuality the sine tone is not as pretty as seen in Fig. 4.8, but for the sake of argument let’s pretend that it is). Note that in Fig. 4.8 the sine tone that is just below 50 Hz has been reduced to a bleak sequence of plus 1.0 and minus 1.0 samples, each pair of polarity changes representing a full cycle of this almost 50 Hz sine tone. I suppose that one could look at one cycle, be really generous, and say that it still possesses the bare minimum features of a sine wave (again, this is not the way it really works, but let’s pretend for just a bit longer and see what the voilà is at the end).
Fig. 4.8. Sine wave with f “almost” 50 Hz and fs = 100 Hz.
We have at this point pretty much stretched our ability to represent a sine wave with a sampling rate of 100 Hz, and a sine wave with any higher frequency would be impossible to represent. At the 50 Hz edge we were using two samples to mimic a sine wave, but any frequency higher than that would mean fewer than two samples, in effect one sample or less, per whole sine cycle (if we follow the trend of decreasing number of samples with increase in frequency). It would not be possible to describe a sine wave with just one sample (and certainly not with 0 samples!), as a sine wave should at least have the characteristic of two oppositely peaking amplitudes. However, if we had a higher sampling rate of, say, 200 Hz, we would be able to represent a 50 Hz or 60 Hz sine wave quite easily, as we would have more samples per second to play with. This is not the case with a 100 Hz sampling rate, where we have now come to a dead end. This is the pivot point where aliasing occurs. To illustrate the artifacts of aliasing, let’s go beyond the Nyquist limit, starting with 50 Hz, and see what the resulting plot actually looks like. This is shown in Fig. 4.9.
Fig. 4.9. Sine wave at f = 50 Hz and fs = 100 Hz.
What just happened? In Fig. 4.9 we note that the 50 Hz sine is exactly the same as a 0 Hz sine, the 95 Hz sine (Fig. 4.10) is an inverted version (or phase shifted by 180°) of the original 5 Hz sine, and the 105 Hz sine (Fig. 4.11) is exactly the same as the 5 Hz sine (no inversion). If we disregard the inversion aspect of the sine waves in the figures, frequencies that are above the Nyquist frequency are literally aliased towards a lower frequency sine wave. This artifact, which results from a process called under-sampling, is referred to as aliasing. The 95 Hz sine has aliased back to the 5 Hz sine (with an inversion of 180° or π), as has the 105 Hz sine (without inversion).
An interesting characteristic of our hearing system is that we do not hear phase differences when listening to a single sine wave, and hence the sampled 5 Hz, 95 Hz, and 105 Hz signals actually sound exactly the same to us even though there are phase shift differences. However, if a signal is composed of a number of sine waves with different phases, the perception of the complex sound is different due to constructive and destructive interference (constructive and destructive interference is discussed in Chap. 4 Sec. 4). Try changing the frequency in MATLAB® to convince yourself that frequencies beyond the Nyquist limit exhibit this aliasing to a lower frequency. The above explanation of aliasing in terms of the decreasing number of samples per period is one way to look at aliasing intuitively until we develop more powerful tools in Chaps. 6 and 8. Figure 4.12 illustrates this idea of aliasing.
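The relationships between the 5 Hz, 50 Hz, 95 Hz, and 105 Hz sines can be checked numerically. The following is a minimal Python sketch rather than the MATLAB code mentioned in the text; the helper names `sample_sine` and `max_abs_diff` and the tolerance `TOL` are assumptions made for this illustration.

```python
import math

def sample_sine(f, fs, num_samples):
    """Sample sin(2*pi*f*t) at t = n/fs for n = 0 .. num_samples-1."""
    return [math.sin(2 * math.pi * f * n / fs) for n in range(num_samples)]

def max_abs_diff(a, b):
    """Largest sample-by-sample difference between two equal-length lists."""
    return max(abs(p - q) for p, q in zip(a, b))

fs = 100   # sampling rate in Hz, Nyquist limit at 50 Hz
N = 100    # one second of samples

s5 = sample_sine(5, fs, N)
s50 = sample_sine(50, fs, N)
s95 = sample_sine(95, fs, N)
s105 = sample_sine(105, fs, N)

TOL = 1e-9  # allowance for floating-point rounding (assumed value)

# 50 Hz sine: every sample lands on a zero crossing, like a 0 Hz sine.
same_as_zero = max_abs_diff(s50, [0.0] * N) < TOL
# 95 Hz sine: sample-for-sample equal to the inverted 5 Hz sine.
inverted_5hz = max_abs_diff(s95, [-v for v in s5]) < TOL
# 105 Hz sine: sample-for-sample equal to the 5 Hz sine.
identical_5hz = max_abs_diff(s105, s5) < TOL

print("50 Hz looks like 0 Hz:", same_as_zero)
print("95 Hz is an inverted 5 Hz:", inverted_5hz)
print("105 Hz is identical to 5 Hz:", identical_5hz)
```

Once sampled, the sequences carry no information that would distinguish the 105 Hz sine from the 5 Hz sine, which is why the aliased component cannot be removed after the fact.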
Fig. 4.11. Sine wave at f = 105 Hz and fs = 100 Hz.
To recap, the frequencies that are beyond the Nyquist limit will always alias downwards to a lower frequency. What makes things very interesting is that musical signals are rarely just made up of a single sine tone, but are rather made up of an infinite number of sine tones. One can imagine how that may contribute to the net cumulative aliasing result when using inappropriate sampling rates.
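To get a feel for how several components alias at once, the frequency at which each sine appears after sampling can be computed with the usual folding rule. This is a sketch under assumptions: the rule is not derived in this section, the function name `alias_frequency` is illustrative, and the 30 Hz tone with four harmonics is a made-up example. Phase inversion is ignored, as in the discussion above.

```python
def alias_frequency(f, fs):
    """Apparent frequency (Hz) of a sine of frequency f after sampling at fs.

    Sampling cannot distinguish f from f + k*fs, so f is first reduced
    modulo fs and then folded into the range 0 .. fs/2.
    """
    f_mod = f % fs
    return min(f_mod, fs - f_mod)

fs = 100                       # sampling rate in Hz
partials = [30, 60, 90, 120]   # hypothetical tone: 30 Hz plus three harmonics

for f in partials:
    print(f"{f} Hz appears at {alias_frequency(f, fs)} Hz")
```

Only the 30 Hz component lies below the Nyquist limit and stays where it is; the others land on unrelated lower frequencies, so the sampled signal no longer has the harmonic structure of the original.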
5 Quantization and Pulse Code Modulation (PCM)
Until now we have concentrated on the frequency content of sound and have paid little attention to what happens to the amplitude value of each sample during the analog-to-digital conversion process. Here too, the concern is how much is lost when the original analog signal is represented in the digital domain. The quality, resolution, and accuracy of the amplitudes of a sampled analog signal are determined by the bit depth, which governs the quantization error. By quantization error I mean the error (ε) between the analog amplitude value and its discrete digitized counterpart, as seen in Eq. (5.1), where n is the sample index.
$$\varepsilon = x(t)\big|_{t=nT} - x[n \cdot T] \tag{5.1}$$
As previously mentioned, audio CD specifications include two channels of audio, each sampled at 44.1 kHz and quantized at 16 bits, which is equivalent to 65,536 (2^16) possible discrete amplitude values. For example, if the minimum and maximum amplitude values are normalized to −1.0 and +1.0, a 16 bit system divides the range between ±1.0 into 65,536 discrete units. When a value falls between two integer levels, say 0.4, which lies between 0 and 1, an integer system with 65,536 integer points commonly uses rounding (adding 0.5 and truncating the fractional part), “floor”ing (truncating the fractional part), or “ceil”ing (always using the next integer value) in the quantization process. The method of using equally spaced amplitude step sizes is referred to as uniform quantization.
Quantizing the amplitude values and using a specific sample rate to store or record data is referred to as PCM (pulse code modulation). It is a somewhat confusing term, as there are really no pulses in a PCM system per se, except perhaps when analyzing the encoded binary structure of the digitized amplitude values. For example, 2,175 in binary 16 bit format is 0000 1000 0111 1111, as illustrated in Fig. 5.1.
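The three rounding rules and the 16 bit word layout described above can be sketched in a few lines of Python. The function names `quantize_to_int` and `to_16bit_word` are illustrative and not from the original, and the sketch assumes non-negative integers for the binary formatting.

```python
import math

def quantize_to_int(value, method="round"):
    """Map a real value to an integer level using one of three rules."""
    if method == "round":
        return math.floor(value + 0.5)   # add 0.5, then drop the fraction
    if method == "floor":
        return math.floor(value)         # drop the fraction
    if method == "ceil":
        return math.ceil(value)          # move up to the next integer
    raise ValueError(f"unknown method: {method}")

def to_16bit_word(n):
    """Format a non-negative integer below 65,536 as four groups of 4 bits."""
    if not 0 <= n < 2 ** 16:
        raise ValueError("value does not fit in an unsigned 16 bit word")
    bits = format(n, "016b")
    return " ".join(bits[i:i + 4] for i in range(0, 16, 4))

for method in ("round", "floor", "ceil"):
    print(f"0.4 with {method}: {quantize_to_int(0.4, method)}")

print("2175 as a 16 bit word:", to_16bit_word(2175))
```

Rounding and flooring both send 0.4 to level 0 while ceiling sends it to level 1, so the choice of rule changes the size and sign of the quantization error.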
Fig. 5.1. PCM: integer value 2,175 in 16 bit word binary format.
Fig. 5.2. Quantization of amplitude values.
Fig. 5.3. Bit resolution and effect on quantization. Better quantization resolution (left), poorer quantization resolution (right).
In Fig. 5.2 we can see the discrete amplitude/time sine wave, an inferior version of the original sine wave. Note that the greater the bit resolution of a digital system, the finer the amplitude grid division on the y-axis, much like the time grid (T), which becomes finer as the sampling rate fs is increased. This is depicted in Fig. 5.3, where the sampling interval is kept constant while the bit resolution is decreased, causing the quantization error to increase: the left plot has a higher quantization resolution and the right plot a lower one, limiting the number of points that can be represented on the amplitude y-axis.
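The effect of bit resolution on the quantization error of Eq. (5.1) can be estimated numerically. This is a minimal sketch under stated assumptions: amplitudes are normalized to ±1.0, the step size is taken as 2/2^bits, rounding is used, and clipping at the top of the range is not handled. The names `quantize_uniform` and `max_quantization_error` are illustrative.

```python
import math

def quantize_uniform(x, bits):
    """Round x to the nearest multiple of the step size 2 / 2**bits."""
    step = 2.0 / (2 ** bits)
    return math.floor(x / step + 0.5) * step

def max_quantization_error(bits, f=1, fs=100):
    """Largest |analog sample - quantized sample| over one second of a sine."""
    samples = [math.sin(2 * math.pi * f * n / fs) for n in range(fs)]
    return max(abs(s - quantize_uniform(s, bits)) for s in samples)

# The sampling interval stays fixed (fs = 100 Hz); only the bit depth changes.
for bits in (3, 8, 16):
    step = 2.0 / (2 ** bits)
    print(f"{bits:2d} bits: step = {step:.6g}, "
          f"max error = {max_quantization_error(bits):.6g}")
```

With rounding, the error never exceeds half a step, so each additional bit halves the worst-case error.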