The Challenging Sonic Aspects of Preventing Aliasing

It is important to filter out any frequencies, including noise, that exist above the Nyquist frequency before sampling. This is accomplished using an analog low-pass filter, which eliminates frequencies above a set point, called the cutoff frequency. This filter, referred to as an anti-aliasing filter, is usually applied around 20 kHz for audio at the standard 44.1 kHz sample rate.

The ideal filter would act as a wall, completely eliminating frequencies above the cutoff frequency and not affecting the frequencies below at all, but no such analog filter exists. Analog filters do not cut off at a straight edge; they roll off gradually. The steepness of this roll-off is set mainly by the order of the filter and is usually given as a slope in dB per octave; it is referred to loosely here as the quality factor, or ‘Q’. In that sense, higher Q filters have steeper roll-offs but are harder to design. Extremely steep filtering is only practical with digital filters, since it is easier to process bits than to manipulate signals with analog components. Digital filters that have a very narrow transition band are known as brickwall filters.

The roll-off of the anti-aliasing filter can create audible effects depending on its quality factor. Higher frequencies that are still within the audible range may have phase alterations that cause a smearing effect referred to as ringing, in which ghost oscillations trail transients, and they can also be attenuated because they fall within the gradual roll-off.

These effects are usually subtle, but they are still an important consideration when it comes to deciding on an optimal sample rate for recording. The difficult aspect of anti-aliasing is designing a filter that will be transparent and work correctly at the same time. The filter must be effective enough at the Nyquist frequency to start pushing the amplitude below the level of hearing, so there is only a narrow space for the filter to roll off if the human hearing range goes all the way up to 20 kHz.

The Process of Oversampling

Oversampling, the process of sampling input at a higher multiple of the target sample rate before subsequent digital conversion, avoids this problem in ADCs by bumping the Nyquist frequency up far from the hearing range and using two filters, an analog filter with a low quality factor and a digital filter. The roll-off of the analog filter can be extended, making it wider and smoother, and the cutoff frequency can be placed high enough to be out of audible range. A digital filter can be cost-effective, and it doesn’t use as much power. It is also more precise than an analog filter and can be used to filter with a very high quality factor.
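As a rough illustration of why the wider transition band helps, the sketch below estimates how many filter poles would be needed between the top of the hearing range and the Nyquist frequency. It assumes an asymptotic slope of about 6 dB per octave per pole and a hypothetical 96 dB attenuation target; the function names and the target are illustrative assumptions, not figures from any particular converter.

```python
import math

def transition_octaves(passband_hz, nyquist_hz):
    """Width of the transition band in octaves."""
    return math.log2(nyquist_hz / passband_hz)

def poles_needed(passband_hz, sample_rate_hz, attenuation_db=96.0):
    """Rough pole count, assuming ~6 dB/octave of asymptotic slope per pole."""
    octaves = transition_octaves(passband_hz, sample_rate_hz / 2)
    return math.ceil(attenuation_db / (6.0 * octaves))

standard = poles_needed(20_000, 44_100)          # Nyquist at 22.05 kHz
oversampled = poles_needed(20_000, 8 * 44_100)   # Nyquist at 176.4 kHz
print(standard, oversampled)
```

Under these assumptions the standard rate would call for well over a hundred poles, while 8x oversampling needs only a handful, which is why a gentle analog filter becomes sufficient.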

In addition to allowing better filtering, this process also reduces the amplitude of quantization noise, which is caused by rounding errors in sample values. The signal-to-noise ratio after digitization is higher than it would be at the standard sample rate because the same total noise is spread out across the entire, wider sampled range, so less of it falls within the audible band. Without noise shaping, the in-band noise power falls in inverse proportion to the oversampling ratio, a gain of about 3 dB for each doubling of the sample rate.
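The size of this noise reduction can be estimated with a short calculation. The sketch below assumes white, uniformly distributed quantization noise and no noise shaping, and uses the textbook 6.02N + 1.76 dB figure for an ideal N-bit quantizer; the function names are hypothetical.

```python
import math

def ideal_snr_db(bits):
    """Textbook SNR of an ideal N-bit quantizer for a full-scale sine."""
    return 6.02 * bits + 1.76

def oversampling_gain_db(ratio):
    """In-band SNR gain from spreading quantization noise over a wider band."""
    return 10.0 * math.log10(ratio)

for ratio in (1, 2, 4, 8):
    total = ideal_snr_db(16) + oversampling_gain_db(ratio)
    print(f"{ratio}x oversampling: about {total:.1f} dB in-band SNR")
```

Each doubling of the oversampling ratio adds roughly 3 dB under these assumptions; real converters typically add noise shaping to do considerably better.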

A linear phase digital anti-aliasing filter is applied to the oversampled input to eliminate frequencies above the standard Nyquist frequency, and the signal is then downsampled to the standard 44.1 kHz sample rate for the final product. This filter delays all frequencies by the same, unnoticeably short amount so that phase coherence, the consistency of the phase difference, is not affected by the filter.
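A minimal sketch of this filter-then-downsample step follows. It assumes a Hann-windowed sinc design with 101 taps and a 4x oversampling ratio, all chosen for illustration; production converters use longer, more carefully optimized filters.

```python
import math

def windowed_sinc_lowpass(num_taps, cutoff_hz, sample_rate_hz):
    """Hann-windowed sinc low-pass; symmetric taps give linear phase."""
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / sample_rate_hz  # normalized cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalize to unity gain at DC

def filter_and_decimate(samples, taps, factor):
    """Convolve with the FIR taps, keeping only every `factor`-th output."""
    out = []
    for i in range(0, len(samples), factor):
        acc = 0.0
        for k, t in enumerate(taps):
            if 0 <= i - k < len(samples):
                acc += t * samples[i - k]
        out.append(acc)
    return out

oversampled_rate = 4 * 44_100
taps = windowed_sinc_lowpass(101, 20_000, oversampled_rate)
# A symmetric FIR delays every frequency by (N - 1) / 2 samples.
delay_seconds = ((len(taps) - 1) / 2) / oversampled_rate
print(f"group delay: {delay_seconds * 1000:.2f} ms")
```

Because the taps are symmetric, the delay is identical at every frequency, and with these illustrative numbers it stays well under a millisecond.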

Filtering out Images

In the same way that an anti-aliasing filter cuts ultrasonic frequencies in an analog signal, an anti-imaging filter cuts ultrasonic frequencies in a digital signal. Strong high frequencies would be caused by the stairstep waveform drawn out by the sample values if it was not interpolated before playback. This low-pass analog filter, also known as the reconstruction filter, recreates the analog signal by using smooth interpolation as described in the Whittaker-Shannon interpolation formula.
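For reference, the Whittaker-Shannon interpolation formula can be written as follows, where x[n] are the sample values, T = 1/f_s is the sampling period, and the symbol names are chosen here for illustration:

```latex
x(t) = \sum_{n=-\infty}^{\infty} x[n] \,\operatorname{sinc}\!\left(\frac{t - nT}{T}\right),
\qquad
\operatorname{sinc}(u) = \frac{\sin(\pi u)}{\pi u}
```

Each sample contributes one sinc pulse centered on its own sampling instant, and the sum of those pulses is the smooth, band-limited waveform that passes through every sample value.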

An oversampling process similar to the one described above for analog to digital conversion happens in reverse in most DACs. Just as oversampling allowed a wider anti-aliasing filter, it also permits a wider reconstruction filter. The digital signal is usually upsampled by a factor of 2, 4, or 8 and digitally filtered before it is converted to analog for playback.

The key aspect of oversampling is that the sample rate of the final input for recording and the final output for playback is not any higher than the standard sample rate. The sample rate is only higher in the transition process when the analog audio is sampled or when the digital audio is being converted back into an analog signal. Recorded material is stored at the standard sample rate. Oversampling provides the benefits of clearer high frequencies and less noise in both input and output. All of the audible frequencies are untouched by both the anti-aliasing and anti-imaging filters, and the quantization noise is lowered.

Oversampling in Digital Plugin Processing

Oversampling can also provide benefits in plugins for digital audio workstation software. Digital plugins that use oversampling process the audio in the same way as discussed above without the analog interpolation step. The oversampling happens internally within the plugins, which sample the input signal at some multiple of its original sample rate, and the audio does not have to be oversampled in the entire DAW.

Some of the upper harmonics created by distortion plugins are above the Nyquist limit, so they are aliased when they are processed at the standard 44.1 kHz sample rate. This aliasing is not harmonically related to the dry signal, so it is noticeable as a sharp, harsh sound. With oversampling, these harmonics are digitally filtered out before downsampling.
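To see why the aliased harmonics are not harmonically related to the dry signal, the sketch below folds each harmonic of a tone back into the range below the Nyquist limit. The 7 kHz fundamental and the function names are illustrative assumptions.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone appears after sampling (folded into 0..Nyquist)."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

def harmonics_after_sampling(fundamental_hz, count, sample_rate_hz):
    """Pairs of (true harmonic, frequency heard after sampling)."""
    return [(k * fundamental_hz, alias_frequency(k * fundamental_hz, sample_rate_hz))
            for k in range(1, count + 1)]

for true_hz, heard_hz in harmonics_after_sampling(7_000, 5, 44_100):
    print(true_hz, heard_hz)
```

At 44.1 kHz the fourth and fifth harmonics (28 kHz and 35 kHz) fold down to 16.1 kHz and 9.1 kHz, neither of which is a multiple of 7 kHz. At four times the rate, the same harmonics sit below the Nyquist limit and can be filtered out before downsampling.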

Digital equalizers map an infinite range of frequencies down to the frequency range within the Nyquist limit, so the filtering of frequencies near the top has a warped response. A low-pass filter set to 18 kHz with a 6 dB per octave slope will almost completely eliminate frequencies at 22 kHz when it should only be slightly attenuating them. Oversampling fixes this issue of asymmetry by providing a much bigger frequency map that easily covers the human range and extends much higher into the ultrasonic range.
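The warping can be put into numbers with a small sketch. It assumes the equalizer digitizes a one-pole analog prototype using the bilinear transform with pre-warping at the cutoff, which is a common design method but not necessarily what any given plugin does; the function names are hypothetical.

```python
import math

def analog_one_pole_db(freq_hz, cutoff_hz):
    """Magnitude of an analog 6 dB/octave low-pass, in dB."""
    return -10.0 * math.log10(1.0 + (freq_hz / cutoff_hz) ** 2)

def bilinear_one_pole_db(freq_hz, cutoff_hz, sample_rate_hz):
    """Same filter after a pre-warped bilinear transform (assumed design method)."""
    warped = math.tan(math.pi * freq_hz / sample_rate_hz)
    warped_cutoff = math.tan(math.pi * cutoff_hz / sample_rate_hz)
    return -10.0 * math.log10(1.0 + (warped / warped_cutoff) ** 2)

analog = analog_one_pole_db(22_000, 18_000)
at_1x = bilinear_one_pole_db(22_000, 18_000, 44_100)
at_8x = bilinear_one_pole_db(22_000, 18_000, 8 * 44_100)
print(round(analog, 1), round(at_1x, 1), round(at_8x, 1))
```

Under these assumptions the analog filter attenuates 22 kHz by only about 4 dB, the 44.1 kHz digital version by more than 30 dB, and the 8x oversampled version lands within a fraction of a decibel of the analog response.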

Sometimes the reconstructed analog waveform exceeds the digital full-scale level in between samples. These inter-sample peaks, which limiter plugins can create, can be detected and reduced by using oversampling. Limiter plugins that use oversampling are sometimes called True Peak limiters.
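A classic worst case illustrates the gap between sample peaks and the true peak: a sine at one quarter of the sample rate whose samples all miss the crest. In the sketch below, evaluating the same sine on a 4x denser grid stands in for oversampling; a real true-peak meter would instead upsample the stored samples with an interpolation filter. All names are hypothetical.

```python
import math

def sampled_sine(num_samples, cycles_per_sample, phase, amplitude=1.0):
    """Samples of a sine at the given normalized frequency and phase."""
    return [amplitude * math.sin(2 * math.pi * cycles_per_sample * n + phase)
            for n in range(num_samples)]

def peak_db(samples):
    """Largest absolute sample value, in dB relative to 1.0."""
    return 20.0 * math.log10(max(abs(s) for s in samples))

# Tone at fs/4 with a 45-degree phase offset: every sample lands near 0.707.
base = sampled_sine(64, 0.25, math.pi / 4)
# Same waveform on a 4x denser grid, which does hit the crest.
dense = sampled_sine(256, 0.25 / 4, math.pi / 4)
overshoot_db = peak_db(dense) - peak_db(base)
print(f"inter-sample overshoot: {overshoot_db:.2f} dB")
```

The sample values understate the true peak by about 3 dB in this case, so a limiter that only looks at the samples could let the reconstructed waveform exceed full scale.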

The problem with using oversampling is that it multiplies the amount of processing that the CPU has to perform, roughly by the oversampling factor, for the same number of plugins. If the usage becomes too high, the buffer size will need to be increased to avoid clicks and pops, which adds latency. The sound quality depends on the accuracy of the upsampling and downsampling processes in oversampling. Also, if you want all of the plugins to use a higher sample rate, not just the ones that support oversampling, you will have to upsample the entire project. Running the project at a higher sample rate can cause a much higher usage of the CPU, and the accuracy of the upsampling determines whether oversampling is worth it for mixing. Setting your project to use 88.2 or 96 kHz will probably make everything sound better, but it is simply because the plugins are processing at a higher sample rate. This could be one cause of the myth that audio at a higher sample rate sounds better.

High Sample Rate Audio

High sample rate playback above 48 kHz does not seem to have any benefits. Files that are sampled at high sample rates are sometimes referred to as “high-definition audio.” This is different from oversampling because oversampling is an internal process designed to fix certain issues in converter devices and mixing software without changing the final sample rate.

High sample rate playback is based on the idea that hearing ultrasonic frequencies will result in a better sound. Most equipment will not be able to play frequencies above 22 kHz, but equipment that can will play back ultrasonic frequencies if they exist in the high sample rate files, and these can create high amounts of intermodulation distortion. Intermodulation distortion is a sideband effect that creates additional frequencies, at the sums and differences of two other frequencies, when those frequencies pass through a non-linear system. Audio systems usually are slightly affected by non-linear response, but not enough for audible intermodulation. Non-linear response is much worse in the higher frequencies that are out of the audible spectrum. If ultrasonic content exists in the audio, it will create audible frequencies that are not harmonically related to the signal, and the modulation will be significantly stronger than the standard amount for audible frequencies. This distortion might not be obvious, but it affects the sound so that it is not as accurate as it would be if it were played at a sample rate that is more attuned to our hearing range. Intermodulation distortion is not a characteristic of the original audio; it is caused when the equipment plays back the ultrasonic frequencies. Our ears cannot hear these frequencies anyway, so their existence in the original audio only degrades what we can hear.
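The mechanism can be demonstrated with a small simulation. The sketch below passes two ultrasonic tones through a hypothetical weakly non-linear playback chain, modeled as y = x + 0.1x², and measures the level of the difference tone. The tone frequencies, the non-linearity, and its coefficient are all illustrative assumptions rather than measurements of real equipment.

```python
import cmath
import math

def tone_amplitude(samples, freq_hz, sample_rate_hz):
    """Amplitude of one frequency component via a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate_hz)
              for n, s in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)

sample_rate = 192_000
num_samples = 1_920        # 10 ms, so every tone falls on an exact DFT bin
f1, f2 = 24_000, 27_000    # two ultrasonic tones (illustrative values)

clean = [math.cos(2 * math.pi * f1 * n / sample_rate)
         + math.cos(2 * math.pi * f2 * n / sample_rate)
         for n in range(num_samples)]
# Hypothetical weakly non-linear playback chain: y = x + 0.1 * x^2
distorted = [x + 0.1 * x * x for x in clean]

before = tone_amplitude(clean, f2 - f1, sample_rate)
after = tone_amplitude(distorted, f2 - f1, sample_rate)
print(f"3 kHz level before: {before:.6f}, after: {after:.6f}")
```

Neither input tone is audible, yet the non-linearity creates a clearly audible 3 kHz component at the difference frequency that was not present in the original signal.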

Positive audible effects from using higher sample rates might be caused by the widened anti-aliasing filter that is also used in oversampling. It is better to oversample since it returns the audio to the standard sample rate after the oversampling process, but higher sample rates retain the negative effect of intermodulation distortion. One way of determining whether the ultrasonic frequencies make a positive difference is to downsample the audio. In theory, if the ultrasonic frequencies improve the sound, the downsampled audio should sound worse. However, the downsampled audio sounds the same, except for the fact that the audio equipment will not produce the intermodulation distortion.