Digital Audio Basics: Audio
Sample Rate and Bit Depth
Learn the basics of digital audio and how a computer handles sound, from audio
sample rate to bit depth.
I remember how eager I was to get into music production. The arrangement
possibilities were endless, and I could learn how to mix music to sound like what I
heard. Unfortunately, in the chaos of beginning to produce, I didn’t learn the basics
of how a computer actually handles audio, so the whole concept of making music
on a laptop felt a bit abstract.
Even bouncing my first track was confusing. What does each of the options do?
How was I supposed to know what would sound best?
In this article, we’ll cover some basic aspects of digital audio, and how they affect
the production process. Today, we’ll focus on audio sample rate and audio bit
depth, as well as a few topics related to them. It’s a bit of theory and a bit of math,
but hopefully it will peel away some of the mystery behind how digital audio
works.
What is digital audio?
Digital audio is a representation of sound recorded or converted into a digital
signal. During the analog to digital conversion process, amplitudes of an analog
sound wave are captured at a specified sample rate and bit depth and converted into
data that computer software can read.
The main difference between sound and digital audio is that digital audio is a series
of discrete amplitude values used to reconstruct the original sound wave, whereas
analog sound is a continuous signal whose amplitude can take any of infinitely many
values at any point in time. Digital audio is like playing connect-the-dots, whereas
real sound is the full original image.
Quantization: analog-to-digital conversion
The analog-to-digital conversion process relies on sampling and quantization, and
it's very similar to the way cameras capture video. A video camera reconstructs a
continuous moment in time by capturing many consecutive images per second, called
frames. The higher the frame rate, the smoother the movie. In digital audio, an
analog-to-digital converter captures thousands of audio samples per second at a
specified sample rate and bit depth to reconstruct the original signal. The higher the
sample rate and bit depth, the higher the audio resolution.
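To make the snapshot idea concrete, here is a minimal sketch that samples a sine wave and rounds each sample to the nearest available integer level. The function name, the 1 kHz test tone, and the signed-integer coding are illustrative assumptions, not how any particular converter is built.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, n_samples):
    """Sample a full-scale sine wave and round each sample to a signed integer code."""
    max_code = 2 ** (bit_depth - 1) - 1  # largest positive code, e.g. 32767 for 16-bit
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                            # time of the nth snapshot
        amplitude = math.sin(2 * math.pi * freq_hz * t)   # ideal value in [-1, 1]
        codes.append(round(amplitude * max_code))         # nearest available level
    return codes

# One millisecond of a 1 kHz tone at 48 kHz, 16-bit: 48 snapshots of one full cycle.
codes = sample_and_quantize(1000, 48000, 16, 48)
```

Each entry in codes is one dot in the connect-the-dots picture: the sample rate sets how many dots there are per second, and the bit depth sets how finely each dot's height can be recorded.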
What is an audio sample rate?
Sample rate is the number of samples per second that are taken of a waveform to
create a discrete digital signal. The higher the sample rate, the more snapshots you
capture of the audio signal. The audio sample rate is measured in kilohertz (kHz)
and it determines the range of frequencies captured in digital audio. In most DAWs,
you’ll find an adjustable sample rate in your audio preferences. This controls the
sample rate for audio in your project.
The options you see in the average DAW—44.1 kHz, 48 kHz—may seem a bit
random, but they aren’t!
Sample rates aren't arbitrary numbers. The computer should be able to recreate
waves with frequencies up to 20 kHz, because humans hear frequencies between
roughly 20 Hz and 20 kHz. To recreate a given frequency, a computer has to
sample at a rate at least double that frequency. So a sample rate of 40 kHz
should technically do the trick, right?
This is true, but you need a pretty powerful—and at one time, expensive—low-pass
filter to prevent audible aliasing. The sample rate of 44.1 kHz technically allows
for audio at frequencies up to 22.05 kHz to be recorded. By placing the Nyquist
frequency outside of our hearing range, we can use more moderate filters to
eliminate aliasing without much audible effect. Most people lose their ability to
hear upper frequencies over the course of their lives and can only hear frequencies
up to 15 kHz–18 kHz. However, this “20-to-20” rule is still accepted as the
standard range for everything we could hear.
This means we can capture and reconstruct the original sine wave’s frequency with
an audio sample rate at least twice its frequency, a rate called the Nyquist rate.
Conversely, a system can capture and recreate frequencies up to half the audio
sample rate, a limit called the Nyquist frequency.
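The two definitions above are mirror images of each other, which a small sketch makes explicit. The function names are just illustrative.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency a system can capture at a given sample rate."""
    return sample_rate_hz / 2

def nyquist_rate(max_freq_hz):
    """Minimum sample rate needed to capture a given highest frequency."""
    return 2 * max_freq_hz

cd_limit_hz = nyquist_frequency(44100)   # 22050.0
hearing_rate_hz = nyquist_rate(20000)    # 40000
```

The gap between 20 kHz and 22.05 kHz is the room a 44.1 kHz system leaves for its anti-aliasing filter to roll off.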
Signals above the Nyquist frequency are not recorded properly by analog-to-digital
converters (ADCs), becoming mirrored back across the Nyquist frequency and
introducing artificial frequencies in a process called aliasing.
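As a rough illustration of that mirroring, the sketch below estimates where a pure tone would land after sampling if no filter were in place. It assumes an idealized sampler and a single sine tone; the function name is hypothetical.

```python
def aliased_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after unfiltered sampling."""
    folded = freq_hz % sample_rate_hz
    # Anything above the Nyquist frequency is reflected back below it.
    return min(folded, sample_rate_hz - folded)

in_band = aliased_frequency(10000, 44100)    # 10000, below Nyquist so unchanged
mirrored = aliased_frequency(30000, 44100)   # 14100, folded back into the audible range
```

A 30 kHz tone sits 7,950 Hz above the 22.05 kHz Nyquist frequency, so it shows up 7,950 Hz below it, at 14.1 kHz.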
To prevent aliasing, analog-to-digital converters are often preceded by low-pass
filters that eliminate frequencies above the Nyquist frequency before audio reaches
the converter. This will prevent unwanted super-high frequencies in the original
audio from causing aliasing. Early filters could taint the audio, but this problem is
being minimized as better technology is introduced.
What sample rate should I record at?
When recording, mixing, and mastering, it's advantageous to work at high sample
rates and bit depths: 48 kHz, 96 kHz, or even 192 kHz. This allows for greater
resolution in all mixing and effects. Once it comes to bouncing down your audio,
however, you'll have to choose a bit depth and sample rate that's compatible with
your medium of distribution.
The standard sample rate for CDs, streaming, and consumer audio is 44.1 kHz.
48 kHz is often used in audio for video, and 96 kHz or 192 kHz is used for archival
audio.
44.1 kHz vs. 48 kHz
If you’re recording music, a standard sample rate is 44.1 kHz or 44,100 samples per
second. This is the standard for most consumer audio, used for formats like CDs.
48 kHz is another common audio sample rate used for movies. The higher sample
rate technically leads to more measurements per second and a closer recreation of
the original audio, so 48 kHz is often used in audio for video, which usually calls
for a big dynamic range.
96 kHz vs. 192 kHz
Given that 192 kHz takes twice as many samples per second as 96 kHz, it will
require double the amount of hard-drive space to store. While using high sample
rates like 96 kHz and 192 kHz will give you the highest resolution audio, it takes a
lot of processing power and the difference is rarely noticeable to the human ear. For
most musical applications, recording at 48 kHz through a good audio interface will
yield excellent results.
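The storage cost is easy to estimate for uncompressed audio. This is a back-of-the-envelope sketch that assumes plain PCM data with no file header or compression, and the one-minute stereo 24-bit example is ours.

```python
def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Approximate size of uncompressed PCM audio, ignoring any file header."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds

# One minute of stereo, 24-bit audio at three sample rates.
size_48k = pcm_size_bytes(48000, 24, 2, 60)     # 17,280,000 bytes
size_96k = pcm_size_bytes(96000, 24, 2, 60)     # 34,560,000 bytes
size_192k = pcm_size_bytes(192000, 24, 2, 60)   # 69,120,000 bytes
```

Every doubling of the sample rate doubles the file size, so a session tracked at 192 kHz needs four times the disk space of the same session at 48 kHz.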
Can you hear the difference between audio sample rates?
Some experienced engineers may be able to hear the differences between sample
rates. However, as filtering and analog/digital conversion technologies improve, it
becomes more difficult to hear these differences.
Is a higher audio sample rate better?
In theory, it’s not a bad idea to work at a higher audio sample rate, like 176.4 kHz
or 192 kHz. The files will be larger, but it can be nice to maximize the sound
quality until the final bounce. In the end, however, the audio will likely be
converted to either 44.1 kHz or 48 kHz. It is mathematically much easier to convert
88.2 kHz to 44.1 kHz and 96 kHz to 48 kHz, so it's best to stay in one family of
rates for the whole project.
However, a common practice is to work in 44.1 kHz or 48 kHz.
If the system was set to a sample rate of 48 kHz and we used a 44.1 kHz audio file,
the system would read the samples faster than it should. As a result, the audio
would sound sped up and slightly higher-pitched. The inverse happens if the system
sample rate is on the 44.1 kHz scale and audio files are on the 48 kHz scale; audio
sounds slowed down and slightly lower-pitched.
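We can put numbers on that speed and pitch change. The sketch below assumes the system plays the file's samples at its own rate without any sample rate conversion (many DAWs convert automatically on import, in which case no shift occurs); the function name is illustrative.

```python
import math

def playback_shift(file_rate_hz, system_rate_hz):
    """Speed factor and pitch shift in semitones when no rate conversion is applied."""
    speed = system_rate_hz / file_rate_hz
    semitones = 12 * math.log2(speed)
    return speed, semitones

# A 44.1 kHz file played by a system running at 48 kHz.
speed_up, pitch_up = playback_shift(44100, 48000)
# A 48 kHz file played by a system running at 44.1 kHz.
slow_down, pitch_down = playback_shift(48000, 44100)
```

The 44.1 kHz file plays about 8.8% faster and roughly one and a half semitones sharp, and the reverse case is flat by the same amount.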
Super-high audio sample rates also have an interesting creative use. If you’ve ever
lowered the pitch of a standard 44.1 kHz audio file, you’ve probably noticed the
highs become somewhat empty. Frequencies above 22.05 kHz were filtered out
before conversion, so there is no frequency content to pitch down, resulting in a
gaping hole in the highs.
However, if this audio were recorded at 192 kHz, for example, frequencies of up to
96 kHz in the original audio would be recorded. This is obviously way outside of
what humans can hear, but pitching the audio down causes these inaudible
frequencies to become audible. As a result, you can greatly drop a recording’s pitch
while preserving high-frequency content.
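A quick calculation shows how much high-frequency content survives a pitch drop. This sketch assumes the pitch change scales every frequency by the same factor, as slowing the audio down does, and the function name is hypothetical.

```python
def highest_frequency_after_pitch_down(sample_rate_hz, octaves_down):
    """Upper limit of recorded content after lowering the pitch by whole octaves."""
    nyquist_hz = sample_rate_hz / 2
    return nyquist_hz / (2 ** octaves_down)

top_44k = highest_frequency_after_pitch_down(44100, 1)     # 11025.0
top_192k = highest_frequency_after_pitch_down(192000, 2)   # 24000.0
```

Dropping a 44.1 kHz recording by one octave leaves nothing above about 11 kHz, while a 192 kHz recording can drop two octaves and still have content across the whole audible range.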
What is audio bit depth?
The audio bit depth determines the number of possible amplitude values we can
record for each audio sample. The higher the bit depth, the more amplitude values
per sample are captured to recreate the original audio signal.
The most common audio bit depths are 16-bit, 24-bit, and 32-bit. Each is a binary
term, representing a number of possible values. Systems of higher audio bit depths
are able to express more possible values:
16-bit: 65,536 values
24-bit: 16,777,216 values
32-bit: 4,294,967,296 values
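These counts come straight from the number of bits: each extra bit doubles the number of values. A short sketch, treating every format as a plain integer count:

```python
def amplitude_values(bit_depth):
    """Number of distinct values an integer sample of this bit depth can hold."""
    return 2 ** bit_depth

values = {bits: amplitude_values(bits) for bits in (16, 24, 32)}
# {16: 65536, 24: 16777216, 32: 4294967296}
```

Going from 16-bit to 24-bit adds only eight bits but multiplies the number of available values by 256.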
Higher bit depths mean higher resolution audio; if the bit depth is too low, some
information of the original audio signal will be lost. With a higher audio bit depth
—and therefore a higher resolution—more amplitude values are available for us to
record. As a result, the continuous analog wave’s exact amplitude is closer to an
available value when sampled. Therefore, a digital approximation of the amplitude
becomes closer to the original fluid analog wave.
Increasing the audio bit depth, along with increasing the audio sample rate, creates
more total points to reconstruct the analog wave.
However, the fluid analog wave does not always perfectly line up with a possible
value, regardless of the resolution. As a result, the last bit in the data denoting the
amplitude is rounded to either 0 or 1, in a process called quantization. This means
there is an essentially randomized part of the signal.
In digital audio, we hear this randomization as a low white noise, which we call
the noise floor. Like the mechanical noise introduced in an analog context or
background noise in a live acoustic setting, digital quantization error introduces
noise into our audio.
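To see how small this rounding error is, here is a simplified sketch that rounds a value between -1 and 1 to a uniform grid and measures the difference. The grid layout and function names are assumptions for illustration; real converters use signed integer codes with a slightly asymmetric range.

```python
def quantize(value, bit_depth):
    """Round a value in [-1.0, 1.0] to the nearest level on a uniform grid."""
    step = 2.0 / (2 ** bit_depth)   # spacing between adjacent levels
    return round(value / step) * step

def quantization_error(value, bit_depth):
    """Distance between the original value and its quantized version."""
    return abs(value - quantize(value, bit_depth))

error_16 = quantization_error(0.3, 16)
half_step_16 = (2.0 / 2 ** 16) / 2
```

The error can never exceed half a step, and since the step shrinks by half with every added bit, so does the worst-case error.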
Harmonic relationships between the sample rate and audio, along with the bit
depth, can cause certain patterns in quantization. This is known as correlated noise,
which we hear as resonances in the noise floor at certain frequencies. Here, our
noise floor is actually higher, taking up potential amplitude values for a recorded
signal.
However, we can perform artificial randomization to make sure these patterns don’t
occur. In a process called dithering, we randomize how this last bit gets
rounded. Because no patterns form, the result is more randomized “uncorrelated
noise” that leaves more potential amplitude values.
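One common way to randomize the rounding is to add a tiny amount of noise before quantizing. The sketch below uses triangular (TPDF) noise spanning plus or minus one step, which is our choice for illustration rather than something the article prescribes, and it works in units where one step equals 1.0.

```python
import random

def quantize_plain(value, step):
    """Round to the nearest level with no dither."""
    return round(value / step) * step

def quantize_dithered(value, step, rng):
    """Add triangular noise of up to one step either way, then round."""
    noise = (rng.random() - rng.random()) * step
    return round((value + noise) / step) * step

rng = random.Random(0)   # fixed seed so the run is repeatable
step = 1.0
quiet = 0.25 * step      # a signal smaller than half a step

plain = quantize_plain(quiet, step)   # always rounds to 0.0, so the signal vanishes
average = sum(quantize_dithered(quiet, step, rng) for _ in range(20000)) / 20000
```

Without dither the quiet signal is rounded away entirely. With dither, individual samples are noisier, but their average lands close to the original 0.25, so the signal survives inside the noise instead of turning into a pattern.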
The amplitude of the noise floor becomes the bottom of our possible dynamic
range. On the other side of the spectrum, a digital system distorts when a signal
exceeds the maximum value the binary system can represent. This level is referred
to as 0 dBFS.
In the end, our audio bit depth determines the number of possible amplitude values
between the noise floor and 0 dBFS.
Can you hear the difference between audio bit depths?
You may be thinking, “Can human ears really tell the difference between 65,536
and 4,294,967,296 amplitude levels?”
This is a valid question. The noise floor, even in a 16-bit system, is incredibly low.
Unless you need more than 96 dB of effective dynamic range, 16-bit is viable for
the final bounce of a project.
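The 96 dB figure follows from the bit depth: the ratio between the largest and smallest representable steps, expressed in decibels. This is a theoretical estimate for integer formats that ignores dither and converter noise.

```python
import math

def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of an integer format, about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bit_depth)

range_16 = dynamic_range_db(16)   # about 96.3 dB
range_24 = dynamic_range_db(24)   # about 144.5 dB
```

Each added bit contributes roughly 6 dB, which is why 24-bit audio offers about 48 dB more range than 16-bit.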
However, while working on a project, it’s not a bad idea to work with a higher
audio bit depth. Because the noise floor drops, you essentially have more room
before distortion occurs—also known as headroom. Having this extra buffer space
before distortion is a good failsafe while working and provides more flexibility.
What should my sample rate and bit depth be?
For music production, try a sample rate of 48 kHz at 24 bits. This strikes a nice
balance between quality, file size, and processing power. However, the right
sample rate and bit depth will ultimately depend on what medium of distribution
you're mastering your audio for.
Summary: sample rate vs bit depth
In summary, sample rate determines the number of snapshots taken to recreate the
original sound wave, while bit depth determines how many possible amplitude values
each of those snapshots can take. Together, bit depth and sample rate
determine audio resolution. You should try producing at the highest values possible
and later bounce your high-fidelity master to a bit depth and sample rate suited for
the intended medium of distribution.