An Introduction to the Concepts of Digital Audio
Digital audio refers to the computer (digital) representation of sound.
To obtain a digital representation of an acoustic (real world) signal:
SOUND ‐‐‐‐> MICROPHONE ‐‐‐‐> LOWPASS FILTER / ADC ‐‐‐‐>
DIGITAL REPRESENTATION / STORAGE (i.e. binary 1011 etc) ‐‐‐‐>
LOWPASS FILTER / DAC ‐‐‐‐> AMPLIFIER / SPEAKER ‐‐‐‐> SOUND
The microphone transduces (i.e. converts energy from one form to another) the
changes in air pressure into analogous changes in electrical voltage. The analog to
digital converter (ADC) converts analog signals (continuous time signals) into digital
signals (discrete time signals). The analog signal is sampled at equally spaced
points in time, a specific number of times per second: the sampling frequency,
measured in Hz / kHz. The closer the space between consecutive samples (i.e. the
higher the sampling rate/frequency), the more the sample sequence retains the
shape of the original sound waveform. The A‐to‐D converter produces finite‐
precision numbers as samples of the input signal (quantized to a limited number of
bits). The digital audio signal is stored on the computer as a series of 1s and 0s
in a digital audio file. Provided this file is not corrupted, edited or otherwise
altered, the digital recording will be identical each time it is played back. The digital
to analog converter (DAC) converts the digital signal into a continuous‐time (analog)
signal. This system takes the finite precision binary numbers in the sequence and
fills in (interpolation) a continuous‐time function between the samples. The
resulting continuous‐time signal is then fed to other systems, such as amplifiers,
loudspeakers, and headphones for conversion into sound.
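The sampling and quantising steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a model of any real converter: the function names `sample_signal` and `quantise`, the 440 Hz test tone and the choice of rounding to signed integers are all invented for the example.

```python
import math

def sample_signal(signal, sample_rate, duration):
    """Evaluate a continuous-time function at equally spaced points in time."""
    num_samples = int(round(sample_rate * duration))
    return [signal(n / sample_rate) for n in range(num_samples)]

def quantise(samples, bits):
    """Round samples in the range -1.0..1.0 to signed integers of the given bit depth."""
    max_level = 2 ** (bits - 1) - 1  # e.g. 32767 for 16 bits
    # Clip to the valid range first, then scale and round to the nearest level.
    return [int(round(max(-1.0, min(1.0, s)) * max_level)) for s in samples]

# Hypothetical input: a 440 Hz sine wave standing in for the microphone voltage.
def tone(t):
    return math.sin(2 * math.pi * 440 * t)

analog_samples = sample_signal(tone, sample_rate=44100, duration=0.01)
digital_samples = quantise(analog_samples, bits=16)

print(len(digital_samples))  # 441 samples in 10 ms at 44.1 kHz
print(digital_samples[:4])   # the first few finite-precision sample values
```

Each value in `digital_samples` is one finite‐precision number of the kind the ADC produces; a real converter also applies the lowpass filter shown in the signal chain before sampling.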
The fidelity of a digital audio signal is determined primarily by two factors: the
sampling rate, which determines the highest frequency that can be represented, and the
quantising accuracy of the individual samples, which determines the dynamic range.
Increasing either or both of these leads to an increase in the amount of digital data
that is required to code this higher resolution representation of the sound.
The higher the sampling rate, the more samples the ADC takes per second and the more
accurate the digital representation of the sound will be. Common sample
frequencies in audio production are 44.1 kHz (CD quality), 48 kHz, 96 kHz
and 192 kHz. The sampling resolution determines the precision of the measuring
scale used to store each sample, e.g. 8‐bit, 16‐bit or 24‐bit.
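To see how sampling rate and resolution affect the amount of data, the following sketch multiplies them out. It assumes uncompressed audio in which every sample is stored in full, and it adds a channel count (two for stereo) that the text above does not discuss. The function name `data_rate_bytes_per_second` is made up for this example.

```python
def data_rate_bytes_per_second(sample_rate, bits_per_sample, channels):
    """Bytes needed to store one second of uncompressed audio."""
    return sample_rate * bits_per_sample * channels // 8

# CD quality: 44.1 kHz, 16-bit, assumed stereo (2 channels).
cd_rate = data_rate_bytes_per_second(44100, 16, 2)
# A higher resolution setting: 96 kHz, 24-bit, assumed stereo.
high_res_rate = data_rate_bytes_per_second(96000, 24, 2)

print(cd_rate)        # 176400 bytes per second
print(high_res_rate)  # 576000 bytes per second
print(cd_rate * 60)   # 10584000 bytes, roughly 10.6 MB per minute
```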
SAMPLING and ALIASING
The Shannon/Nyquist Sampling Theorem* describes how frequently an analog signal
must be sampled in order to retain enough information to reconstruct the original
analog signal from its samples.
It states that: when the sampling rate is greater than twice the highest frequency
contained in the spectrum of the analog signal (continuous‐time, real world signal),
the original signal can be reconstructed exactly from the samples.
If the input sound is a pure tone (i.e. a sound of a single frequency), the
Shannon/Nyquist theorem states that its accurate reconstruction is possible if there
are more than two samples per period (cycle).
The frequency at half the sampling rate is called the Nyquist frequency. Audio CDs
use a sampling rate of 44.1 kHz for storing sound in a digital format: frequencies
higher than the Nyquist frequency of 22.05 kHz cannot be accurately represented.
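The relationship between sampling rate and Nyquist frequency can be written as a small check. This is a minimal sketch; the function names `nyquist_frequency` and `can_be_represented` are illustrative and do not come from any library.

```python
def nyquist_frequency(sample_rate):
    """Half the sampling rate, in the same units as sample_rate."""
    return sample_rate / 2

def can_be_represented(frequency, sample_rate):
    """True if the sampling rate is greater than twice the given frequency."""
    return frequency < nyquist_frequency(sample_rate)

print(nyquist_frequency(44100))          # 22050.0 (Hz), i.e. 22.05 kHz
print(can_be_represented(20000, 44100))  # True
print(can_be_represented(25000, 44100))  # False
```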
[Graph: a 100Hz cosine wave sampled at 2000Hz, with 20 samples taken in each cycle
of the 100Hz wave]
* Harry Nyquist and Claude Shannon were researchers at Bell Telephone
Laboratories who made important contributions to the theory of sampling and
digital communications circa 1920 – 1950.
[Graph: a 100Hz cosine wave sampled at 500Hz, with 5 samples taken in each cycle]
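The sample values behind the two graphs can be regenerated with a short sketch. The helper names `sample_cosine` and `samples_per_cycle` are assumptions made for this example, and the amplitude of 1 is chosen arbitrarily.

```python
import math

def sample_cosine(frequency, sample_rate, num_samples):
    """Sample a unit-amplitude cosine at equally spaced points in time."""
    return [math.cos(2 * math.pi * frequency * n / sample_rate)
            for n in range(num_samples)]

def samples_per_cycle(frequency, sample_rate):
    """How many samples fall within one period of the tone."""
    return sample_rate / frequency

print(samples_per_cycle(100, 2000))  # 20.0
print(samples_per_cycle(100, 500))   # 5.0

# One full cycle of the 100 Hz wave at each sampling rate.
fine = sample_cosine(100, 2000, 20)
coarse = sample_cosine(100, 500, 5)
print([round(v, 3) for v in coarse])
```

With 20 samples per cycle the sequence follows the shape of the cosine closely; with 5 it is much coarser, although 500Hz is still above twice the tone's frequency.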
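Aliasing occurs when not enough samples have been taken to correctly represent
the sound. When the sampling frequency is less than twice the highest frequency in
the sound, the signal is undersampled, and components above the Nyquist frequency
appear in the sampled signal as spurious lower frequencies (aliases).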
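A small numerical experiment illustrates undersampling. In this sketch a 100Hz cosine is sampled at only 150Hz, a rate chosen purely for illustration, and the resulting samples turn out to be indistinguishable from those of a 50Hz cosine. The helper `alias_frequency` is a hypothetical name and applies to a single pure tone.

```python
import math

def alias_frequency(frequency, sample_rate):
    """Apparent frequency (between 0 and sample_rate / 2) of a sampled pure tone."""
    folded = frequency % sample_rate
    return min(folded, sample_rate - folded)

def sample_cosine(frequency, sample_rate, num_samples):
    """Sample a unit-amplitude cosine at equally spaced points in time."""
    return [math.cos(2 * math.pi * frequency * n / sample_rate)
            for n in range(num_samples)]

sample_rate = 150  # less than twice 100 Hz, so the tone is undersampled
apparent = alias_frequency(100, sample_rate)

undersampled = sample_cosine(100, sample_rate, 12)
alias = sample_cosine(apparent, sample_rate, 12)

print(apparent)  # 50
# The two sample sequences agree to within floating-point rounding.
print(all(abs(a - b) < 1e-9 for a, b in zip(undersampled, alias)))  # True
```

Once the samples have been taken, nothing in them distinguishes the 100Hz tone from the 50Hz one, which is why the lowpass filter in the signal chain is placed before the ADC.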
As previously mentioned the digital representation of a sound consists of a series of
numbers (samples) taken at regular time intervals (sampling period), with each
number/sample representing the instantaneous amplitude of the signal at that
moment. The resolution of these samples determines the overall dynamic range of
the sound and also affects the audio quality. The number of bits (i.e. binary digits,
1s and 0s) used to represent each sample value determines how accurately the
amplitude of the signal is measured. For example, 8‐bit samples allow 256
(2^8) possible values and 16‐bit samples allow 65536 (2^16) values. Using more bits
per sample increases the number of distinct amplitude values that are available to
represent the sound signal.
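The number of available values doubles with each extra bit, which the sketch below tabulates. It also estimates the dynamic range using the commonly quoted approximation of about 6 dB per bit; that formula is an addition for illustration and is not stated in the text above. Both function names are hypothetical.

```python
import math

def quantisation_levels(bits):
    """Number of distinct values an n-bit sample can take."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Approximate ratio of the full scale to one quantisation step, in decibels."""
    return 20 * math.log10(quantisation_levels(bits))

for bits in (8, 16, 24):
    print(bits, quantisation_levels(bits), round(dynamic_range_db(bits), 1))
# 8 256 48.2
# 16 65536 96.3
# 24 16777216 144.5
```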