Many DJs know all about different types of music, but it’s not too uncommon to find DJs who know absolutely nothing about how sound actually works. In this two-part series, we get a little nerdy and offer an introduction to basic audio electrical engineering for DJs. At the end of it, we hope you’ll understand a bit better what’s going on “under the hood” of your DJ hardware! This first part covers how analog sound becomes digital audio – and what sample and bit rates really are.
This two-part article series was crafted by guest contributor Ryan Mann, an Energy Engineering student at UC Berkeley.
To start things out, let’s talk about the difference between the analog sound coming out of speakers, and the digital music files in your computer:
Sound itself is the propagation of pressure waves through a compressible medium. These acoustic waves correspond to continuous, smooth analog electrical signals: a microphone produces such a signal from sound, and a speaker turns such a signal back into sound. These signals can be decomposed (using something called the Fourier Transform) into a combination of sinusoids with different frequencies, amplitudes, and phases.
Frequency is the number of cycles per second (the inverse of each cycle's duration), measured in Hertz (Hz), where 1 Hz is 1 cycle per second. The human ear can hear frequencies between roughly 20 Hz and 20,000 Hz (20 kHz), and each part of your ear's cochlea responds to a different frequency.
Amplitude is the height of the wave, or the maximum pressure during the cycle. Pressure can be expressed in several different units, and the power required to produce a certain amplitude over an area is measured in Watts (W), but volume is usually expressed in decibels (dB), a logarithmic scale that compares a sound's level to a reference level.
Phase is the timing of one wave relative to another. If there are two waves with the same frequency, they may or may not reach their peak at the same time. Phase measures the time offset between the peak of one wave and the peak of the other.
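To make this concrete, here's a minimal Python sketch (the frequencies, amplitudes, and phase offset are arbitrary values chosen purely for illustration) that builds a signal from two sinusoids and then uses the Fourier Transform to recover their frequencies and amplitudes:

```python
import numpy as np

sample_rate = 44100                          # samples per second
t = np.arange(sample_rate) / sample_rate     # one second of time points

# A 440 Hz sinusoid at amplitude 1.0, plus an 880 Hz sinusoid
# at amplitude 0.5, a quarter-cycle out of phase.
signal = (1.0 * np.sin(2 * np.pi * 440 * t)
          + 0.5 * np.sin(2 * np.pi * 880 * t + np.pi / 2))

# The Fourier Transform decomposes the signal back into its sinusoids.
spectrum = np.abs(np.fft.rfft(signal)) / (len(signal) / 2)
peaks = np.argsort(spectrum)[-2:]            # two strongest frequency bins
print([int(p) for p in sorted(peaks)])       # -> [440, 880]
```

Because the signal is exactly one second long, each FFT bin is exactly 1 Hz wide, so the peak bins land right on 440 and 880, with magnitudes matching the original amplitudes.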
Digital Music Files are not stored as continuous analog signals, but as a long sequence of binary digits (1s and 0s) that a computer can read. Each 1 or 0 is called a “bit,” and 8 bits make up a “byte.”
Analog-to-Digital Conversion (ADC)
Analog sound is converted to a digital file when recording live instruments, or when recording a mix from an analog mixer into your computer. The Analog-to-Digital Conversion (ADC) process occurs as follows:
- There is some sort of physical phenomenon to be measured – in this case, sound (or pressure).
- A sensor (in this case, a microphone) converts the sound into a continuous voltage signal.
- The ADC sound card samples this signal, sorts the sampled value into a “bin”, and records this value as binary “bits.”
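As a rough sketch of those three steps, here is a toy ADC in Python (the function name and the test signal are made up for illustration; a real converter is an analog circuit, not software):

```python
import numpy as np

# Toy ADC: sample a continuous signal at a fixed rate, then sort each
# sample into one of 2**bits amplitude "bins" recorded as integer codes.
def adc(analog, sample_rate, duration, bits):
    t = np.arange(int(sample_rate * duration)) / sample_rate
    samples = analog(t)                 # step 2: the sensor's voltage signal
    levels = 2 ** bits                  # step 3: number of quantization bins
    # Map the [-1, 1] voltage range onto integer codes 0 .. levels-1.
    codes = np.round((samples + 1) / 2 * (levels - 1)).astype(int)
    return codes

# 10 ms of a 440 Hz sine, captured by a 4-bit converter at 44.1 kHz.
codes = adc(lambda t: np.sin(2 * np.pi * 440 * t), 44100, 0.01, 4)
print(codes.min(), codes.max())         # a 4-bit ADC only has codes 0..15
```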
Digital-to-Analog Conversion (DAC)
For the opposite process, audio starts as a music file stored as digital bits. A DAC soundcard (like in a mixer, DJ controller, etc.) converts the digital voltage values into a slightly “choppy” continuous signal (Author’s update: as commenters have noted, DAC output is not “slightly choppy” – the DAC reconstructs the signal from sinusoids, so it should be perfectly smooth). Then, an actuator (here, a speaker) converts the signal into sound.
The quality of the output is determined by the quality of the original file, and the resolution of the DAC in the soundcard – for example:
- playing a YouTube rip on a cheap Chromebook laptop might mean a 96 kbps file and a 16-bit, 48 kHz soundcard
- alternately, playing a lossless FLAC file on the brand-new CDJ-TOUR1 would mean a 1411 kbps song and a 24-bit, 96 kHz Wolfson DAC
The number of bits used in analog-to-digital conversion determines the accuracy with which the digital signal replicates the actual analog value at any given point.
For example, a 1-bit ADC (pictured above) is only capable of recording a 1 (full voltage) or a 0 (no voltage). So anything in between has to get “rounded” into the closest available category, meaning there will be a large jump at the halfway point.
For every bit that’s added, the resolution (number of steps) doubles, so it’s possible to get a nearly perfect representation of the signal with only a relatively small number of bits:
- 1 bit = 2 steps
- 3 bits = 8 steps
- 16 bits = 65,536 steps
- 24 bits = 16,777,216 steps
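A quick sketch of how the quantization error shrinks as the step count doubles with each added bit (the test signal is an arbitrary sine, and the round-trip through integer codes stands in for an ADC/DAC pair):

```python
import numpy as np

# Quantize a test signal at several bit depths and measure the
# worst-case rounding error after converting back.
signal = np.sin(2 * np.pi * np.linspace(0, 1, 1000, endpoint=False))
for bits in (1, 3, 16, 24):
    levels = 2 ** bits
    codes = np.round((signal + 1) / 2 * (levels - 1))   # bin the samples
    rebuilt = codes / (levels - 1) * 2 - 1              # back to [-1, 1]
    error = np.max(np.abs(rebuilt - signal))
    print(f"{bits:>2} bits = {levels:>10,} steps, max error {error:.2e}")
```

Each extra bit halves the worst-case error, which is why 16 or 24 bits is already a nearly perfect representation.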
Time resolution (the sampling rate) is just as important as amplitude resolution (the number of bits). A higher sampling rate means that the original analog signal is replicated more accurately.
But if the sampling rate is too high, it uses a lot of processor power and memory. So what’s the right sampling rate to use? The Nyquist-Shannon sampling theorem states that the sampling rate must be at least twice the highest frequency present in the signal.
In the above diagram, you can see how one sample per cycle would not be enough to capture what’s going on, and two samples per cycle is the minimum for recording the signal accurately. At two samples per cycle, the samples must land at the min and max points of the wave; otherwise the recording will have the right frequency but the wrong amplitude and phase.
Sampling at higher than 2 samples / cycle smooths out the waveform and increases the quality of the audio – but at the cost of more processor power and storage space.
CD-quality audio is recorded at a little higher than the Nyquist minimum sample rate for the highest frequency that people can hear: most people’s hearing tops out at 20 kHz, so the Nyquist minimum is 40 kHz, and CDs are sampled at 44.1 kHz.
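Here's a small sketch of what goes wrong when the Nyquist condition is violated: a 30 kHz tone sampled at 44.1 kHz is indistinguishable from a 14.1 kHz tone, an effect called aliasing (the tone frequency is arbitrary, chosen just to sit above the 22.05 kHz limit):

```python
import numpy as np

sample_rate = 44100
tone = 30000                        # above the 22.05 kHz Nyquist limit
n = np.arange(sample_rate)          # one second of sample indices
sampled = np.sin(2 * np.pi * tone * n / sample_rate)

# The sampled data contains no trace of 30 kHz; the energy "folds"
# down to sample_rate - tone = 14,100 Hz instead.
spectrum = np.abs(np.fft.rfft(sampled))
print(int(np.argmax(spectrum)))     # -> 14100
```

This is why ADCs filter out everything above half the sample rate before sampling: once a frequency has aliased, no later processing can tell it apart from a real 14.1 kHz tone.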
Bitrate + Compression
Most digital audio files are not as high-resolution as a lossless audio file like those found on a CD.
There are a number of different algorithms and file formats used to compress the audio file in order to make it easier to transfer and store. It is possible to reduce file size without losing any information – this is referred to as lossless compression, and is the central mechanism behind both the FLAC audio file format and the fictional Pied Piper algorithm from “Silicon Valley.” However, further file size reduction typically requires lossy compression; the challenge is to reduce the file size while retaining as much audio quality as possible.
In general, this means discarding the parts of the signal the listener is least likely to hear: very high frequencies may be filtered out entirely, and quieter sounds masked by louder ones can be encoded less accurately.
For uncompressed audio, the bitrate of a song (measured in kbps – kilobits per second) equals the bit depth times the sampling rate times the number of channels. For instance, the bitrate of a CD-quality lossless file is 16 bits x 44.1 kHz x 2 stereo channels, or about 1411 kbps. (This math doesn’t apply to compressed codecs – a 320 kbps MP3 is still 16-bit, 44.1 kHz audio.) Some DJs swear by only playing lossless WAV or AIFF files, but it’s generally fine to play 192 kbps VBRs (variable bit rate), 256 kbps AACs (iTunes Store purchases), or 320 kbps MP3s (cheapest Beatport option, most downloads) even on a large club speaker system.
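The arithmetic for the uncompressed case is just multiplication:

```python
# Bitrate of uncompressed CD audio: bit depth x sample rate x channels.
bits, sample_rate, channels = 16, 44100, 2
bitrate_kbps = bits * sample_rate * channels / 1000
print(f"{bitrate_kbps} kbps")   # -> 1411.2 kbps
```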
Recordings that are 128 kbps and below should generally be avoided. This is the sound quality of a YouTube or SoundCloud rip, or an old Napster download. Stay away!
Read more: A DJ’s Guide To Audio Files + Bit Rates
In part two of this series, we’ll cover the inner workings of DJ equipment – how they manipulate digital and/or analog audio, and what filters and EQ knobs do on an electrical level. Stay tuned!
Ryan Mann is a guest contributor to DJ Techtools, and helps to lead elecTONIC, a student group at UC Berkeley dedicated to promoting electronic music.
Hi, really great two-part article – it has given me a lot of useful background information for an electrical engineering project I am working on at college (which I’m hoping you might be able to help with). I am looking to build a digital pitch controller to change the bpm of mp3 files. I guess this involves manipulating a clock before/while the data is going through the DAC. Do you have any idea how I might do this? Some help regarding components and micros would be great. Any reference for a circuit diagram would be brilliant. I think this would make an interesting extension to your articles, as controlling the bpm is the foundation of DJing.
[…] Read Part 1 of this series, covering Analog/Digital Audio and Bit + Sample Rates […]
Great series! I can’t wait for the next one.
Thanks for a great article.
Looking forward to part 2.
Hello, very interesting articles. But you should mention that a DAC is not a neutral reproduction, because it uses electronics that add warmth to the sound – a neutral DAC does not exist. If you use an Esoteric (one of the best DACs, along with Antelope), an Auralic, a TEAC, or an Audio-gd DAC, the same FLAC, WAV, or AIFF does not give the same result (Sabre, Texas Instruments, or other chips) – the same goes for hi-fi equipment. Some DACs oversample for better rendering (TEAC UD-503). And I don’t understand the notion of bits here, because DSD uses 1 bit, no? At 64 fs (1 fs = 44,100 Hz, so 44,100 Hz × 64 = 2.8224 MHz). Sorry for my English, I am French.
Samples represent something different with DSD than they do in PCM, so 1 bit samples in DSD are not comparable to PCM with 1 bit samples.
[…] in an effort to help you in that task, I’ve sourced this great article on how analog sound becomes digital audio. The article also answers any misconceptions you might […]
This article perpetuates misunderstandings of how digital audio works. Sample rate and bit depth do not determine quality past a certain point. Sample rate determines the maximum frequency that can be represented without distortion. Bit depth determines possible dynamic range. That’s it. 16 bits @ 44.1 kHz is enough to represent all that humans can hear. There is no increase in quality past that point. Quality may actually decrease with extreme sample rates like 192 kHz. There are good reasons to record in 24 bits @ 96 kHz, but distributing finished music above 16 bits @ 44.1 kHz just wastes space.
For more thorough explanations, see:
The bit depth of a digital-to-analog converter does not determine the quality of a sound card. A lot more separates good sound cards from lesser ones. The distortion and noise the sound card adds to the signal matter much more than the bit depth. The quality of the converter chips affects that, and so do other aspects of the device’s design that I don’t entirely understand. There are plenty of 24-bit sound cards that don’t sound very good. In my experience, sound cards that cost less than $300 don’t sound great, with the exception of the Focusrite Scarlett sound cards.
There are a few other misleading aspects of this article:
1. Sound is not an analog signal, it is an acoustic signal. Conflating analog electrical signals with pressure waves is confusing.
2. The decibel unit by itself does not measure volume, sound pressure, or anything else. “dB SPL”, the physical unit commonly used to approximate volume, measures sound pressure (not sound energy) relative to the sound pressure approximately equal to the smallest that young, healthy humans can hear at 1 kHz (20 µPa). Without clarifying this, people will get confused when dealing with dB in reference to other values like dBV and dBu. Also worth noting is that sound pressure, a physical quantity that can be easily measured, is not the same as volume, the subjective perception of how loud a sound is.
3. As others have pointed out, it doesn’t make sense to compare bitrates of different codecs as an indicator of quality. It only makes sense to compare different bitrates of the same codec.
Thanks for the feedback, I’ll try to get this corrected.
True wisdom – I approve this message.
16-bit doesn’t offer that much dynamic range… when you set aside 3–4 bits for overhead, you are left with ~60 dB of dynamic range.
As explained in the linked article, the actual dynamic range of 16 bit PCM is around 120 dB, which is wide enough to encode sound as quiet as the hum of a light bulb and sound loud enough to cause pain and hearing damage in less than a minute. Leaving plenty of headroom with 24 bit PCM is helpful for recording, production, mixing, and mastering, but after that it is a waste of space to keep it in 24 bit PCM.
It is a very nice article. I believe that understanding the mechanics of audio is just as important as knowing good music. I look forward to Part 2 of this article.
Great primer, but your bitrate math at the end is off. WAV/AIFF is most certainly 1411 kbps for 16-bit 44.1 kHz stereo UNCOMPRESSED audio. However, this math doesn’t work when dealing with compressed codecs.
Beatport mp3’s are encoded at 320kbps, but these files are most definitely 16-bit 44.1 kHz stereo. They are NOT 10 kHz. In fact MPEG-1 Audio Layer III only allows 32, 44.1, & 48 kHz sampling rates.
I think this article should concentrate on uncompressed signal paths and MP3/AAC/other compression can be saved for a later article.
Thanks, will update the article to remove those bullet points at the end.
Just so I’m clear, kHz indicates the number of sample points on a given waveform correct? So a CD would have 44.1 sample points per cycle, but a 320 mp3 file will reduce this to 10 sample points and a 256 aac file reduces it to only 8 points, and yet it’s still the same to the human ear? And the highest quality audio will go up to 96 sample points? Crazy!
44.1 kHz means 44,100 samples per second. For a very low (20 Hz) frequency, this would be 2,000 samples per cycle. For a very high (20 kHz) frequency, this would only mean a little over 2 samples per cycle.
The sample rate values I gave for the lossy MP3 files are more of an approximation. As Homicide Monkey says below, the MP3 compression process involves more than just reducing the bit depth and sample rate. It starts by identifying and removing the parts of the signal that the listener is least likely to be able to hear, either because it’s at a very high frequency, or because it’s masked by other parts of the song.
FLAC is a lossless compression format. It reduces file size without impacting the information that was captured. Most people actually can’t hear much above 17 kHz. Keep in mind that MP3 and AAC are not the same type of data as a WAV or AIFF file. The lossy file types use complex algorithms prior to sending data to the DAC. These algorithms use psycho-acoustic phenomena to eliminate information that might not be necessary to human perception. The main benefit of lossless over lossy formats is having more information available for additional algorithms like pitch shifting.
This! FLAC is a truly lossless compression. It will reduce the file size while keeping the output bit perfect. So yes, truly lossless compression formats exist (and there are many others for things other than audio).
Thanks – I’ll make sure to correct this.
Great explanation. In the next article can you address the idea that resolution can not be truly regained?
It’s been my belief that if a kick drum becomes a digital sample and then is recorded to a record, it does not magically become an analog sample. Instead it’s an analog recording of a digital sample.
This goes to disprove all the “90’s vinyl records were so much better cause they were analog” argument because all the sounds on those records come from crappy quality digital samples on samplers (many times resamples of other samples).
By that rationale you are never hearing “analog” drums unless you’re sitting in front of someone playing the drums. Otherwise, if anywhere along the signal chain there is a DAC then you’re hearing a non-analog signal. I agree with that too btw, your statement is correct, those 90s vinyl records aren’t analog.
I do use that rationale, but maybe my definition of digital is too strict. I mean, you could get super technical and say tape is digital, since it’s information stored on magnets, which are either a positive or negative charge, the same way information is stored as 1 or 0 on a computer. Of course, this notion tends to bring on many eye-rolls.
Tape is not digital, since the magnetic particles are in flux. If you digitize an analog tape two times, the digitized information will never be 100% identical.
While the resolution of an analog signal might be close to infinite, the resolution of audio equipment is never even close. Systems have a spec called signal to noise ratio (SNR) that exists to roughly describe the “reasonable” resolution of them.
Modern digital AD converters run close to 120 dB SNR (120dB dynamic range is also around the limit of human hearing) when recording at 24 bits, easily being able to capture the dynamic range of most analog systems.
Any argument stating “x is better because analog” is plain ignorance and/or oversimplification. Yes, highend analog sounds good. But so does highend digital.
The signal of a DAC is indeed not choppy, and as long as the source does not contain signals with frequencies higher than half the samplerate, the output is mathematically identical to the input.
A must see to get a better understanding of this is this video tutorial:
Thanks for this Gwen. I learned a lot.
Thanks for the video – I’ll make sure to correct this.
The DAC interpolates between the discrete points to make sure the output is continuous. But the digital data itself is still “choppy,” while analog is not.
It is my understanding that a DAC does not output a “choppy” signal? The freq points are recreated with sinewaves in the DAC. This is a common misconception, often caused by “pixels on a digital image” analogies. Digital audio does not behave like bitmap images
I think you’re right – I was mistaken. See the video Gwen Rolants posted for further explanation.