If you don't understand gain structure, you may be undermining your recordings and mixes without even realising it.
Despite the immense power and flexibility available in modern digital audio workstation software, many people still find that the mixes they craft entirely 'in the box' sound unsatisfying. Why is that? There are obviously many things that can go wrong, and Mike Senior's Mix Mistakes article in SOS September 2011 (/sos/sep11/articles/mix-mistakes.htm) did a better job than I have space for here of exploring many common pitfalls. In this article, I'll focus on one fundamental issue that blights many tracks sent to SOS for help or advice: poor management of levels throughout the signal chain. In other words, poor gain staging.
Why The Problem?
On the face of it, gain staging couldn't be simpler: you ensure that you feed an appropriate level from the first stage of your signal path to the next, and repeat this from the second stage to the third... and so on, all the way from your instruments, mics and preamps to the final stereo mix bus. By 'appropriate', I mean an ample level, which ensures a healthy signal-to-noise ratio (the difference between the wanted signal and the noise floor), while leaving enough headroom that you needn't worry about whether the signal might be clipping.
I've seen plenty of online discussions in which seasoned engineers are shocked by the ignorance of many would-be engineers about managing signal levels. I'm not surprised: to anyone who learned to mix with analogue consoles and outboard gear, this is all second nature, because the gear effectively forced good practice on you.
Analogue Headroom
Even when an engineer is hitting the maximum level on the meters of professional analogue gear, there's bags of headroom left to accommodate stray high-level peaks, or particularly loud passages: if your signal is measuring 0VU on an analogue meter, then, assuming it's a professional-standard device, where 0VU = +4dBu, there'll be at least 20dB of headroom, and often considerably more; the signal can peak well above 0VU and cause no problems. Even with particularly high peak-to-average ratio sources such as a loud snare drum, the peak may be 'gracefully' clipped by the analogue circuitry — in the digital domain, by contrast, clipping produces anharmonic distortion which sounds very ugly.
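To put rough numbers on that headroom, here is a minimal sketch of the dBu arithmetic. The function names are purely illustrative, and the +24dBu clip point is an assumption taken from the figure quoted for a top-notch analogue mixer later in this article; real gear varies.

```python
import math

# 0 dBu is defined as the voltage that dissipates 1 mW in 600 ohms (~0.7746 V RMS).
DBU_REFERENCE_VOLTS = math.sqrt(0.6)


def dbu_to_volts(level_dbu: float) -> float:
    """Convert a level in dBu to RMS volts."""
    return DBU_REFERENCE_VOLTS * 10 ** (level_dbu / 20)


def headroom_db(clip_level_dbu: float, nominal_level_dbu: float = 4.0) -> float:
    """Headroom is simply the gap between the clip point and the nominal level."""
    return clip_level_dbu - nominal_level_dbu


if __name__ == "__main__":
    # Assumed clip point of +24 dBu; 0VU = +4 dBu as stated in the text.
    print(f"0VU (+4 dBu) is about {dbu_to_volts(4.0):.3f} V RMS")
    print(f"+24 dBu is about {dbu_to_volts(24.0):.2f} V RMS")
    print(f"Headroom above 0VU: {headroom_db(24.0):.0f} dB")
```

In other words, a device that clips at +24dBu can pass a signal ten times the voltage of its nominal level before anything nasty happens.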
Furthermore, at one time, much of the signal processing (EQ and sometimes compression) was built into consoles, and the levels leaving one stage had been designed to 'just work' with the next, from a user point of view; you'd have to try very hard to get unacceptably high levels on the main mix bus. Even where outboard processors were employed, they would be plumbed in and the meters calibrated, so that when you were mixing you didn't need to think about anything but mixing. When cheaper 'project' consoles such as the Mackie 8 Buss came along, there might have been a little less headroom than in the Neves, APIs and SSLs of this world, but the situation was essentially the same.
Different In The DAW
In the modern software-based production environment, though, many people don't completely appreciate what appropriate levels are, or even correctly understand the notions of headroom and dynamic range. This is understandable: after all, at least one generation has now grown up trying to teach themselves mixing using no more than a computer, a piece of software and the vast resource of conflicting, variable-quality advice on the Web.
Bad advice aside, I believe there are also several reasons why the DAW software itself leads people to make mistakes. Partly, I think it's due to the huge amounts of 'internal headroom' offered by modern 32- and 64-bit floating-point software: you could theoretically describe a dynamic range of a whopping 1500dB in a 32-bit floating-point system without causing problems. That enables you to apply serious amounts of gain if you wish, at least in theory! In practice, many people, including me, believe that the summing engines in different DAWs don't always produce the same results when summing lots of very high-level signals, and that these differences can be audible. The probable reason, put simply, lies in differences in how the 32- or 64-bit floating-point calculations are rounded to create the 24-bit audio part of the floating-point data.
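For the curious, the 1500dB figure can be sanity-checked from the 32-bit floating-point format itself. The sketch below assumes the standard IEEE 754 single-precision layout (8-bit exponent, 24 bits of effective precision) and ignores subnormal numbers; the constant and function names are illustrative only.

```python
import math

# IEEE 754 single precision: largest finite value and smallest normal value.
FLOAT32_MAX = (2 - 2 ** -23) * 2.0 ** 127
FLOAT32_MIN_NORMAL = 2.0 ** -126


def ratio_to_db(largest: float, smallest: float) -> float:
    """Express an amplitude ratio in decibels."""
    return 20 * math.log10(largest / smallest)


# Range the exponent can span: this is the 'internal headroom'.
float32_range_db = ratio_to_db(FLOAT32_MAX, FLOAT32_MIN_NORMAL)

# Precision available at any one moment: 24 bits (23 stored plus 1 implicit).
mantissa_range_db = 20 * math.log10(2 ** 24)

print(f"float32 exponent range: about {float32_range_db:.0f} dB")
print(f"24-bit precision at any instant: about {mantissa_range_db:.1f} dB")
```

The first number is the enormous range the exponent can slide across; the second is the resolution actually available at any given moment, which is the '24-bit audio part' referred to above.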
More commonly, problems arise when you introduce plug-in effects and processors, because some are just not designed to operate well when presented with very hot signal levels. I'm not referring exclusively to cheap plug-ins, either: there are some fantastic-sounding plug-ins by the likes of Waves, Slate and Sonnox that audibly suffer if you overload them.
Analogue-modelling plug-ins, in particular, seem susceptible to this sort of abuse: they often do a great job of mimicking the sonic behaviour of their hardware equivalent in its normal operating range, but start to fall down when you push them outside that range. My assumption (which I've yet to verify) is that this is because modelling the increasingly complex non-linearities of analogue gear as you feed hotter and hotter levels into it takes up so much DSP power. If the tool is used as intended, there's just no need to model that behaviour accurately, and users will often prefer a more resource-efficient plug-in in any case.
Happily, most plug-ins offer input and output level controls, and often meters too, so you can boost or attenuate at any stage. What this all boils down to is that, despite the apparently forgiving nature of 32-bit and 64-bit floating-point processing, you still need to be aware of and manage the levels at every stage of your mix, and to be aware of the impact these levels have on your software's sonic performance.
Keep It Down!
The good news is that all of these problems can be avoided if you work with audio at sensible levels in the first place! In other words, if you adopt the analogue approach and leave plenty of headroom right the way through your digital signal path, everything should be just fine. Meanwhile, if you find that you need things louder when you're mixing, use the volume knob on your monitor controller, amp or speakers, because it's perfectly possible to increase the loudness of a mix during a mastering process, but it's not possible to rescue an overloaded mix by turning it down. Working in this way, when you find that you need to raise a channel fader, apply an EQ boost or do anything else that adds level, you have room to do so without needing to think about it. In fact, that's a good way of looking at this whole subject: gain-staging is about providing you with a safety buffer.
Why Overcook?
There's nothing inherent in 32-bit and 64-bit DAW systems that encourages people to use ridiculously high-level signals. So why do they? Obviously no-one's told them, or the message hasn't got through, but quite apart from that I can see some factors that might convince someone that it's a good idea. One is about adhering to outdated practice, while the others relate to the DAW software itself.
First, there's the archaic practice of recording levels as hot as possible, in order to maximise the signal-to-noise ratio of the digitally captured audio. This was important in the days of 16-bit digital recording, where the dynamic range of the recording medium was relatively restricted, and when converters didn't apply dither correctly — or even at all! It's certainly not the case now: DAW projects pretty much all default to 24-bit recording and I'm not aware of a modern DAW that doesn't at least offer that option. Digital recordings at 24-bit resolution offer a greater dynamic range than most users need — ie. the digital system's noise floor is so far below the wanted signal that you just don't need to think about it any more. It's easy to achieve an ample signal-to-noise ratio while leaving bags and bags of headroom. This advice is more commonly given now, but it's not that long since I heard more than one music production tutor offer poor advice on this point, so it's well worth repeating!
In case you're unconvinced, let's compare dynamic range in the digital and analogue realms. A top-notch analogue mixer has a noise floor around -95dBu, or maybe a little better, and clips at around +24dBu, giving it a total dynamic range of around 119dB. A typical mid-budget computer interface will deliver much the same dynamic range, and even budget ones will manage 115dB or more. The very best might make 125dB, which is a little better than most really good analogue systems. So, since analogue and digital systems have broadly the same dynamic range, there's no problem in using them in the same way, with the same headroom margins, to deliver the same kind of sound quality.
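The sums behind that comparison are simple enough to sketch. The analogue figures are the ones quoted above; the 6.02N + 1.76dB formula is the textbook dynamic range of an ideal N-bit quantiser with a full-scale sine wave, which no real converter reaches because its analogue circuitry sets the practical limit. The function names are illustrative.

```python
def analogue_dynamic_range_db(clip_dbu: float, noise_floor_dbu: float) -> float:
    """Dynamic range is the distance from the noise floor to the clip point."""
    return clip_dbu - noise_floor_dbu


def ideal_pcm_dynamic_range_db(bits: int) -> float:
    """Theoretical figure for an ideal quantiser (full-scale sine, no dither)."""
    return 6.02 * bits + 1.76


print(f"Analogue mixer (+24 dBu clip, -95 dBu noise): "
      f"{analogue_dynamic_range_db(24.0, -95.0):.0f} dB")
print(f"Ideal 16-bit PCM: {ideal_pcm_dynamic_range_db(16):.1f} dB")
print(f"Ideal 24-bit PCM: {ideal_pcm_dynamic_range_db(24):.1f} dB")
```

The 16-bit figure shows why recording hot once mattered; the 24-bit figure shows that the file format is no longer the bottleneck, so the real-world converter figures quoted above are what count.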
In the case of the DAW software, I can see two major issues. The first is 'shouty' presets. Load a virtual synth in your DAW and run a programmed MIDI loop through it with notes set to a velocity of 100. The chances are that the meters are already into or approaching the red. The sound won't clip its own DAW channel — so it won't cause problems if you're just using the synth to do a live performance, for example — but if you run this patch through some plug-ins, or route multiple such patches through your mix bus and any mix-bus processing, you can easily run into problems.
Many people like to start a track with a software drum machine and overdub other parts as they build a composition and mix, all as part of the same process. If that's you, lesson number one is to use the output-level control on your software synth to turn it down to a sensible level and claim back that headroom! Don't use your mixer faders for this, because they won't affect the level flowing from your synth to any insert processors you add. Again, if this sounds too quiet, turn up your monitors.
Related to this issue is the use of mastered mixes as mix references. Using references is a great way to train your ears, become accustomed to your speakers and help get your mixes into the right sonic ballpark. However, the mastering process strips off the headroom margin that was present during tracking and mixing. Anyone unaware of this might assume that the aim is to produce a raw mix that hits 0dBFS and has an average level up around -6dBFS, or whatever! This is most definitely not the case. Lower the level of your reference CD tracks to regain that headroom and make them more comparable with your mix level.
Misleading Meters?
The second way in which DAW software can mislead is in its digital sample-peak metering. That's used by default in all DAW mixers (although, thankfully, progress is at last being made in meaningful loudness metering on the stereo mix bus). The sample-peak meter indicates the amplitude of the highest audio sample at any moment in time, and provides an approximation of the actual peak level of the reconstructed audio waveform. The approximation is perfectly adequate for use on the stereo bus, or any channel where you plan to send the signal out into the analogue domain — if you're working with a sensible headroom margin — since any amplitude errors are non-critical.
I'm not entirely sure why DAWs feature large sample-peak meters on the main mixer channels at all these days, though, unless you use them while tracking. They're only really of any use when mixing if you fail to leave sufficient headroom in the first place, and they provide you with very little useful information that a couple of LEDs couldn't, yet they take up vast swathes of screen real-estate! It occurs to me that the sample-peak meter is a throwback to the early days of digital recording, when, as I've mentioned, it was necessary to record as 'hot' as possible without clipping — and that no-one has thought to abandon or replace them since that time.
The point is that this sample-peak metering is very different from the metering you'd find on a typical analogue console. Analogue meters, whether they take the form of a moving-coil device or a stack of LEDs, are always 'integrating' types of one sort or another: they're designed to display the average level of a signal calculated over a defined window of time, and the analogue electrical and mechanical components mean that they effectively have a built-in 'attack time'. They're deliberately designed not to react to the briefest transients that would register on a sample-peak meter — in the analogue domain, these are not a problem, because of all the headroom. The VU meters of analogue consoles calculate that average over a relatively long time window (about 300ms), and even though the PPMs beloved of broadcast professionals use a much shorter window (about 10ms), they still don't display true peaks, which is why they've often been referred to colloquially as 'fake peak' or quasi-peak meters. The benefit of both these types of meter — at least in relation to digital sample-peak meters — is that they provide a more effective indication of how close or far away the signal is to/from a suitable nominal level. (Note that this is different from 'loudness' metering of your stereo bus output, discussed in the TC Electronic LM2 review elsewhere in this issue.)
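To illustrate how differently these meter types respond, the rough sketch below builds a quiet 100Hz tone with a 1ms full-on transient at the start, then reads it three ways. It is only an approximation under stated assumptions: a plain RMS over a fixed window stands in for the meter's integration, and it does not model real VU or PPM ballistics. The test signal, the 8kHz sample rate and all names are made up for the example.

```python
import math


def to_db(value: float) -> float:
    """Convert a linear amplitude (1.0 = 0 dBFS) to decibels."""
    return 20 * math.log10(value) if value > 0 else float("-inf")


def sample_peak_dbfs(samples):
    """What a DAW sample-peak meter reports: the largest single sample."""
    return to_db(max(abs(s) for s in samples))


def max_windowed_rms_dbfs(samples, sample_rate, window_seconds):
    """Highest RMS level seen over consecutive windows of the given length."""
    window = round(sample_rate * window_seconds)
    loudest = 0.0
    for start in range(0, len(samples) - window + 1, window):
        block = samples[start:start + window]
        loudest = max(loudest, math.sqrt(sum(s * s for s in block) / window))
    return to_db(loudest)


SAMPLE_RATE = 8000  # low rate chosen only to keep the example quick
signal = [0.1 * math.sin(2 * math.pi * 100 * n / SAMPLE_RATE)
          for n in range(SAMPLE_RATE)]
signal[:8] = [0.9] * 8  # a 1 ms transient at the very start

peak = sample_peak_dbfs(signal)
fast = max_windowed_rms_dbfs(signal, SAMPLE_RATE, 0.010)  # PPM-like window
slow = max_windowed_rms_dbfs(signal, SAMPLE_RATE, 0.300)  # VU-like window

print(f"sample peak: {peak:.1f} dBFS")
print(f"10 ms window: {fast:.1f} dBFS")
print(f"300 ms window: {slow:.1f} dBFS")
```

The sample-peak reading is dominated by the transient, while the longer the averaging window, the closer the reading sits to the level of the underlying tone.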
Metering within DAWs can also mislead if you're unsure exactly what is being metered! Your DAW meters default to showing the level at a certain point in the signal chain, and different DAWs have different defaults: the channel's input signal (ie. before any insert processing) may be being monitored, or the signal could be post-panner (and post-inserts) or post-fader. It's entirely possible for your meters to look OK but for you to be feeding massive levels to the mix bus — if, for example, you've used compression with make-up gain or have applied EQ boosts.
Most DAWs allow you to change where the meter signal is tapped from, and some allow more options than others. What none do, as far as I'm aware, is to indicate the level between the insert slots. If that's what you want, you'll need to use the meters on your plug-in processors and effects, or dedicated metering plug-ins. That's a shame, as you don't have any indication at a glance from your DAW's mixer screen without having the plug-in GUIs open. I've often thought it might be useful to have an (optional) input clip light on each DAW insert slot and at the top of each channel, not as a lazy means of protection, but as an aid to tracking down problems more quickly.
In short, then, when it comes to metering, the digital sample-peak meters of your DAW's mixer window tell you very little. If you leave headroom, just as you would in the analogue domain, you don't really need them at all.
Setting Initial Levels
The advice in this article all boils down to the need to allow headroom on all your DAW channels, but, due to the difference in metering types used in the analogue and digital domains, this can seem more complex in practice than it sounds. You could, if you wished, insert virtual VU or PPM meters on every channel of your mix, but I'm unaware of a DAW that allows you to change the channel metering in the mixer to an averaging type, so there's no way that you can see at a glance what all those insert meters are doing. We have to work with the tools at our disposal, although if you're working entirely in the box you don't need to mimic the analogue signal flow so precisely. You don't even need to think about the headroom of your interface's A-D or D-A converters, except in as much as it interfaces with your monitoring chain (another subject about which you'll find half-baked advice all over the Internet!). However, you can make use of those almost-useless peak meters.
If you take the sound with the highest peak levels and set it so that it peaks at between -12 and -18dBFS, you shouldn't run into problems with plug-ins or summing on the mix bus. If these figures look different to those I discussed in relation to analogue gear, remember that the meters are different, and you're actually leaving about the same 20dB headroom. I wouldn't expect to see peaks on the channel meters reading more than -8 to -10dBFS (and lower is often better). The highest level track in your mix will vary, but in rock and pop it's usually the kick or snare. Set a rough balance of your other tracks in relation to that track and you should be good to go.
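As a sketch of the arithmetic involved in setting that initial level, the snippet below works out how much trim a clip needs for its highest sample to land on a chosen target. The -18dBFS default is just one end of the range suggested above, and the function names and sample values are hypothetical; in practice you would simply use your DAW's clip gain or trim control.

```python
import math


def peak_dbfs(samples):
    """Sample-peak level of a clip, where 1.0 represents 0 dBFS."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def trim_for_target_peak(samples, target_dbfs=-18.0):
    """Gain in dB that moves the clip's peak onto the target level."""
    return target_dbfs - peak_dbfs(samples)


def apply_gain(samples, gain_db):
    """Scale every sample by a gain expressed in dB."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


hot_clip = [0.0, 0.5, -0.9, 0.3]          # made-up samples peaking near 0 dBFS
trim_db = trim_for_target_peak(hot_clip)   # a negative number: attenuation
trimmed = apply_gain(hot_clip, trim_db)

print(f"original peak: {peak_dbfs(hot_clip):.1f} dBFS")
print(f"trim needed:   {trim_db:.1f} dB")
print(f"new peak:      {peak_dbfs(trimmed):.1f} dBFS")
```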
There are various means of setting that initial level and which is best will depend on the DAW you're using. Some allow you to select and boost/attenuate all the clips in the project's arrange window. I find that a good tactic when working only with audio files, as all the mixer controls remain neutral, but it won't take account of any virtual instrument channels. In some DAWs, there's a dedicated gain control at the top of the console channel strips, and you can set each track to the desired level, or gang the controls to preserve the balance while tweaking. In others, you'll find that you need to insert dedicated gain plug-ins in the first insert slot of each channel, and adjust them to set an appropriate signal level. In those cases, linking the controls may be more difficult. In extreme cases, where you have a project in which all the sources are incredibly hot, you might think about batch processing the files by lowering all the faders and bouncing the results in place (freeing up an insert slot in the process), but I wouldn't suggest doing this as part of your routine mix project setup.
You still need to think about your plug-in chain. Bear in mind that shouty presets are offered by many plug-in processors: I've observed plenty of decent plug-ins where the designer has craftily added a dB or two to the output even when it's not processing anything (we're conditioned to think that louder is better). Just aim for broadly the same level coming out of a plug-in as is going into it, and you won't go far wrong. In other words, try to stick close to the 'standard operating level' between plug-ins, which means peak sample-meter readings of around -10dBFS. You can bypass each plug-in in turn to check levels at each stage. Given that you have a little headroom, a wee bit of gain here and there from an EQ boost or something like that won't matter, but don't limit the bejesus out of a source and apply so much make-up gain that you need to drag your channel fader right down.
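One way to picture this is to follow the level through the insert chain one stage at a time. The sketch below is a deliberate simplification that treats every processor as a plain gain change (real compressors and EQs are level- and frequency-dependent), and the stage names, gain values and -10dBFS ceiling are illustrative assumptions based on the figure above.

```python
def trace_chain(input_peak_dbfs, stages, ceiling_dbfs=-10.0):
    """Return (stage name, level after stage, too_hot flag) for each stage.

    `stages` is a list of (name, gain_db) pairs applied in order.
    """
    level = input_peak_dbfs
    report = []
    for name, gain_db in stages:
        level += gain_db  # gains in dB simply add along the chain
        report.append((name, level, level > ceiling_dbfs))
    return report


chain = [
    ("EQ boost", 3.0),
    ("compressor make-up", 8.0),
    ("output trim", -5.0),
]

report = trace_chain(-18.0, chain)
for name, level, too_hot in report:
    flag = "  <-- hotter than the ceiling" if too_hot else ""
    print(f"after {name}: {level:.1f} dBFS{flag}")
```

Note that the final level looks perfectly respectable even though the stage in the middle is running hot, which is exactly the kind of problem a single channel meter can hide.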
Set Faders To Stun
Most DAW faders are designed to have greater 'resolution' around the unity-gain position (their default position in most project templates), in the sense that small fader movements in this region produce small gain changes, while elsewhere in the fader's travel, small movements produce larger gain changes. (So this is really about manual tactile controllability, not 'mix resolution' as such.) If you plan on mixing by tweaking faders, either with a control surface or a mouse, your initial aim is to work towards a static mix balance that leaves all your faders at or around this unity-gain position — simply because it will give you much greater control when you need to tweak the faders as the mix progresses.
To do that, I suggest that when you set your initial channel input levels, you do so with all your faders at or around the -6dB mark. Why -6dB? Well, as the mix progresses, you'll often want to nudge the odd track up here or there to reveal it in more detail at certain spots in the mix. Starting with the fader at -6dB means that when you apply those tiny boosts, the part you're trying hardest to draw the listener's attention to has its fader right in the region where it's most controllable. (If you plan on mixing entirely with mouse-created automation, this tactic is less critical.)
Hybrid Setups
So far, I've assumed that you're doing all your mixing in the box. However, many people like to incorporate some outboard gear in their DAW setup, creating what some people call a 'hybrid' system. For example, I'll often use a hardware compressor on individual channels in my Cubase projects, employing the compressor as an External FX plug-in. In such scenarios, you should treat each channel in much the same way as you would your stereo mix bus: because you're routing your signal through your D-A and A-D converters, you need to avoid clipping like the plague! Note that your DAW's sample-peak meters will not catch 'inter-sample peaks', where the true reconstructed waveform can reach 3dB or more above 0dBFS between the sample values (this is why the 'True-Peak' meter, which is an oversampling meter, was included within the ITU-R BS.1770 loudness metering specifications). If you're only using the project with your own interface, that's less of an issue, because you should be able to hear any problems, but if you're sharing your project with bandmates or anyone else, it could cause headaches.
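A classic way to see an inter-sample peak is a sine wave at a quarter of the sample rate whose samples happen to straddle the crests. The sketch below is only an illustration: rather than implementing a proper reconstruction filter or a BS.1770-style true-peak meter, it evaluates the same underlying sine on a finer time grid to stand in for the reconstructed waveform. All names are hypothetical.

```python
import math


def to_db(value: float) -> float:
    """Convert a linear amplitude (1.0 = 0 dBFS) to decibels."""
    return 20 * math.log10(value) if value > 0 else float("-inf")


def quarter_rate_sine(num_samples, points_per_sample=1, phase=math.pi / 4):
    """A sine at one quarter of the sample rate, offset by 45 degrees.

    With points_per_sample=1 this yields the stored samples; higher values
    evaluate the same waveform between the sample instants.
    """
    return [
        math.sin(2 * math.pi * 0.25 * k / points_per_sample + phase)
        for k in range(num_samples * points_per_sample)
    ]


stored = quarter_rate_sine(64)
scale = 1.0 / max(abs(s) for s in stored)  # normalise: highest sample = 0 dBFS

sample_peak_db = to_db(max(abs(s) * scale for s in stored))
fine = quarter_rate_sine(64, points_per_sample=8)
true_peak_db = to_db(max(abs(s) * scale for s in fine))

print(f"sample peak: {sample_peak_db:+.2f} dBFS")
print(f"peak between samples: {true_peak_db:+.2f} dBFS")
```

Every stored sample sits at or below full scale, so a sample-peak meter shows nothing wrong, yet the waveform the D-A converter has to reproduce rises about 3dB higher in between.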
On With The Mix
Nothing I've described here is rocket science, and neither is it magic — you still need to put the hours and years into learning the craft of mixing. But understanding how to manage your gain structure could make a huge difference to the quality of your mixes — and avoiding fundamental mistakes early on in your mixing career will save you countless wasted hours and a lot of frustration as you learn.
Monitoring & Analogue Summing
Most professional converters are designed to produce +24dBu for a 0dBFS signal, which is the clipping level of most analogue gear. For those who mix 'hot', with average mix levels around -6dBFS, the gear is being fed signals averaging around +18dBu, some 14dB hotter than the +4dBu nominal level it was intended to cope with, and that's why it often sounds hard, brittle and strained. Keeping a 20dB headroom margin in the DAW avoids that problem and won't compromise the digital noise floor, because that is still 95dB lower; in other words, it's roughly aligned with the analogue noise floor. However, note that there's an inherent risk when working like this. If your DAW's analogue monitoring chain is set up expecting signals averaging around -20dBFS, it will come as a nasty assault on the ears if you import and play a mastered CD track peaking at 0dBFS. For this reason, as well as for panic applications if the computer goes into howl-round and freezes, I urge the use of an analogue monitor controller with easily reachable volume and mute controls! Hugh Robjohns
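The alignment described in this box can be sketched as a simple offset between the two scales. This assumes the +24dBu = 0dBFS calibration mentioned above and a +4dBu nominal analogue level; other converters are calibrated differently, and the function names are illustrative.

```python
def dbfs_to_dbu(level_dbfs: float, full_scale_dbu: float = 24.0) -> float:
    """Analogue level produced by a converter for a given digital level."""
    return level_dbfs + full_scale_dbu


def excess_over_nominal_db(level_dbfs: float,
                           full_scale_dbu: float = 24.0,
                           nominal_dbu: float = 4.0) -> float:
    """How far above the nominal analogue operating level a signal sits."""
    return dbfs_to_dbu(level_dbfs, full_scale_dbu) - nominal_dbu


for average_dbfs in (-6.0, -20.0):
    print(f"{average_dbfs:+.0f} dBFS average -> "
          f"{dbfs_to_dbu(average_dbfs):+.0f} dBu, "
          f"{excess_over_nominal_db(average_dbfs):+.0f} dB relative to nominal")
```

With those assumptions, a -6dBFS average maps to +18dBu, which is 14dB above nominal, whereas a -20dBFS average lands right on +4dBu.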