 Elliott Sound Products Project 183 

Signal Detecting Audio Ducking Unit

© 2019, Rod Elliott

Introduction

The term 'ducking' is based on the background signal dropping to a low level when someone needs to make an announcement, and has absolutely nothing to do with ducks.  These systems are commonly used in shopping centre PA systems, and are also common in radio broadcasting studios.  I have no way of knowing how useful readers will find this project, but a search for 'audio ducking circuits' brings up several images from my article on Muting Circuits (but not much else even remotely useful), and a reader wanted to know if any were suitable for ducking.  The answer was "no", because ducking and muting are very different processes.

Most ducking systems are automatic, so they react when a signal is received from a microphone or other source, such as emergency warning messages.  Because most shopping centre PA systems are used to provide what's laughingly referred to as 'background music', it's essential that the level is reduced (or sometimes muted) when there is an announcement.  Because the announcement is likely to be important, no-one wants the 'music' to interfere with what's being said, although it must be said that for some 'messages' not hearing them at all would be preferable.  Of course, you don't have to be involved in shopping centres or radio broadcasting to need a ducking circuit - it's just as useful for podcasts and similar activities.  Some users may find it useful as a sound effect.

There's almost nothing on the Net that describes circuitry suitable for ducking, and in ESP tradition, I like to ensure that readers have access to things that are otherwise obscure, or in some cases, missing altogether.  The circuit shown here will operate with a signal of 10mV (RMS), which will be adequate for the vast majority of applications, assuming that the announcement channel is already equipped with a microphone preamplifier that brings the level up to at least 100mV with normal speech.  It is possible to make it more sensitive - I tested a similar detector down to 1mV, but at this level even tiny amounts of mains hum or other noise will trigger the circuit.

Figure 0 - Ducking Controller In Action

Although the drawing is a simulation, it's fairly close to what you'd see with a dual-trace oscilloscope.  The music plays normally as long as there's no speech detected, and when speech is detected, the music (red trace) is attenuated ('ducked'), allowing the speech to be heard.  The music level during speech should be variable from almost zero up to perhaps 50% of its normal level.  This is always a decision based on the particular requirements of the user.  The detector must be sensitive enough to detect the speech level expected, and while some systems may offer variable attack and release times, the version shown here is fairly basic.  The release time (how quickly the music returns to normal level) is adjustable, but the attack time is fixed at 'fast'.

Depending on the application, the speech level may be less than that of the music, about the same, or much greater.  Again, this depends on the circumstances and the intent.  Sometimes only a small reduction of the music level is needed, while in other cases (such as for emergency announcements) the speech level will be far greater than the background.

Using cheap and readily available parts, the unit will attenuate the background signal almost instantly it receives a signal from the designated channel(s), and by using an LED/LDR attenuator there are no clicks or other noises as it operates.  The attenuation is variable, as is the release time - the period after speech has finished before the background signal is returned to normal.  In most cases, this will only be one or two seconds, but it can be increased if necessary.  The use of an LDR as the attenuator ensures that the switching is 'soft' (a fairly fast turn-on, but a slower and unobtrusive turn-off) to minimise unwanted clicks or pops in the audio.  LDRs also have fairly low distortion, so the partially muted signal will remain clean at any level within the capabilities of the rest of the circuit.


Circuit Description

The method of 'ducking' the background signal is very straightforward.  The 'speech' input is normally silent (most announcement mics have an on/off switch), and the background signal simply passes through R3 and R4 to the mix input of U1.  When a signal is detected on the speech input (by the Figure 2 circuit), the LDR turns on, and bypasses some (or most) of the signal to ground, reducing its level as set by VR1.  The speech signal is passed straight through R2 to the mixer, and isn't attenuated.

The mixer has a nominal gain of -1 (unity gain, inverting) for both signals, but you may notice that the background channel actually has a very small gain (-1.1).  Feel free to change R3 and R4 (the two resistors in series with the background signal) to 11k if the small gain offends you.  Normal (and preferred) signal level for both inputs is around 1V RMS.  The signal inversion is of no consequence, and there's no requirement to use an inverter to restore normal polarity.
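
The gain figures quoted above follow from the usual inverting mixer relationship (feedback resistance divided by input resistance).  The short sketch below reproduces them.  Note that the resistor values are assumptions inferred from the text (22k for R2 and R5, 10k each for R3 and R4), not values read from the schematic, so check them against Figure 1 before relying on the numbers.

```python
# Inverting mixer: the gain for each input is -(feedback R / input R).
# All resistor values below are ASSUMED from the article text, not taken
# from the schematic.

def inverting_gain(r_feedback, r_input):
    """Voltage gain of one input of an inverting summing amplifier."""
    return -r_feedback / r_input

R5 = 22e3        # feedback resistor (assumed)
R2 = 22e3        # speech input resistor (assumed)
R3 = R4 = 10e3   # background input resistors, in series (assumed)

speech_gain = inverting_gain(R5, R2)
background_gain = inverting_gain(R5, R3 + R4)
# Same calculation with 11k used for both background resistors.
background_gain_11k = inverting_gain(R5, 11e3 + 11e3)

print(f"speech gain:               {speech_gain:.2f}")
print(f"background gain (10k+10k): {background_gain:.2f}")
print(f"background gain (11k+11k): {background_gain_11k:.2f}")
```

With these assumed values the background channel works out to a gain of -1.1, and using 11k for both series resistors brings it back to -1.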

The optocoupler can be a commercial type (such as a VTL5C4 or similar), or you can make your own.  Project 200 has detailed instructions for making a DIY LED/LDR optocoupler.  If you need more attenuation, use two LDRs in parallel, with either a single LED for both LDRs (which might be a bit tricky) or two LEDs in series.  The VTL5C4 can get to 125Ω at 10mA LED current.  Operating the LED at higher current reduces the 'on' resistance, but isn't recommended.

Figure 1 - Ducking Control & Mixer

No microphone preamp is shown for the speech input.  If you need one, see Project 66, which is a high performance microphone preamp that can drive the speech input directly.  Otherwise, the speech input would typically be taken from the PA system's mic output for the designated mic input(s).  Multiple mics can be accommodated by using a mixer similar to that shown in Figure 1 (built around U1).  The background 'music' source passes through the Figure 1 circuit before going to the PA amplifier.

Figure 1 shows the ducking circuit and mixer.  The speech signal goes to both circuits, and the pot (VR1) controls the attenuation of the background signal when speech is detected.  The final attenuation depends on the LDR's minimum resistance, but most can get to well below 500Ω without too much trouble.  Some I tested managed 150Ω easily.  VR1 is in series with the LDR, so adjusting it varies the attenuation.  When VR1 is at maximum resistance there is very little attenuation (about 1.1dB), and at minimum resistance there is over 27dB of attenuation.  This is based on using an LDR with a minimum resistance of 200Ω, but if it's lower than this, more attenuation is available.

U1 derives a reference voltage from the 5.1V supply (shown in Figure 2), and it's decoupled by R6 and C2 to ensure minimum noise.  The remainder of the circuit is conventional for single supply circuits.  While you can use a dual supply, it's not necessary for such a simple circuit.  Maximum signal level will be around 2V RMS with the supply voltages shown, and that's more than sufficient for PA amplifiers as used in most installations.

If you need greater attenuation, increase the values of R2, R3, R4 and R5.  If these values are increased to 47k, 22k, 22k and 47k respectively, the maximum attenuation is over 35dB, and it's highly unlikely that more would be necessary.  If you do require more than 35dB (for example to silence the background almost completely), then simply use two LDRs in parallel.  This can achieve better than 40dB attenuation, still assuming LDRs that can only get to 200 ohms.
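
The attenuation figures can be estimated by treating the LDR (plus VR1) as a shunt to ground at the junction of R3 and R4, with R4 feeding the virtual earth of the mixer.  The sketch below is only an approximation under those assumptions: it ignores source impedance, assumes VR1 is at zero resistance, and the 10k values used for the original circuit are inferred from the text rather than taken from the schematic.

```python
import math

def parallel(a, b):
    """Resistance of two resistors in parallel."""
    return a * b / (a + b)

def ducking_attenuation_db(r_in, r_out, r_shunt):
    """Attenuation (dB) caused by a shunt at the junction of r_in and r_out.

    r_in    : resistor from the source to the junction (R3)
    r_out   : resistor from the junction to the virtual earth input (R4)
    r_shunt : LDR 'on' resistance plus any VR1 resistance, junction to ground
    """
    unducked = r_out / (r_in + r_out)      # junction level with no shunt
    loaded = parallel(r_shunt, r_out)      # shunt in parallel with r_out
    ducked = loaded / (r_in + loaded)      # junction level with the shunt
    return 20 * math.log10(unducked / ducked)

LDR_ON = 200.0  # ohms, the minimum resistance assumed in the text

# 22k resistors, one LDR and then two LDRs in parallel.
att_22k_one = ducking_attenuation_db(22e3, 22e3, LDR_ON)
att_22k_two = ducking_attenuation_db(22e3, 22e3, parallel(LDR_ON, LDR_ON))
# 10k resistors are an ASSUMPTION for the circuit as originally drawn.
att_10k_one = ducking_attenuation_db(10e3, 10e3, LDR_ON)

print(f"22k, one LDR:  {att_22k_one:.1f} dB")
print(f"22k, two LDRs: {att_22k_two:.1f} dB")
print(f"10k, one LDR:  {att_10k_one:.1f} dB")
```

The results land close to the figures quoted in the text (roughly 35dB with 22k resistors, a little over 40dB with two LDRs in parallel), so the simple divider model seems to be a reasonable guide when choosing resistor values.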

The circuitry expects that both the speech and background signals are derived from a low impedance source.  It's very uncommon for them to be anything else these days, but you do need to make sure.  The output impedance of the two sources should be less than 1k.  The speech input must be taken from the output of a microphone preamp, as it's not designed to handle a mic signal without amplification.  While I've shown a TL071 opamp, you can use any other type as you prefer.  Given the usage of ducking circuits in general, it's unlikely that you need anything better.  A dual opamp can also be used, with the second half configured as an inverter, and that will provide a balanced output if that's necessary.

The schematic shown above is mono, so only one channel is processed.  Nothing more is needed for PA work, but if the ducking circuit needs to be stereo (video or podcast post-production for example), then simply build two of the Figure 1 circuits.  The LEDs will ideally be in series, and VR1 will be a dual-gang pot, with one gang for each channel.  The background level will be very nearly the same for both channels, because the LDRs are both turned on fairly hard and any level difference will be small.


Signal Detector

The speech/announcement signal detector is shown in Figure 2, and it uses an LM358 dual opamp and a handful of other parts.  The LED is driven by a MOSFET, selected because of the almost infinite input resistance.  This enables the unit to have a programmable time delay before returning the background signal to normal level.  The 2N7000 shown is recommended because it has a threshold voltage of less than 3V and is cheap and readily available.  Virtually any small MOSFET will work just as well, even if the gate threshold is a little higher.  Alternatives are BS170, BS270, VN2222, etc.  The opamp should be an LM358 (or similar) as shown.  Various others will work, but the outputs of most common opamps cannot reach zero volts - the worst case minimum is about 2V - which may leave the MOSFET partly turned on.  The LM358 is recommended because its output voltage goes to zero volts.

The circuit uses a reference voltage line (R13, D1 and C6, nominally +5.1V) to bias the opamp inputs and provide a comparator reference voltage. Since the same supply is used for both, regulation is not required as any variation will be applied both to opamp input and comparator, so the two will track properly over a wide voltage range. Voltages shown are typical - they could vary depending on the actual supply voltage. 12V and 5.1V as shown are nominal, and may be slightly different.

Figure 2 - Audio Detector And LED Switching Circuit

The signal feed is taken from the channel used for announcements, but more than one channel can be accommodated if necessary, typically by using a simple mixer stage of the same form as that shown in Figure 1.  The input voltage is amplified by 100 by U2A, and the output is supplied to the comparator U2B.  When the amplified signal exceeds the comparator threshold of about 0.5V below the reference level (~4.6V), the output of U2B goes high momentarily, charging C7, which turns on Q1 and the LED in the optocoupler.  With no signal present, verify that the voltage at the output of U2A (pin 1) is more positive than the voltage at the non-inverting input of U2B (pin 5).
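
As a rough check of the detector's sensitivity, the gain of 100 and the 0.5V comparator offset quoted above set the smallest input peak that can trip U2B.  The sketch below is an idealised calculation only: it assumes a sine wave input and ignores opamp bandwidth, offsets and noise, all of which affect a real circuit.

```python
import math

GAIN = 100.0           # U2A voltage gain, from the text
THRESHOLD_DROP = 0.5   # volts below the reference where U2B trips, from the text

def min_trigger_peak(gain=GAIN, drop=THRESHOLD_DROP):
    """Smallest input peak voltage (V) that reaches the comparator threshold."""
    return drop / gain

def triggers(v_rms, gain=GAIN, drop=THRESHOLD_DROP):
    """True if a sine wave of v_rms volts swings past the threshold after gain."""
    peak = v_rms * math.sqrt(2)
    return peak * gain > drop

print(f"minimum peak input: {min_trigger_peak() * 1000:.1f} mV")
print(f"10 mV RMS triggers: {triggers(0.010)}")
print(f"1 mV RMS triggers:  {triggers(0.001)}")
```

On these ideal figures a 10mV RMS signal clears the threshold with some margin, while 1mV RMS would need more gain, which agrees with the comments about sensitivity in the introduction.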

The circuit will start to duck the background signal in less than one cycle of audio, and it's not expected that it needs to be any faster.  Because the circuit only operates on the negative half of the waveform, any initial positive signal is ignored.  This could be improved by including a full-wave rectifier, but that makes the circuit considerably more complex, and it's unlikely that the average user would ever notice the difference.  An 'attack' control isn't included because this usually needs to be as fast as possible, and there's no advantage to making it slower than already provided by the LDR (about 10ms for a VTL5C4).

Should it be found that the circuit is too sensitive (due to noise on the 'speech' input for example), increase the value of R11 - this reduces the gain of the amplifier, so more signal will be needed.  Likewise, to increase sensitivity reduce the value of R11 - you could use a 10k trimpot (or a front panel 'sensitivity' control) for a useful sensitivity range.  The comparator is triggered by negative transitions from U2A, so the output of U2A has to fall below the threshold (about 4.6V) for the comparator to produce a high output.  Make sure that the voltage at the MOSFET gate is no more than perhaps 100-200mV or so when the output is supposed to be off.  If the MOSFET turns on even very slightly, the optocoupler may not turn off properly and the background signal will be attenuated.

After the audio signal is removed, it will take some time for C7 to discharge through VR2 and R14, and the time can be varied from 300ms with VR2's wiper at maximum, up to 3.3 seconds with the wiper at minimum.  After the timeout, Q1 will switch off, and the background signal is gently restored to normal level due to the slow release time of the LDR.  The time can be varied by changing C7 - increase it to make the time longer and vice versa.  Because C7 will most likely be an electrolytic type, a low leakage type should be used to ensure the delay time isn't shorter than expected.  Don't use tantalum caps in the circuit, as they are the most unreliable caps ever produced, and I never recommend them for anything.
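
If the release is treated as a simple RC discharge, the delay is proportional to C7, so the quoted 300ms to 3.3 second range scales directly with the capacitor value.  The sketch below only shows that scaling.  It makes no assumption about the actual value of C7 or VR2, and it ignores capacitor leakage and the spread in MOSFET gate threshold, so treat the results as a starting point.

```python
# Release-time range quoted in the text for the unmodified circuit.
RELEASE_MIN_S = 0.3
RELEASE_MAX_S = 3.3

def scaled_release_range(c7_ratio):
    """Return (min, max) release time in seconds after scaling C7.

    c7_ratio is new C7 / original C7.  Assumes a simple RC discharge, so the
    time is directly proportional to the capacitance.
    """
    if c7_ratio <= 0:
        raise ValueError("c7_ratio must be positive")
    return RELEASE_MIN_S * c7_ratio, RELEASE_MAX_S * c7_ratio

shortest, longest = scaled_release_range(2.0)
print(f"C7 doubled: {shortest:.1f}s to {longest:.1f}s")
```

Doubling C7 therefore gives roughly 0.6 to 6.6 seconds, at the cost of a longer charge time when speech is first detected.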

The diode (D2) can be a 1N4148 or 1N4004, whichever is the easiest to find (or is already at hand).  It is not critical, so other types will be just as suitable (I shall leave this to the reader).  As noted above, the MOSFET's gate voltage must fall to less than 1V - ideally zero.  Be careful, because even a small leakage current from the supply to the gate circuit may prevent the circuit from turning off the LED properly.


Multi-Zone Systems

In larger centres where there may be several different zones, a single detector circuit can drive multiple ducking circuits.  Simply duplicate the LED/LDR optocoupler and current limiting resistor (R14), and a single announcement/emergency signal can be used to duck as many separate zones as you need.  The 2N7000 MOSFET is rated for up to 200mA, which could (in theory) mean twenty separate zones.  However, I think that would be unwise, so I'd use no more than ten (100mA maximum current).

Should more be needed (doubtful but possible), either use two (or three) 2N7000s with a common gate signal, or a larger MOSFET.  Extra MOSFETs won't change the release time because the gate draws no current, other than a tiny (typically 10nA, equivalent to ~1.2GΩ) leakage current which is nothing to worry about.  For a larger MOSFET, an IRF540 would work, and that's rated for 28A (does anyone need over 2,000 zones?).

The only other consideration is the driving capability of the announcement channel.  With ten zones, the effective input impedance is about 2.2kΩ which isn't a challenge.  The circuit shown in Figure 1 is duplicated for each separate zone, and a common announcement channel for all zones is assumed.  If the zones are completely separate, then duplicate the entire circuit (Figures 1 and 2).
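
The multi-zone arithmetic is easy to check.  The sketch below works through the LED current, the loading on the announcement channel and the equivalent gate leakage resistance.  It assumes each zone has its own LED and series resistor running at 10mA, and that each zone's speech input looks like 22k (an assumption, inferred from the 2.2kΩ figure for ten zones).  The detector's own input impedance is ignored.

```python
LED_CURRENT_MA = 10          # LED current per zone, from the text
SPEECH_INPUT_OHMS = 22_000   # per-zone speech input resistance (ASSUMED)
GATE_LEAKAGE_A = 10e-9       # typical gate leakage quoted in the text
SUPPLY_V = 12.0              # nominal supply voltage

def total_led_current_ma(zones):
    """Total drain current (mA) with one LED per zone, all in parallel."""
    return zones * LED_CURRENT_MA

def max_zones(rated_ma):
    """Number of zones that would reach the MOSFET's rated current (mA)."""
    return rated_ma // LED_CURRENT_MA

def announcement_load_ohms(zones):
    """Load seen by the announcement channel with the zone inputs in parallel."""
    return SPEECH_INPUT_OHMS / zones

def leakage_equivalent_ohms():
    """Resistance that would draw the gate leakage current from the supply."""
    return SUPPLY_V / GATE_LEAKAGE_A

print(f"10 zones: {total_led_current_ma(10)} mA, "
      f"load {announcement_load_ohms(10) / 1000:.1f}k")
print(f"2N7000 (200 mA) limit: {max_zones(200)} zones")
print(f"IRF540 (28 A) limit:   {max_zones(28_000)} zones")
print(f"leakage equivalent:    {leakage_equivalent_ohms() / 1e9:.1f} G ohm")
```

The theoretical limits are only that.  Staying at half the rated current or less, as suggested above, is the sensible approach.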


Conclusions

There are several applications for this type of circuit, and there are likely to be some I haven't thought of.  The 'speech' signal doesn't have to be speech, as any audio signal source will work the same way, provided its level is greater than 10mV or so.  There is nothing critical about either part of the circuit, provided the circuit is followed closely.  You may find that the detector is too sensitive, and that's easily reduced as described.

About the only thing you may need to change is the maximum attenuation, and because that is determined by the characteristics of the LDR, increasing the resistor values as described will probably be sufficient.  You can also use a pair of LDRs in parallel to reduce the 'on' resistance.  Both can be illuminated by the same LED, which should be a high brightness type for best results.  In some cases you may prefer a longer release time, and that's easily accommodated by increasing the value of C7.  I don't recommend anything above 10µF, as that will increase the attack time (the time it takes to reduce the background level).

The circuit is shown using a single 12V supply, and that can be from a wall transformer supply or other source of 12V.  The voltage can be greater (for example if there is a source of a suitable voltage available from other equipment), and if this is the case the zener voltage can be increased.  It should be around half the supply voltage or slightly less, so if you have 15V available, the zener will ideally be around 6.8-7.5 volts.  Note that if the external supply voltage is not regulated, you may need to add filtering from the supply to minimise hum and noise.  You'll also need to increase the resistor in series with the LED (R14) to keep the LED current at around 10mA.
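
The new value for the LED series resistor follows from Ohm's law.  The sketch below is a rough guide only: the LED forward voltage of 2V is an assumption (check the datasheet for your optocoupler or LED), and the small voltage drop across the MOSFET is ignored.

```python
LED_CURRENT_A = 0.010    # target LED current, from the text
LED_FORWARD_V = 2.0      # ASSUMED forward voltage, check your LED or optocoupler

def led_series_resistor(supply_v, forward_v=LED_FORWARD_V, current_a=LED_CURRENT_A):
    """Series resistance (ohms) that gives current_a through the LED."""
    if supply_v <= forward_v:
        raise ValueError("supply voltage must exceed the LED forward voltage")
    return (supply_v - forward_v) / current_a

r_12v = led_series_resistor(12.0)
r_15v = led_series_resistor(15.0)
print(f"12V supply: about {r_12v:.0f} ohms")
print(f"15V supply: about {r_15v:.0f} ohms")
```

Use the nearest standard value.  The exact LED current isn't critical, provided the LDR still reaches a low enough resistance for the attenuation you need.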


References

There are no references, because the circuitry is primarily based on other ESP projects and is an original design.  Some material is 'common knowledge' and/or 'public domain'.  Some circuitry (that achieves much the same goal by means at least vaguely/remotely similar) was located using a search, but was not used in any way, shape or form.  It seems that there are people looking for a suitable circuit, based on a few forum queries I came across, as well as the reader enquiry.


 

Copyright Notice. This article, including but not limited to all text and diagrams, is the intellectual property of Rod Elliott, and is © 2019. Reproduction or re-publication by any means whatsoever, whether electronic, mechanical or electro-mechanical, is strictly prohibited under International Copyright laws. The author (Rod Elliott) grants the reader the right to use this information for personal use only, and further allows that one (1) copy may be made for reference while constructing the project. Commercial use is prohibited without express written authorisation from Rod Elliott.
Change Log:  Page Created and Copyright © March 2019./ Update Feb 24 - added Fig 0 (waveforms).