AEC — Acoustic Echo Cancellation

For a deeper technical dive, see echo-cancellation.

AEC is a DSP algorithm that detects and removes the echo of the far-end participant's voice from the microphone signal before it is transmitted back to them. Without AEC, remote callers hear their own voice returned to them, delayed by the round trip through the network and the acoustic path through the room. This echo is distracting enough to make natural full-duplex conversation nearly impossible. AEC is the single most important audio processing algorithm in any conferencing system.

AEC operates using an adaptive filter that models the acoustic path between the loudspeaker and the microphone. The algorithm uses the loudspeaker signal as a reference signal — a copy of what will eventually appear as echo in the microphone after traveling through the air.

The cancellation process:

1. Far-end audio plays through the room loudspeaker.
2. Sound travels through the air, bounces off room surfaces, and enters the microphone as echo.
3. The AEC filter predicts what that echo looks like in the microphone signal, based on the reference and the learned acoustic model.
4. The predicted echo is subtracted from the microphone signal.
5. Remaining residual echo is further reduced by Non-Linear Processing (NLP).

The adaptive filter continuously updates its model, typically with the NLMS (Normalized Least Mean Squares) algorithm or a variant of it, tracking changes in the acoustic path such as a door opening, furniture being rearranged, or temperature shifts.
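The update loop described above can be sketched in a few lines. This is a minimal time-domain NLMS illustration, not how any particular product implements AEC: the function name nlms_cancel, the 8-tap filter, the step size mu, and the 3-tap toy "room" are all assumptions chosen to keep the example small.

```python
import random

def nlms_cancel(reference, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract the predicted echo from mic; return (residual, weights)."""
    weights = [0.0] * taps   # learned model of the loudspeaker-to-mic path
    history = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for x, d in zip(reference, mic):
        history = [x] + history[:-1]
        predicted = sum(w * h for w, h in zip(weights, history))
        e = d - predicted                     # signal left after subtraction
        power = sum(h * h for h in history) + eps
        step = mu * e / power                 # step normalised by reference power
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(e)
    return residual, weights

# Toy data: a hypothetical 3-tap echo path and a noise-like far-end signal.
random.seed(0)
room = [0.6, 0.3, 0.1]
reference = [random.uniform(-1.0, 1.0) for _ in range(4000)]
mic = [
    sum(room[k] * reference[n - k] for k in range(len(room)) if n - k >= 0)
    for n in range(len(reference))
]

residual, weights = nlms_cancel(reference, mic)
early = sum(e * e for e in residual[:200])    # echo energy while still adapting
late = sum(e * e for e in residual[-200:])    # echo energy after adaptation
```

In this noiseless toy case the learned weights approach the simulated room response and the late residual energy is far below the early residual energy. Real rooms need thousands of taps, and practical systems often adapt in the frequency domain for efficiency.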

Tail length (filter length) is the duration of acoustic echo the AEC can cancel. As a rule of thumb, it should be at least as long as the room's RT60. A small conference room with an RT60 of 0.4 seconds needs a tail length of at least 400 ms. Large rooms with RT60 above 1 second need proportionally longer tail lengths. Echo that arrives later than the tail length cannot be modeled by the filter, and NLP alone often cannot suppress it cleanly.
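To make the cost of a long tail concrete, the sketch below converts a tail length into a number of filter coefficients and reports how much of the RT60 decay a given tail leaves uncovered. The function names and the 48 kHz and 16 kHz sample rates are assumptions for illustration; the source does not state a processing rate.

```python
import math

def taps_for_tail(tail_ms, sample_rate_hz):
    """Number of filter coefficients needed to span tail_ms of echo."""
    return math.ceil(tail_ms * sample_rate_hz / 1000)

def tail_shortfall_ms(rt60_s, tail_ms):
    """Portion of the RT60 decay that falls outside the filter, in ms."""
    return max(0.0, rt60_s * 1000 - tail_ms)

taps_48k = taps_for_tail(400, 48_000)      # 400 ms tail at an assumed 48 kHz
taps_16k = taps_for_tail(400, 16_000)      # same tail at an assumed 16 kHz
shortfall = tail_shortfall_ms(0.8, 200)    # 200 ms tail in an RT60 = 0.8 s room
```

Under these assumptions a 400 ms tail needs 19,200 coefficients at 48 kHz and 6,400 at 16 kHz, which is one reason long tails are easier to support on dedicated DSP hardware.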

Convergence time is how long the AEC filter takes to fully adapt to the room. On initial startup or after a significant acoustic change, the filter needs several seconds to converge — during which some residual echo may be audible. Well-designed systems save room acoustic models and re-converge faster on subsequent sessions.
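One common way to observe convergence is to track ERLE (Echo Return Loss Enhancement), the ratio of microphone power to residual power in dB, while only the far end is talking. The sketch below is an assumption-laden illustration: the 160-sample block size, the 20 dB target, and both function names are hypothetical choices, not values from any product.

```python
import math

def erle_db(mic_block, residual_block, floor=1e-12):
    """ERLE for one block: 10*log10(mic power / residual power)."""
    mic_power = sum(s * s for s in mic_block) / len(mic_block)
    res_power = sum(s * s for s in residual_block) / len(residual_block)
    return 10.0 * math.log10((mic_power + floor) / (res_power + floor))

def convergence_time_s(mic, residual, sample_rate_hz, block=160, target_db=20.0):
    """Start time of the first block whose ERLE reaches target_db, else None."""
    for start in range(0, len(mic) - block + 1, block):
        end = start + block
        if erle_db(mic[start:end], residual[start:end]) >= target_db:
            return start / sample_rate_hz
    return None

# Synthetic data: residual drops from 50% to 1% of the echo at sample 800.
mic = [1.0, -1.0] * 800
residual = [0.5 * s for s in mic[:800]] + [0.01 * s for s in mic[800:]]
t_converged = convergence_time_s(mic, residual, sample_rate_hz=16_000)
```

With this synthetic input the ERLE is about 6 dB before sample 800 and 40 dB after it, so the reported convergence time is 0.05 seconds at the assumed 16 kHz rate. The measurement is only meaningful when the near end is silent, since near-end speech also appears in the residual.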

Double-talk detection identifies when both near-end and far-end participants speak simultaneously. During double-talk, the AEC filter must freeze its adaptation — updating on a mixed signal would corrupt the acoustic model. Good double-talk detection is the difference between a system that handles interruptions naturally and one that clips voices when multiple people speak.
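A classic and very simple detector is the Geigel algorithm, which flags double-talk whenever the microphone level is too high to be explained by echo alone. The sketch below illustrates that idea only; the window length, the 0.5 threshold (which assumes the echo path attenuates by at least 6 dB), and the function name are assumptions, and production systems generally use more robust detectors.

```python
from collections import deque

def geigel_double_talk(reference, mic, window=256, threshold=0.5):
    """Return one flag per sample: True where near-end speech is likely.

    Double-talk is declared when |mic| exceeds threshold times the largest
    reference magnitude seen in the last `window` samples.
    """
    recent = deque([0.0] * window, maxlen=window)
    flags = []
    for x, d in zip(reference, mic):
        recent.append(abs(x))
        flags.append(abs(d) > threshold * max(recent))
    return flags

far_end = [1.0] * 10
echo_only = [0.3] * 10          # echo well below the reference level
with_near_end = [0.9] * 10      # echo plus a loud near-end talker

freeze_echo_only = geigel_double_talk(far_end, echo_only)
freeze_double_talk = geigel_double_talk(far_end, with_near_end)
```

An adaptive filter would skip its weight update on every sample where the flag is True, so the model learned during far-end-only speech is preserved while both parties talk.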

Dedicated hardware AEC (Biamp Tesira, QSC Q-SYS, Shure IntelliMix, Sennheiser TeamConnect) runs on purpose-built DSP with known, consistent latency. These systems support long tail lengths (500ms-1000ms), integrate with the microphone array signal processing pipeline, and are tuned for large-room acoustics. They provide the most reliable echo cancellation in professional installations.

Software AEC runs in the Zoom, Teams, or Webex client on a general-purpose CPU. Quality varies by platform and CPU load. Teams AEC is generally strong; lesser-known platforms may struggle. Software AEC is acceptable for huddle spaces and BYOD rooms where dedicated DSP is not justified.

Double AEC — hardware AEC in the room system AND software AEC in the conferencing client simultaneously — is a serious installation mistake. Each AEC stage treats the other's output as echo to cancel, resulting in clipped, robotic, or completely suppressed audio. Only one AEC stage should be active per audio path. QSC Q-SYS, Biamp, and Shure all publish guidance on which layer to disable.

Room acoustic requirements for effective AEC:

RT60 <= 0.5 seconds: Long reverberation creates echo tails that exceed filter lengths.
Microphone at least 3 feet from loudspeakers: Close proximity creates high-level echo that overwhelms the filter.
Stable speaker output level: Volume changes faster than the filter can adapt create transient echo breakthrough.
NC-30 or lower ambient noise: High noise floor masks the echo reference signal and degrades filter accuracy.
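The three measurable requirements above can be turned into a quick pre-commissioning check. This is only a sketch: the function name check_room and its parameter names are hypothetical, and the thresholds are copied directly from the list. Speaker level stability is left out because the list gives no numeric limit for it.

```python
def check_room(rt60_s, mic_to_speaker_ft, noise_criterion):
    """Return a list of requirement violations (empty if all are met)."""
    problems = []
    if rt60_s > 0.5:
        problems.append(f"RT60 {rt60_s} s exceeds 0.5 s")
    if mic_to_speaker_ft < 3:
        problems.append(f"microphone {mic_to_speaker_ft} ft from loudspeaker, under 3 ft")
    if noise_criterion > 30:
        problems.append(f"ambient noise NC-{noise_criterion} is above NC-30")
    return problems

ok_room = check_room(rt60_s=0.4, mic_to_speaker_ft=4.0, noise_criterion=28)
bad_room = check_room(rt60_s=0.8, mic_to_speaker_ft=2.0, noise_criterion=35)
```

An empty result means the measured values meet the listed thresholds; it does not guarantee good AEC performance on its own.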

In very reverberant rooms (atria, tile surfaces, stone churches), AEC effectiveness degrades significantly. Physical acoustic treatment is a prerequisite for AEC success. See room-acoustics-basics.

Beamforming microphone arrays and AEC are complementary. The beamformed output feeds into the AEC as the near-end input. Because beamforming already attenuates the loudspeaker by 15-25 dB through spatial filtering, the AEC has significantly less echo to cancel, improving residual echo performance and convergence stability. See beamforming-mics.
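To put the 15-25 dB figure in perspective, the sketch below converts an attenuation in dB into the fraction of echo amplitude that remains at the AEC input. The helper name is an assumption for illustration; the conversion itself is the standard amplitude relation, ratio = 10^(-dB/20).

```python
def db_to_amplitude_ratio(attenuation_db):
    """Fraction of the original amplitude left after attenuation_db of loss."""
    return 10 ** (-attenuation_db / 20)

remaining_at_15_db = db_to_amplitude_ratio(15)
remaining_at_25_db = db_to_amplitude_ratio(25)
```

By this relation, 15 dB of spatial attenuation leaves roughly 18% of the loudspeaker's amplitude in the beamformed signal, and 25 dB leaves roughly 6%.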

Common pitfalls:

Persistent echo despite AEC enabled: Most often caused by gain structure errors. If the loudspeaker is driven too loud, echo exceeds what the filter can subtract. Reduce speaker volume or add a limiter. See gain-structure.
Voice suppressed during near-end speech: NLP aggressiveness set too high treats soft speech as residual echo. Reduce NLP aggressiveness; accept slightly more residual echo in exchange for natural voice reproduction.
Double AEC from hardware DSP + software client: Disable AEC in the conferencing client when dedicated hardware AEC is in the signal chain. Symptoms: voices cut out during conversation, especially during simultaneous speech.
Tail length too short for the room: A 200 ms tail length in a 60-foot room with RT60 of 0.8 seconds leaves 600 ms of echo uncanceled. Increase tail length in DSP configuration.
AEC reference signal not connected: The loudspeaker output signal must be explicitly routed to the AEC reference input in the DSP signal flow. Without the reference, AEC does nothing. Verify reference routing; this is the most common commissioning oversight.
