Cascaded noise reduction and acoustic echo cancellation based on an extended noise reduction
Arnout Roebben, Toon van Waterschoot, Marc Moonen
TL;DR
The paper addresses joint noise and echo removal in multichannel speech capture by placing an extended noise reduction stage (NRext) before acoustic echo cancellation (AEC), under the assumption that the echo paths are additive maps. NRext operates on both the microphones and the loudspeakers and suppresses the near-end room noise as well as the far-end room noise component in the echo, so the succeeding AEC filters only need to model the echo paths. The key contribution is the theoretical and empirical demonstration that the AEC filters become independent of the NRext filters and that the degrees of freedom in the NRext filters scale with the number of loudspeakers, improving both NR and AEC performance. The approach is validated via simulations with multichannel setups and a variety of acoustic scenarios, and code is made available to support replication and further research. This work offers a practical pathway to more effective speech enhancement in hands-free and speakerphone applications where multiple loudspeakers are present.
Abstract
In many speech recording applications, the recorded desired speech is corrupted by both noise and acoustic echo, such that combined noise reduction (NR) and acoustic echo cancellation (AEC) is called for. A common cascaded design corresponds to NR filters preceding AEC filters. These NR filters aim at reducing the near-end room noise (and possibly partially the echo) and operate on the microphones only, consequently requiring the AEC filters to model both the echo paths and the NR filters. In this paper, however, we propose a design with extended NR (NRext) filters preceding AEC filters under the assumption of the echo paths being additive maps, thus preserving the addition operation. Here, the NRext filters aim at reducing both the near-end room noise and the far-end room noise component in the echo, and operate on both the microphones and loudspeakers. We show that the succeeding AEC filters remarkably become independent of the NRext filters, such that the AEC filters are only required to model the echo paths, improving the AEC performance. Further, the degrees of freedom in the NRext filters scale with the number of loudspeakers, which is not the case for the NR filters, resulting in an improved NR performance.
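The additivity assumption can be illustrated with a small numerical sketch. It is not the authors' implementation: the FIR echo path, the clipping nonlinearity, the signal lengths, and all function and variable names are hypothetical choices made for illustration. It checks that an additive echo path maps the sum of a far-end speech component and a far-end noise component to the sum of their individual echoes, whereas a clipped (non-additive) path does not.

```python
import random

random.seed(0)

def fir_echo_path(x, h):
    """Assumed echo path model: full convolution of signal x with FIR response h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def clipped_echo_path(x, h, limit=1.0):
    """Hypothetical non-additive path: the same FIR response followed by hard clipping."""
    return [max(-limit, min(limit, v)) for v in fir_echo_path(x, h)]

# Hypothetical decaying impulse response and far-end signal components.
h = [random.gauss(0.0, 1.0) * 0.9 ** k for k in range(32)]
speech = [random.gauss(0.0, 1.0) for _ in range(500)]
noise = [0.3 * random.gauss(0.0, 1.0) for _ in range(500)]
loudspeaker = [s + n for s, n in zip(speech, noise)]

def max_additivity_error(path):
    """Largest sample-wise gap between path(speech + noise) and path(speech) + path(noise)."""
    total = path(loudspeaker, h)
    split = [a + b for a, b in zip(path(speech, h), path(noise, h))]
    return max(abs(t - u) for t, u in zip(total, split))

linear_error = max_additivity_error(fir_echo_path)       # rounding error only
clipped_error = max_additivity_error(clipped_echo_path)  # clearly non-zero

print(f"additive path error: {linear_error:.2e}")
print(f"clipped path error:  {clipped_error:.2e}")
```

In this sketch the echo of the far-end noise component appears as a separate additive term only for the additive path, which is the property the abstract relies on when it describes NRext reducing that component.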
