Cascaded noise reduction and acoustic echo cancellation based on an extended noise reduction

Arnout Roebben, Toon van Waterschoot, Marc Moonen

TL;DR

The paper addresses the joint removal of noise and acoustic echo in multichannel speech capture by introducing an extended noise reduction stage (NRext) that precedes acoustic echo cancellation (AEC), under the assumption that the echo paths are additive maps. NRext operates on both the microphone and loudspeaker signals and suppresses the near-end room noise as well as the far-end room noise component in the echo; the succeeding AEC filters then only need to model the echo paths. The key contribution is the theoretical and empirical demonstration that the AEC filters in NRext-AEC are independent of the NRext filters and that the degrees of freedom in the NRext filters scale with the number of loudspeakers, improving both NR and AEC performance. The approach is validated via simulations with multichannel setups and a variety of acoustic scenarios, and code is made available to support replication and further research. This work offers a practical pathway to more effective speech enhancement in hands-free and speakerphone applications where multiple loudspeakers are present.

Abstract

In many speech recording applications, the recorded desired speech is corrupted by both noise and acoustic echo, such that combined noise reduction (NR) and acoustic echo cancellation (AEC) is called for. A common cascaded design corresponds to NR filters preceding AEC filters. These NR filters aim at reducing the near-end room noise (and possibly partially the echo) and operate on the microphones only, consequently requiring the AEC filters to model both the echo paths and the NR filters. In this paper, however, we propose a design with extended NR (NRext) filters preceding AEC filters under the assumption of the echo paths being additive maps, thus preserving the addition operation. Here, the NRext filters aim at reducing both the near-end room noise and the far-end room noise component in the echo, and operate on both the microphones and loudspeakers. We show that the succeeding AEC filters remarkably become independent of the NRext filters, such that the AEC filters are only required to model the echo paths, improving the AEC performance. Further, the degrees of freedom in the NRext filters scale with the number of loudspeakers, which is not the case for the NR filters, resulting in an improved NR performance.
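
The abstract's remark that AEC filters placed after microphone-only NR filters must model both the echo paths and the NR filters can be illustrated with a minimal Python sketch. It assumes a hypothetical single-loudspeaker, single-microphone setup with linear FIR filters; all names, filter lengths, and values are illustrative assumptions and are not taken from the paper, and the sketch does not implement the proposed NRext design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup (assumption): one loudspeaker, one microphone, FIR filters.
loudspeaker = rng.standard_normal(2000)    # far-end signal played by the loudspeaker
echo_path = 0.5 * rng.standard_normal(8)   # assumed linear echo path (8 taps)
nr_filter = rng.standard_normal(5)         # stand-in for an NR filter (5 taps)

# Echo picked up by the microphone, then passed through the NR filter.
echo = np.convolve(loudspeaker, echo_path)
echo_after_nr = np.convolve(echo, nr_filter)

# Seen from the loudspeaker, the response up to the NR output is the
# cascade (convolution) of the echo path and the NR filter.
cascade = np.convolve(echo_path, nr_filter)
echo_via_cascade = np.convolve(loudspeaker, cascade)

# An AEC filter placed after the NR filter has to match `cascade`,
# which is longer than `echo_path` and changes whenever `nr_filter` changes.
print(len(echo_path), len(cascade))
print(np.allclose(echo_after_nr, echo_via_cascade))
```

In this toy case the response the AEC would need to identify grows from 8 to 12 taps and depends on the NR filter, which is the coupling that the NRext-AEC design is reported to avoid.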

Paper Structure (11 sections, 10 equations, 4 figures)

Figures (4)

  • Figure 1: (a) The NR filters aim at reducing the near-end room noise (and possibly partially the echo), and the AEC filters aim at reducing the echo. (b) The NRext filters aim at reducing both the near-end room noise and far-end room noise component in the echo, and the AEC filters aim at reducing the far-end room speech (and residual noise) component in the echo.
  • Figure 2: NR performance using the converged filters for the entire data. Dots show the mean performance and shading the standard deviation. At low SERin, NRext-AEC has better NR performance, as the degrees of freedom in the NRext filters scale with the number of loudspeakers, as opposed to the NR filters. Only at high SERin and low SNRin does NR-AEC have better NR performance, as the NR filters use a lower-rank approximation than the NRext filters, limiting the noise from each mode.
  • Figure 3: AEC performance as a function of $L_{\hat{F}}$ using the converged filters for the entire data. Dots show the mean performance and shading the standard deviation. As the AEC filters in NRext-AEC are independent of the NRext filters, as opposed to NR-AEC, NRext-AEC performance exceeds that of NR-AEC. Only at high SERin and low SNRin is the top performance higher in NR-AEC, yet this advantage is lost with adaptive filters (Fig. 4).
  • Figure 4: AEC performance as a function of $L_{\hat{F}}$ when adapting the filters through time. NR-AEC shows decreased performance compared to Fig. 3, as the AEC filters in NR-AEC need to track the adaptation of the NR filters.