LipsAM: Lipschitz-Continuous Amplitude Modifier for Audio Signal Processing and its Application to Plug-and-Play Dereverberation

Kazuki Matsumoto; Ren Uchida; Kohei Yatabe

Abstract

The robustness of deep neural networks (DNNs) can be certified through their Lipschitz continuity, which has made the construction of Lipschitz-continuous DNNs an active research field. However, DNNs for audio processing have not been a major focus due to their poor compatibility with existing results. In this paper, we consider the amplitude modifier (AM), a popular architecture for handling audio signals, and propose its Lipschitz-continuous variants, which we refer to as LipsAM. We prove a sufficient condition for an AM to be Lipschitz continuous and propose two architectures as examples of LipsAM. We apply the proposed architectures to a Plug-and-Play algorithm for speech dereverberation and demonstrate their improved stability through numerical experiments.

Paper Structure (14 sections, 2 theorems, 11 equations, 4 figures, 1 table)

Introduction
Preliminary
Lipschitz Continuity of DNNs
Amplitude Modifier (AM)
Proposed Method
AMs are not Lipschitz Continuous
LipsAM: Lipschitz-continuous Amplitude Modifier
Proposed Architectures: LipsAM-SE and LipsAM-RE
Numerical Validation of Lipschitz Bound
Application to Plug-and-Play Speech Dereverberation
Problem Setting and Plug-and-Play Algorithm
Experimental Settings
Results
Conclusion

Key Result

Theorem 4

Let $\mathcal{D}_{\mathcal{A}}:\mathbb{C}^N\to\mathbb{C}^N:\mathbf{z}\mapsto \mathcal{A}(|\mathbf{z}|)\odot\mathrm{sign}(\mathbf{z})$, and let $\mathcal{A}:\mathbb{R}_+^N\to\mathbb{R}_+^N$ satisfy Eqs. \ref{eq:cond1} and \ref{eq:cond2} in Assumption \ref{asm:assumption}. Then, $\mathcal{D}_{\mathcal{A}}$ is Lipschitz con
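The mapping $\mathbf{z}\mapsto \mathcal{A}(|\mathbf{z}|)\odot\mathrm{sign}(\mathbf{z})$ in the theorem can be sketched in a few lines of plain Python. This is only an illustration: the names `amplitude_modifier`, `soft_threshold`, and `empirical_ratio`, as well as the convention $\mathrm{sign}(0)=0$, are assumptions and do not come from the paper. The amplitude function used here is element-wise soft-thresholding, chosen as a stand-in for a trained network because the resulting complex map is the proximal operator of a scaled $\ell_1$ norm and is therefore known to be nonexpansive. Whether it satisfies the paper's Assumption is not checked here.

```python
import math
import random

def sign(z: complex) -> complex:
    """Complex sign z/|z|; sign(0) = 0 is a convention assumed for this sketch."""
    r = abs(z)
    return z / r if r > 0 else 0j

def amplitude_modifier(amp_fn, z):
    """Apply D_A(z) = A(|z|) * sign(z) element-wise to a list of complex numbers."""
    amps = amp_fn([abs(v) for v in z])
    return [a * sign(v) for a, v in zip(amps, z)]

def soft_threshold(amps, tau=0.5):
    """Hypothetical A: shrink each amplitude by tau, clipping at zero."""
    return [max(a - tau, 0.0) for a in amps]

def l2(v):
    """Euclidean norm of a list of complex numbers."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def empirical_ratio(amp_fn, n=8, trials=1000, seed=0):
    """Largest observed ||D(z1) - D(z2)|| / ||z1 - z2|| over random pairs.

    This only gives a lower estimate of the Lipschitz constant.
    """
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        z1 = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        z2 = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        d1 = amplitude_modifier(amp_fn, z1)
        d2 = amplitude_modifier(amp_fn, z2)
        denom = l2([a - b for a, b in zip(z1, z2)])
        if denom > 0:
            best = max(best, l2([a - b for a, b in zip(d1, d2)]) / denom)
    return best
```

Calling `empirical_ratio(soft_threshold)` should never exceed 1 for this choice of amplitude function. Random sampling of this kind cannot certify a Lipschitz bound; it can only fail to find a violation.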

Figures (4)

Figure 1: Amplitude-modifying DNNs. Red layers are trainable; blue layers are introduced to enforce Lipschitz continuity. Layers labeled "mul" represent element-wise multiplication. While the popular architectures, AM-SE and AM-RE, are generally not Lipschitz continuous, the Lipschitz constants of our LipsAM-SE and LipsAM-RE are bounded by $\sqrt{(\mathrm{Lip}(\mathcal{S}))^2+1}$ and $\mathrm{Lip}(\mathcal{R})+1$, respectively.
Figure 2: Numerical estimate of the value $B$ in Eq. \ref{eq:B} for each architecture. The upper rows show AM-SE in Eq. \ref{eq:signal} and LipsAM-SE in Eq. \ref{eq:signallim}. The lower rows show AM-RE in Eq. \ref{eq:residual} and LipsAM-RE in Eq. \ref{eq:residuallim}. Dots indicate the results of 100 trials, and the maximum is marked with a large circle. Solid lines show the theoretical bound in Theorem \ref{thm:signallim}. Areas exceeding the termination threshold are darkened.
Figure 3: Parameter $\lambda$ vs. SI-SNR after 500 iterations. Solid colored lines represent the proposed LipsAMs, and dotted colored lines represent the conventional AMs. Use of orthogonal layers is indicated by (Ortho). Gray lines show the $\ell_1$-norm-based method. Markers show the best parameter for each model. The right panel is a close-up view of the left panel. Missing points indicate that the algorithm diverged.
Figure 4: Transition of the update amount $\|\Delta \mathbf{x}\|_2$. In each box, the results of the LipsAMs with the best $\lambda$ in Fig. \ref{fig:parameters} are compared with those of the AMs using the same $\lambda$.
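The two bounds quoted in the Figure 1 caption are easy to evaluate once the Lipschitz constants of the inner networks $\mathcal{S}$ and $\mathcal{R}$ are known. The helper names below are hypothetical and only transcribe the caption's formulas; they do not construct the architectures themselves.

```python
import math

def lipsam_se_bound(lip_s: float) -> float:
    """Bound sqrt(Lip(S)^2 + 1) quoted for LipsAM-SE in the Figure 1 caption."""
    return math.sqrt(lip_s ** 2 + 1.0)

def lipsam_re_bound(lip_r: float) -> float:
    """Bound Lip(R) + 1 quoted for LipsAM-RE in the Figure 1 caption."""
    return lip_r + 1.0
```

For example, if the inner network is 1-Lipschitz, the two bounds evaluate to $\sqrt{2}$ and $2$, respectively.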

Theorems & Definitions (6)

Example 1: Bias
Example 2: Permutation
Theorem 4
proof
Theorem 5
proof
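The content of the listed examples is not reproduced on this page. As an illustration of how an amplitude modifier can fail to be Lipschitz continuous (not necessarily the construction used in Example 1), consider a hypothetical amplitude function that adds a positive constant, $\mathcal{A}(a) = a + b$. The resulting map jumps at the origin, so the ratio between output distance and input distance grows without bound as two inputs of opposite sign approach zero. All names in the sketch are assumptions.

```python
def sign(z: complex) -> complex:
    """Complex sign z/|z|; sign(0) = 0 is a convention assumed for this sketch."""
    r = abs(z)
    return z / r if r > 0 else 0j

def biased_am(z: complex, bias: float = 1.0) -> complex:
    """Hypothetical amplitude modifier with A(a) = a + bias."""
    return (abs(z) + bias) * sign(z)

def ratio(eps: float, bias: float = 1.0) -> float:
    """|D(eps) - D(-eps)| / |eps - (-eps)| for the scalar inputs +eps and -eps."""
    z1, z2 = complex(eps, 0.0), complex(-eps, 0.0)
    return abs(biased_am(z1, bias) - biased_am(z2, bias)) / abs(z1 - z2)
```

Analytically the ratio equals $1 + b/\varepsilon$, so no finite Lipschitz constant exists for any $b > 0$.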
