Real-time Speech Extraction Using Spatially Regularized Independent Low-rank Matrix Analysis and Rank-constrained Spatial Covariance Matrix Estimation

Yuto Ishikawa; Kohei Konaka; Tomohiko Nakamura; Norihiro Takamune; Hiroshi Saruwatari

TL;DR

This paper proposes a real-time extension of a speech extraction method based on independent low-rank matrix analysis (ILRMA) and rank-constrained spatial covariance matrix estimation (RCSCME).

Abstract

Real-time speech extraction is an important challenge with various applications, such as speech recognition in a human-like avatar or robot. In this paper, we propose a real-time extension of a speech extraction method based on independent low-rank matrix analysis (ILRMA) and rank-constrained spatial covariance matrix estimation (RCSCME). The RCSCME-based method is a multichannel blind speech extraction method that demonstrates superior speech extraction performance in diffuse noise environments. To improve its performance, we introduce spatial regularization into the ILRMA part of the RCSCME-based speech extraction and design two regularizers. Speech extraction experiments demonstrated that the proposed methods can function in real time and that the designed regularizers improve the speech extraction performance.

Paper Structure (13 sections, 7 equations, 3 figures)

Introduction
Related offline methods
ILRMA [Kitamura2016TASLP]
RCSCME [Kubo2020TASLP]
Proposed method
Real-time RCSCME-based speech extraction method
Incorporation of prior target speech direction information
Spatial regularization using prior target steering vector
ILRMA with null-based spatial regularization
Experiments
Experimental conditions
Experimental results
Conclusion

Figures (3)

Figure 1: Schematic of parallel processing in real-time RCSCME-based speech extraction method.
Figure 2: Boxplots of processing time for ILRMA part of all methods.
Figure 3: Boxplots of SDR improvement for all methods.
