Distribution Preserving Source Separation With Time Frequency Predictive Models

Pedro J. Villasana T., Janusz Klejsa, Lars Villemoes, Per Hedelin

TL;DR

This work provides an example of a distribution preserving source separation method, which aims at addressing perceptual shortcomings of state-of-the-art methods by means of mix-consistent sampling from a distribution conditioned on a realization of a mix.

Abstract

We provide an example of a distribution preserving source separation method, which aims at addressing perceptual shortcomings of state-of-the-art methods. Our approach uses unconditioned generative models of signal sources. Reconstruction is achieved by means of mix-consistent sampling from a distribution conditioned on a realization of a mix. The separated signals follow their respective source distributions, which provides an advantage when separation results are evaluated in a listening test.
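
The contrast between mix-consistent sampling and a point estimate can be illustrated with a toy example. The sketch below is not the paper's method: it assumes two independent scalar Gaussian sources with known means and variances instead of time-frequency predictive models, and all function and parameter names are hypothetical. Under that assumption the conditional distribution of one source given the mix is Gaussian in closed form, so a sample can be drawn directly and the second source set to the remainder of the mix.

```python
import random
import statistics


def separate_by_sampling(x, m1, v1, m2, v2, rng):
    """Draw (s1, s2) from p(s1, s2 | s1 + s2 = x) for independent Gaussian sources.

    Toy assumption: s1 ~ N(m1, v1), s2 ~ N(m2, v2), with v1 and v2 variances.
    """
    gain = v1 / (v1 + v2)
    post_mean = m1 + gain * (x - m1 - m2)   # E[s1 | x]
    post_var = v1 * v2 / (v1 + v2)          # Var[s1 | x]
    s1 = rng.gauss(post_mean, post_var ** 0.5)
    s2 = x - s1                             # mix consistency: s1 + s2 == x
    return s1, s2


def separate_by_mmse(x, m1, v1, m2, v2):
    """Point estimate: the conditional mean, which is also mix-consistent."""
    gain = v1 / (v1 + v2)
    s1 = m1 + gain * (x - m1 - m2)
    return s1, x - s1


def demo(n=20000, seed=0):
    """Compare the variance of source-1 outputs over many simulated mixes."""
    rng = random.Random(seed)
    m1, v1, m2, v2 = 0.0, 1.0, 0.0, 4.0     # hypothetical source statistics
    sampled, mmse, max_err = [], [], 0.0
    for _ in range(n):
        x = rng.gauss(m1, v1 ** 0.5) + rng.gauss(m2, v2 ** 0.5)
        s1, s2 = separate_by_sampling(x, m1, v1, m2, v2, rng)
        max_err = max(max_err, abs(s1 + s2 - x))
        sampled.append(s1)
        mmse.append(separate_by_mmse(x, m1, v1, m2, v2)[0])
    return statistics.pvariance(sampled), statistics.pvariance(mmse), max_err


if __name__ == "__main__":
    var_sampled, var_mmse, max_err = demo()
    print(f"variance of sampled s1: {var_sampled:.3f} (source variance 1.0)")
    print(f"variance of MMSE s1:    {var_mmse:.3f}")
    print(f"largest |s1 + s2 - x|:  {max_err:.2e}")
```

In this toy setting both outputs add up to the mix, but only the sampled output keeps the variance of the source. The conditional-mean estimate has variance v1²/(v1 + v2), which is smaller than v1, so it does not follow the source distribution.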

Paper Structure (12 sections, 7 equations, 4 figures, 1 table)

Figures (4)

  • Figure 1: Spectrograms: (left) Input mixture (speech+piano); (center) Separated piano (DPSS); (right) Separated speech (DPSS).
  • Figure 2: Source model operation.
  • Figure 3: Listening test results for Supra piano (11 listeners, 95% confidence intervals, Student's t-distribution).
  • Figure 4: Listening test results for VCTK speech (11 listeners, 95% confidence intervals, Student's t-distribution).