Translation-Equivariant Self-Supervised Learning for Pitch Estimation with Optimal Transport

Bernardo Torres, Alain Riou, Gaël Richard, Geoffroy Peeters

TL;DR

The paper addresses pitch estimation under limited labeled data by introducing a translation-equivariant self-supervised objective based on Optimal Transport. It replaces prior equivariance losses with a 1D OT-based loss that measures the distance between pitch distributions of original and pitch-shifted frames in the log-frequency domain, integrated into the PESTO framework. Results show competitive performance against the state-of-the-art, particularly on MIR-1K, and highlight the numerical stability and simplicity of the OT loss. The work suggests broad applicability of OT-based translation equivariance to other MIR tasks and points to future variants like Circular OT for key or chord estimation.

Abstract

In this paper, we propose an Optimal Transport objective for learning one-dimensional translation-equivariant systems and demonstrate its applicability to single pitch estimation. Our method provides a theoretically grounded, more numerically stable, and simpler alternative for training state-of-the-art self-supervised pitch estimators.
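As a rough illustration of the idea summarized above, the following minimal sketch compares the pitch distribution predicted for a frame with the one predicted for its pitch-shifted copy, using the closed-form 1D Wasserstein-1 distance on a regular log-frequency grid (the L1 distance between cumulative distributions). It is not the paper's exact loss or the PESTO implementation. The function names (`wasserstein_1d`, `shift_bins`, `equivariance_loss`), the zero-padded shift, and the choice of Wasserstein-1 are all assumptions made for illustration.

```python
import numpy as np


def wasserstein_1d(p, q, bin_width=1.0):
    """Wasserstein-1 distance between two histograms on the same regular 1D grid.

    In 1D, W1 equals the L1 distance between the two cumulative distributions.
    Both inputs are renormalized to sum to 1 (assumption: non-negative, non-zero).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    cdf_diff = np.cumsum(p) - np.cumsum(q)
    return bin_width * np.abs(cdf_diff).sum()


def shift_bins(p, k):
    """Translate a histogram by k bins with zero padding (no wrap-around).

    Hypothetical helper: mass pushed past either edge is dropped.
    """
    p = np.asarray(p, dtype=float)
    out = np.zeros_like(p)
    if k > 0:
        out[k:] = p[:-k]
    elif k < 0:
        out[:k] = p[-k:]
    else:
        out[:] = p
    return out


def equivariance_loss(p_orig, p_shifted, k):
    """Illustrative equivariance penalty (an assumption, not the paper's formula).

    If the model is translation-equivariant, the prediction for the original
    frame moved by k bins should match the prediction for the frame whose
    input was pitch-shifted by k bins, so the distance should be zero.
    """
    return wasserstein_1d(shift_bins(p_orig, k), p_shifted)


# Toy example: one-hot pitch distributions over 8 log-frequency bins.
p = np.zeros(8)
p[3] = 1.0          # prediction for the original frame
q = np.zeros(8)
q[5] = 1.0          # prediction for the frame pitch-shifted by 2 bins

dist = wasserstein_1d(p, q)          # mass moved by 2 bins
loss = equivariance_loss(p, q, 2)    # perfectly equivariant prediction
```

Unlike a bin-wise comparison, the transport distance grows with how far the probability mass has to move along the log-frequency axis, so a prediction that is off by one bin is penalized less than one that is off by ten.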

Paper Structure

This paper contains 6 sections, 4 equations, 1 figure, 1 table.
