Bridge-SR: Schrödinger Bridge for Efficient SR

Chang Li; Zehua Chen; Fan Bao; Jun Zhu

TL;DR

Bridge-SR generates high-fidelity 48kHz speech waveforms from inputs at lower sampling rates by using a tractable Schrödinger bridge that treats the low-resolution waveform as a prior. By replacing the noise-to-data diffusion path with a data-to-data trajectory, the method preserves the informative low-frequency content while efficiently guiding high-frequency reconstruction. Key contributions include an asymmetric noise schedule (notably Bridge-$g_\text{max}$), a data-scaling strategy that emphasizes high-frequency details, and auxiliary losses (multi-scale STFT magnitude and phase) that improve SR quality. On the VCTK dataset, Bridge-SR outperforms several strong baselines with a lightweight backbone (≈1.7M parameters), and its 4-step sampler beats an 8-step conditional diffusion counterpart (LSD: 0.911 vs 0.927), demonstrating practical efficiency gains for waveform-domain SR.

Abstract

Speech super-resolution (SR), which generates a waveform at a higher sampling rate from its low-resolution version, is a long-standing critical task in speech restoration. Previous works have explored speech SR in different data spaces, but these methods either require additional compression networks or exhibit limited synthesis quality and inference speed. Motivated by recent advances in probabilistic generative models, we present Bridge-SR, a novel and efficient any-to-48kHz SR system in the speech waveform domain. Using tractable Schrödinger Bridge models, we leverage the observed low-resolution waveform as a prior, which is intrinsically informative for the high-resolution target. By optimizing a lightweight network to learn the score functions from the prior to the target, we achieve efficient waveform SR through a data-to-data generation process that fully exploits the instructive content contained in the low-resolution observation. Furthermore, we identify the importance of the noise schedule, data scaling, and auxiliary loss functions, which further improve the SR quality of bridge-based systems. The experiments conducted on the benchmark dataset VCTK demonstrate the efficiency of our system: (1) in terms of sample quality, Bridge-SR outperforms several strong baseline methods under different SR settings, using a lightweight network backbone (1.7M); (2) in terms of inference speed, our 4-step synthesis achieves better performance than the 8-step conditional diffusion counterpart (LSD: 0.911 vs 0.927). Demo at https://bridge-sr.github.io.
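The LSD values quoted in the abstract refer to log-spectral distance, where lower is better. The following is a minimal sketch of how such a metric is commonly computed, not the paper's evaluation code: the STFT settings (`n_fft`, `hop`), the Hann window, the `eps` floor, and the helper names `stft_power` and `log_spectral_distance` are all assumptions made for illustration.

```python
import numpy as np

def stft_power(x, n_fft=2048, hop=512):
    """Power spectrogram |STFT|^2, shape (frames, n_fft // 2 + 1).

    n_fft, hop and the Hann window are illustrative assumptions.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def log_spectral_distance(ref, est, eps=1e-8):
    """Frame-averaged RMS difference of log10 power spectra."""
    diff = np.log10(stft_power(ref) + eps) - np.log10(stft_power(est) + eps)
    # RMS over frequency bins per frame, then mean over frames.
    return float(np.mean(np.sqrt(np.mean(diff ** 2, axis=1))))

# Toy signals (assumed): a 48 kHz "target" with a high-frequency partial,
# and a band-limited version that lacks it.
sr = 48_000
t = np.arange(sr) / sr
target = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 12_000 * t)
band_limited = np.sin(2 * np.pi * 440 * t)

lsd_same = log_spectral_distance(target, target)
lsd_band_limited = log_spectral_distance(target, band_limited)
print(lsd_same, lsd_band_limited)
```

Identical signals give a distance of zero, while the band-limited signal is penalized for the missing high-frequency energy, which is the behaviour an SR metric needs.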

Paper Structure (13 sections, 7 equations, 2 figures, 2 tables)

Introduction
Bridge-SR
Tractable Schrödinger bridge
Noise scheduling and data scaling
Auxiliary losses
Experiments
Experimental Setup
Baseline and Evaluation
Inference Schedule
Results
Results Analysis
Ablation studies
Conclusion

Figures (2)

Figure 1: Overview of Bridge-SR. As shown in the upper part, the forward process of the Schrödinger bridge simulates low-pass filters (LPF). The lower part shows the intermediate representations in both the waveform and spectral domains under our asymmetric noise schedule.
Figure 2: We show the means of the intermediate representations for the diffusion process and the bridge process with no linear drift, respectively. It becomes evident that, for the diffusion process, the low-frequency components gradually vanish during the forward SDE. In contrast, our bridge process preserves the low-frequency components.
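The contrast described in the Figure 2 caption can be illustrated with a small numerical sketch. It rests on assumptions that are not stated in this summary: the diffusion mean uses a standard variance-preserving schedule with a linear beta (`beta_min`, `beta_max` chosen for illustration), and the bridge mean uses the known closed form for a drift-free Schrödinger bridge between paired samples with a constant noise level, which reduces to a linear interpolation. The paper's asymmetric schedule would change the interpolation weights but not the qualitative picture. The toy signals and the helper `amplitude_at` are hypothetical.

```python
import numpy as np

sr = 48_000
n = 4_800                                  # 0.1 s of audio
time = np.arange(n) / sr
low = np.sin(2 * np.pi * 500 * time)       # low-frequency content
high = 0.3 * np.sin(2 * np.pi * 15_000 * time)
x0 = low + high                            # high-resolution target (toy)
x1 = low                                   # idealized low-pass prior (toy)

def diffusion_mean(x0, t, beta_min=0.1, beta_max=20.0):
    """Mean of a VP diffusion at time t: alpha_t * x0 (assumed linear beta)."""
    log_alpha = -0.5 * (beta_min * t + 0.5 * (beta_max - beta_min) * t ** 2)
    return np.exp(log_alpha) * x0

def bridge_mean(x0, x1, t):
    """Mean of a drift-free bridge with constant g: (1 - t) * x0 + t * x1."""
    return (1.0 - t) * x0 + t * x1

def amplitude_at(signal, freq_hz):
    """Amplitude of one sinusoidal component, by projection onto a sine."""
    probe = np.sin(2 * np.pi * freq_hz * time)
    return 2.0 * np.mean(signal * probe)

steps = [0.0, 0.5, 1.0]
diffusion_low = [amplitude_at(diffusion_mean(x0, t), 500) for t in steps]
bridge_low = [amplitude_at(bridge_mean(x0, x1, t), 500) for t in steps]
bridge_high = [amplitude_at(bridge_mean(x0, x1, t), 15_000) for t in steps]
print(diffusion_low, bridge_low, bridge_high)
```

Under these assumptions the 500 Hz amplitude of the diffusion mean shrinks toward zero as t goes to 1, whereas the bridge mean keeps it fixed and only fades out the 15 kHz component, mirroring the low-pass behaviour described in the captions.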
