Bidirectional Consistency Models
Liangchen Li, Jiajun He
TL;DR
BCM addresses two bottlenecks of diffusion models, slow generation and difficult inversion, by learning a single network that can traverse the probability flow ODE (PF ODE) in both directions, unifying generation and inversion under a shared trajectory-centric objective. Through Bidirectional Consistency Training and a specialized network parameterization, BCM supports one-step generation and one-step inversion and enables downstream tasks such as interpolation and inpainting. The method achieves competitive generation quality with far fewer network function evaluations (NFEs) and offers flexible sampling strategies (ancestral and zigzag) that exploit bidirectional traversal; it also enables inversion-driven applications such as real-to-real image interpolation and blind restoration. The authors acknowledge limits in inversion fidelity and diminishing returns as the number of NFEs grows, and point to improved inversion and task-specific fine-tuning as future directions.
Abstract
Diffusion models (DMs) are capable of generating remarkably high-quality samples by iteratively denoising a random vector, a process that corresponds to moving along the probability flow ordinary differential equation (PF ODE). Interestingly, DMs can also invert an input image to noise by moving backward along the PF ODE, a key operation for downstream tasks such as interpolation and image editing. However, the iterative nature of this process restricts its speed, hindering its broader application. Recently, Consistency Models (CMs) have emerged to address this challenge by approximating the integral of the PF ODE, greatly reducing the number of iterations. Yet the absence of an explicit ODE solver complicates the inversion process. To resolve this, we introduce the Bidirectional Consistency Model (BCM), which learns a single neural network that enables both forward and backward traversal along the PF ODE, efficiently unifying generation and inversion tasks within one framework. We can train BCM from scratch or fine-tune it from a pretrained consistency model, which reduces the training cost and increases scalability. We demonstrate that BCM enables one-step generation and inversion while also allowing the use of additional steps to enhance generation quality or reduce reconstruction error. We further showcase BCM's capability in downstream tasks, such as interpolation and inpainting. Our code and weights are available at https://github.com/Mosasaur5526/BCM-iCT-torch.
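
To make the idea of traversing the PF ODE in both directions concrete, the following is a minimal toy sketch, not the BCM network or its training procedure. It assumes one-dimensional Gaussian data with standard deviation `SIGMA_DATA` and a noise schedule in which the noise level equals the time t, so that the PF ODE has a closed-form solution. The function `toy_bidirectional_map` and the constants `SIGMA_DATA` and `T_MAX` are hypothetical names chosen for illustration; the closed-form map simply stands in for a learned function that jumps from any time t to any time u along the same trajectory.

```python
import math

SIGMA_DATA = 0.5  # assumed std of the toy 1-D Gaussian data distribution
T_MAX = 80.0      # assumed terminal noise level (illustrative value)


def toy_bidirectional_map(x_t, t, u, sigma_data=SIGMA_DATA):
    """Exact PF ODE flow from time t to time u for 1-D Gaussian data.

    With x_t = x_0 + t * eps, the marginal at time t is N(0, sigma_data^2 + t^2)
    and the PF ODE is dx/dt = t * x / (sigma_data^2 + t^2), whose solution
    rescales x by sqrt((sigma_data^2 + u^2) / (sigma_data^2 + t^2)).
    This closed form is a stand-in for a learned network f(x_t, t, u).
    """
    scale = math.sqrt((sigma_data**2 + u**2) / (sigma_data**2 + t**2))
    return x_t * scale


def generate(noise):
    """One-step generation: jump from t = T_MAX to t = 0."""
    return toy_bidirectional_map(noise, T_MAX, 0.0)


def invert(x0):
    """One-step inversion: jump from t = 0 to t = T_MAX."""
    return toy_bidirectional_map(x0, 0.0, T_MAX)


def interpolate(x_a, x_b, alpha):
    """Invert two inputs, mix them in noise space, and map back.

    Linear mixing is an assumption made for this 1-D toy.
    """
    z = (1.0 - alpha) * invert(x_a) + alpha * invert(x_b)
    return generate(z)
```

In this toy setting, `invert(generate(z))` returns `z` up to floating-point error, and splitting a jump into two hops (for example from `T_MAX` to 10 and then from 10 to 0) lands on the same point as a single jump. A learned model only approximates both properties, which is why additional steps can trade compute for lower reconstruction error.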
