Consistency Trajectory Matching for One-Step Generative Super-Resolution
Weiyi You, Mingyang Zhang, Leheng Zhang, Xingyu Zhou, Kexuan Shi, Shuhang Gu
TL;DR
This work introduces Consistency Trajectory Matching for Super-Resolution (CTMSR), a distillation-free framework that enables one-step, photo-realistic SR by learning a PF-ODE trajectory from noisy LR images to HR images through Consistency Training (CT) and refining realism with Distribution Trajectory Matching (DTM). CT directly maps points along the PF-ODE trajectory to the final HR image, avoiding pre-trained diffusion teachers, while DTM aligns the SR trajectory with that of natural images at the distribution level. The method achieves competitive or superior perceptual quality compared to diffusion-based baselines on both synthetic and real-world datasets, with markedly lower inference latency. CTMSR thus provides a scalable, backbone-independent, and efficient solution for high-quality one-step super-resolution.
Abstract
Current diffusion-based super-resolution (SR) approaches achieve commendable performance at the cost of high inference overhead. Distillation techniques are therefore used to compress a multi-step teacher model into a one-step student model. However, these methods significantly raise training costs and cap the student's performance at that of the teacher. To overcome these challenges, we propose Consistency Trajectory Matching for Super-Resolution (CTMSR), a distillation-free strategy that generates photo-realistic SR results in one step. Concretely, we first formulate a Probability Flow Ordinary Differential Equation (PF-ODE) trajectory to establish a deterministic mapping from noisy low-resolution (LR) images to high-resolution (HR) images. We then apply the Consistency Training (CT) strategy to learn this mapping directly in one step, eliminating the need for a pre-trained diffusion model. To further enhance performance and better leverage the ground truth during training, we align the distribution of SR results more closely with that of natural images. To this end, we minimize the discrepancy between their respective PF-ODE trajectories from the LR image distribution with our proposed Distribution Trajectory Matching (DTM) loss, which improves the realism of the recovered HR images. Comprehensive experiments demonstrate that the proposed method attains comparable or even superior performance on both synthetic and real datasets while maintaining minimal inference latency.
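The Consistency Training idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the trajectory, so the code assumes a simple residual-shifting form x_eta = x_HR + eta (y_LR - x_HR) + kappa sqrt(eta) eps, and the names perturb, consistency_training_loss, one_step_sr, eta, and kappa are hypothetical. It shows only the core mechanism: two points on the same trajectory (shared noise) should map to the same output, and inference then takes a single network evaluation from the noisy LR image. The DTM loss is not covered.

```python
import numpy as np


def perturb(x_hr, y_lr, eta, eps, kappa=1.0):
    """Point on an ASSUMED trajectory: eta=0 gives HR, eta=1 gives noisy LR."""
    return x_hr + eta * (y_lr - x_hr) + kappa * np.sqrt(eta) * eps


def consistency_training_loss(f_online, f_target, x_hr, y_lr, eta_lo, eta_hi, rng):
    """Squared error between predictions at two adjacent trajectory points.

    f_online and f_target are callables (x_t, eta, y_lr) -> HR estimate.
    In real training f_target would be a gradient-free copy of f_online.
    """
    # Sharing eps places both points on the same trajectory.
    eps = rng.standard_normal(x_hr.shape)
    x_hi = perturb(x_hr, y_lr, eta_hi, eps)
    x_lo = perturb(x_hr, y_lr, eta_lo, eps)
    pred = f_online(x_hi, eta_hi, y_lr)
    target = f_target(x_lo, eta_lo, y_lr)
    return float(np.mean((pred - target) ** 2))


def one_step_sr(f, y_lr, rng, kappa=1.0):
    """One-step inference: start from the noisy LR endpoint, call f once."""
    x_start = y_lr + kappa * rng.standard_normal(y_lr.shape)
    return f(x_start, 1.0, y_lr)
```

In this sketch the LR image is assumed to be already upsampled to the HR grid so that the two arrays have the same shape. Any callable with the signature (x_t, eta, y_lr) can stand in for the network, which makes the loss easy to check on toy inputs before plugging in a real model.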
