RestoRect: Degraded Image Restoration via Latent Rectified Flow & Feature Distillation
Shourya Verma, Mengbo Wang, Nadia Atallah Lanman, Ananth Grama
TL;DR
RestoRect tackles degraded image restoration by closing the gap between accuracy and efficiency with a generative knowledge-distillation framework based on latent rectified flow. The teacher–student pair uses Retinex priors, Spatial Channel Layer Normalization, and a Feature Layer Extraction (FLEX) loss to align multi-scale transformer representations across heterogeneous architectures, enabling robust feature synthesis in few steps. Under a two-phase training protocol, the student learns velocity predictors for reflectance and image features, achieving diffusion-like restoration with as few as 4 inference steps and state-of-the-art performance across 15 datasets and 4 tasks. The approach offers a scalable path to fast, high-fidelity image restoration and a generalizable paradigm for cross-architecture knowledge transfer in vision models.
Abstract
Current approaches to restoring degraded images face a trade-off: high-performance models are too slow for practical use, while fast models produce poor results. Knowledge distillation transfers teacher knowledge to students, but existing static feature-matching methods cannot capture how modern transformer architectures dynamically generate features. We propose RestoRect, a novel Latent Rectified Flow Feature Distillation method for restoring degraded images. We apply rectified flow to reformulate feature distillation as a generative process in which students learn to synthesize teacher-quality features through learnable trajectories in latent space. Our framework combines Retinex decomposition with learnable anisotropic diffusion constraints and trigonometric color space polarization. We introduce a Feature Layer Extraction loss for robust knowledge transfer between different network architectures through cross-normalized transformer feature alignment with percentile-based outlier detection. RestoRect achieves better training stability, faster convergence, and faster inference while preserving restoration quality, demonstrating superior results against baselines across 15 image restoration datasets covering 4 tasks, evaluated on 10 metrics.
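To make the idea of distillation as a generative process more concrete, the following is a minimal sketch of generic rectified flow on a toy feature vector, not the RestoRect implementation. It assumes the standard straight-line formulation, in which a velocity predictor is regressed onto the difference between a teacher feature and a noise sample and then integrated with a few Euler steps. All function names, the 8-dimensional toy feature, and the `oracle_velocity` stand-in for a trained student are hypothetical.

```python
import random


def interpolate(x0, x1, t):
    """Point at time t on the straight path from x0 (noise) to x1 (teacher feature)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Rectified-flow regression target: the constant velocity x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity at time t."""
    x_t = interpolate(x0, x1, t)
    v_pred = predict_velocity(x_t, t)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)


def euler_sample(predict_velocity, x0, num_steps=4):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-size Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = predict_velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy data: a hypothetical 8-dimensional teacher feature and a noise sample.
rng = random.Random(0)
teacher_feature = [rng.gauss(0.0, 1.0) for _ in range(8)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]


def oracle_velocity(x_t, t):
    """Stand-in for a perfectly trained student (assumption): it points from
    the current state straight at the teacher feature, for t < 1."""
    return [(b - a) / (1.0 - t) for a, b in zip(x_t, teacher_feature)]


# With an exact straight-line velocity field, 4 Euler steps reach the target.
restored = euler_sample(oracle_velocity, noise, num_steps=4)
max_error = max(abs(a - b) for a, b in zip(restored, teacher_feature))
```

In this toy setting the velocity is constant along each path, which is why a handful of Euler steps suffices. A learned predictor only approximates that field, so the number of steps needed in practice is an empirical question.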
