DEAL: Data-Efficient Adversarial Learning for High-Quality Infrared Imaging
Zhu Liu, Zijun Wang, Jinyuan Liu, Fanqi Meng, Long Ma, Risheng Liu
TL;DR
Infrared imaging suffers from dynamic, complex degradations and from data scarcity. The authors propose a data-efficient adversarial framework that jointly learns a degradation generator $\mathcal{N}_\mathtt{G}$ and an enhancer $\mathcal{N}_\mathtt{E}$ under a hierarchical mini-max objective. The generator produces the degraded input $\hat{\mathbf{x}}=\mathcal{N}_\mathtt{G}(\mathbf{x}; \bm{\theta}^{*})$, where $\bm{\theta}^{*}=\arg\max_{\bm{\theta}} \mathcal{L}(\mathcal{N}_\mathtt{E}(\mathcal{N}_\mathtt{G}(\mathbf{x}; \bm{\theta}); \bm{\omega}), \mathbf{y})$ selects the degradation parameters that maximize the restoration loss, so the enhancer is trained against adaptive, worst-case degradations. The dual-interaction network, comprising a Scale Transform Module and a Spiking-guided Separation Module, captures degraded features with compact parameters. Using only 50 training images, the method achieves state-of-the-art restoration across single and composite degradations and improves downstream tasks such as depth estimation and object detection, demonstrating strong data efficiency and practical impact in infrared imaging.
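Written out as a bilevel problem, the hierarchical mini-max objective might take the following form. Only the inner maximization over $\bm{\theta}$ is stated explicitly in this summary; the outer minimization over the enhancer parameters $\bm{\omega}$ is an assumption inferred from the term "mini-max".

$$
\min_{\bm{\omega}} \; \mathcal{L}\big(\mathcal{N}_\mathtt{E}(\mathcal{N}_\mathtt{G}(\mathbf{x}; \bm{\theta}^{*}); \bm{\omega}), \mathbf{y}\big)
\quad \text{s.t.} \quad
\bm{\theta}^{*} = \arg\max_{\bm{\theta}} \; \mathcal{L}\big(\mathcal{N}_\mathtt{E}(\mathcal{N}_\mathtt{G}(\mathbf{x}; \bm{\theta}); \bm{\omega}), \mathbf{y}\big)
$$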
Abstract
Thermal imaging is often compromised by dynamic, complex degradations caused by hardware limitations and unpredictable environmental factors. The scarcity of high-quality infrared data, coupled with these dynamic and intricate degradations, makes it difficult for existing methods to recover details. In this paper, we integrate thermal degradation simulation into the training process via a mini-max optimization, modeling the degradation factors as adversarial attacks on thermal images. The simulation dynamically maximizes the objective function and thus covers a broad spectrum of degraded data distributions. This approach enables training with limited data while improving model performance. Additionally, we introduce a dual-interaction network that combines the benefits of spiking neural networks with scale transformation to capture degraded features with sharp spike signal intensities. This architecture keeps the model parameters compact while preserving efficient feature representation. Extensive experiments demonstrate that our method, trained on only fifty clear images, not only achieves superior visual quality under diverse single and composite degradations but also significantly reduces processing cost, outperforming existing techniques in both efficiency and accuracy. The source code will be available at https://github.com/LiuZhu-CV/DEAL.
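The mini-max training described in the abstract can be illustrated with a deliberately small NumPy sketch. Everything in it is a hypothetical stand-in rather than the paper's method: the generator adds stripe noise with a single scalar amplitude `theta`, the enhancer removes a scalar fraction `omega` of the estimated stripe component, the target is assumed to equal the clean input, and gradients are taken by central finite differences instead of backpropagation. The inner loop runs projected gradient ascent on `theta` (the adversarial degradation) and the outer step runs gradient descent on `omega` (the enhancer).

```python
import numpy as np

def degrade(x, theta, pattern):
    # Hypothetical generator N_G: adds stripe noise with amplitude theta.
    return x + theta * pattern

def enhance(z, omega, pattern):
    # Hypothetical enhancer N_E: estimates the stripe amplitude by projecting
    # onto the pattern, then removes a fraction omega of it.
    amp = np.sum(z * pattern) / np.sum(pattern * pattern)
    return z - omega * amp * pattern

def loss(x, y, theta, omega, pattern):
    # Mean squared restoration error of N_E(N_G(x; theta); omega) against y.
    restored = enhance(degrade(x, theta, pattern), omega, pattern)
    return float(np.mean((restored - y) ** 2))

def num_grad(f, v, h=1e-4):
    # Central finite-difference derivative of a scalar function.
    return (f(v + h) - f(v - h)) / (2.0 * h)

def adversarial_train(x, y, pattern, theta_max=1.0, outer_steps=300,
                      inner_steps=5, lr_theta=0.5, lr_omega=0.2):
    theta, omega = 0.1, 0.0
    for _ in range(outer_steps):
        # Inner maximization: projected gradient ascent on the degradation.
        for _ in range(inner_steps):
            g = num_grad(lambda t: loss(x, y, t, omega, pattern), theta)
            theta = float(np.clip(theta + lr_theta * g, -theta_max, theta_max))
        # Outer minimization: gradient descent on the enhancer parameter.
        g = num_grad(lambda w: loss(x, y, theta, w, pattern), omega)
        omega -= lr_omega * g
    return theta, omega

# Toy data: an 8x8 "clean" image whose adjacent column pairs are equal, so it
# is orthogonal to the alternating column-stripe pattern.
rng = np.random.default_rng(0)
clean = np.repeat(rng.random((8, 4)), 2, axis=1)
stripes = np.tile([1.0, -1.0], (8, 4))

theta_star, omega_star = adversarial_train(clean, clean, stripes)
final_loss = loss(clean, clean, theta_star, omega_star, stripes)
print(theta_star, omega_star, final_loss)
```

In this toy setting the adversary pushes `theta` to the boundary of its allowed range, which gives the enhancer the largest possible training signal from a single clean image. A real implementation would replace both scalar parameters with network weights and the finite differences with automatic differentiation.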
