Enhancing Infrared Vision: Progressive Prompt Fusion Network and Benchmark

Jinyuan Liu, Zihang Chen, Zhu Liu, Zhiying Jiang, Long Ma, Xin Fan, Risheng Liu

TL;DR

The paper tackles the challenge of enhancing thermal infrared images under coupled degradations by introducing the Progressive Prompt Fusion Network (PPFN), which uses degradation-specific and type-specific prompts to adaptively modulate features, and a Selective Progressive Training (SPT) strategy to handle single and composite degradations in a staged fashion. A plug-in fusion mechanism combines prompt features to generate channel-wise modulation parameters that steer the restoration model, enabling effective restoration across diverse conditions. The authors also present HM-TIR, a high-quality multi-scenario TIR benchmark with 1,503 images, and demonstrate substantial improvements over state-of-the-art methods on both multi- and single-degradation tasks, including an $8.76\%$ PSNR gain on the Normal Set. Collectively, the work advances practical infrared image enhancement by integrating prompt-based conditioning, progressive degradation handling, and a high-fidelity dataset for robust evaluation, with broad implications for surveillance, search-and-rescue, and autonomous systems.
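To make the reported metric concrete, the following minimal sketch computes PSNR from its standard definition and a relative gain between two PSNR values. The summary does not say how the percentage is computed, so treating it as a relative change in dB is an assumption; the function names and all numbers are hypothetical and are not results from the paper.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)

def relative_gain_percent(baseline_db, ours_db):
    """Relative PSNR change in percent (assumed reading of a 'PSNR gain')."""
    return 100.0 * (ours_db - baseline_db) / baseline_db

# Hypothetical values for illustration only, not taken from the paper.
example_gain = relative_gain_percent(30.0, 33.0)
```

Under this reading, a baseline of 30 dB and a result of 33 dB would count as a 10% gain.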

Abstract

We address the relatively underexplored task of thermal infrared image enhancement. Existing infrared image enhancement methods primarily focus on tackling individual degradations, such as noise, low contrast, and blur, making it difficult for them to handle coupled degradations. Meanwhile, all-in-one enhancement methods, commonly applied to RGB sensors, often demonstrate limited effectiveness due to the significant differences in imaging models. In light of this, we first revisit the imaging mechanism and introduce a Progressive Prompt Fusion Network (PPFN). Specifically, the PPFN initially establishes prompt pairs based on the thermal imaging process. For each type of degradation, we fuse the corresponding prompt pairs to modulate the model's features, providing adaptive guidance that enables the model to better address specific degradations under single or multiple conditions. In addition, a Selective Progressive Training (SPT) mechanism is introduced to gradually refine the model's handling of composite cases and align the enhancement process, which not only allows the model to remove camera noise and retain key structural details, but also enhances the overall contrast of the thermal image. Furthermore, we introduce a high-quality, multi-scenario infrared benchmark covering a wide range of scenes. Extensive experiments substantiate that our approach not only delivers promising visual results under specific degradations but also significantly improves performance on complex degradation scenes, achieving a notable 8.76% improvement. Code is available at https://github.com/Zihang-Chen/HM-TIR.
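The idea of fusing a prompt pair and using the result to modulate features channel by channel can be illustrated with a small sketch. This is not the authors' implementation: the element-wise sum used for fusion, the dense projections, the `(1 + gamma) * x + beta` form, and every name and number below are assumptions chosen only to show the general pattern.

```python
def fuse_prompts(degradation_prompt, type_prompt):
    """Fuse a prompt pair by element-wise addition (an assumed fusion rule)."""
    return [d + t for d, t in zip(degradation_prompt, type_prompt)]

def linear(vec, weights, bias):
    """Dense layer: `weights` holds one row per output channel."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def modulate(features, gamma, beta):
    """Channel-wise affine modulation: out[c] = (1 + gamma[c]) * x + beta[c]."""
    return [
        [[(1.0 + g) * x + b for x in row] for row in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]

# Toy feature map with 2 channels of size 2x2 (hypothetical values).
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]

# Hypothetical 3-dimensional prompts for one degradation.
fused = fuse_prompts([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])

# Hypothetical projection weights mapping the fused prompt to per-channel
# scale (gamma) and shift (beta) parameters.
gamma = linear(fused, [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0])
beta = linear(fused, [[0.0, 0.0, 0.0], [1.0, 1.0, 0.0]], [0.0, 0.0])

modulated = modulate(features, gamma, beta)
```

In this toy setup the first channel is scaled by the fused prompt while the second is only shifted, which is the sense in which one set of prompt-derived parameters can steer different channels differently.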

Paper Structure

This paper contains 15 sections, 7 equations, 12 figures, 4 tables, 1 algorithm.

Figures (12)

  • Figure 1: An illustration of the thermal infrared degradation pipeline. Thermal infrared imaging is prone to degradation from external factors such as solar radiation, atmospheric scattering, and turbulence, as well as internal factors like pixel size, internal noise, and jitter.
  • Figure 2: Schematic diagram of the proposed TIR enhancement framework. In subfigure (a), we first illustrate the TIR degradation process, including low contrast, blur, and noise across single and composite degradation scenarios. Subfigures (b) and (c) present details of the PPFN integrated with the image restoration model. Lastly, we depict our SPT in subfigure (d).
  • Figure 3: Example images from our HM-TIR benchmark include: (a) skyscraper, (b) seaside, (c) mountain, (d) cross-sea bridge, (e) pendulum, (f) tower, (g) camping area, (h) commercial street, (i) mansion, (j) square, (k) Ferris wheel, (l) boats, and (m) tourist attraction.
  • Figure 4: Quantitative and qualitative comparisons of signal performance across competitive image enhancement methods and our proposed approach. The average PSNR and SSIM values in our Normal Set are provided below the comparison figures in blue.
  • Figure 4: Ablation studies on the PPFN and SPT strategy. The best is in red, and the second-best is in blue.
  • ...and 7 more figures