U-Know-DiffPAN: An Uncertainty-aware Knowledge Distillation Diffusion Framework with Details Enhancement for PAN-Sharpening

Sungpyo Kim, Jeonghyeok Do, Jaehyup Lee, Munchurl Kim

TL;DR

U-Know-DiffPAN tackles PAN-sharpening by uniting diffusion-based restoration with uncertainty-aware knowledge distillation. A high-capacity teacher (FSA-T), which uses frequency-selective attention and wavelet-based conditioning, generates high-frequency details and a spatial uncertainty map. A lightweight student (FSA-S) learns to reproduce these details under the guidance of the uncertainty map, which reduces computation. The encoder consumes compact PAN/LRMS representations and the decoder uses SWT-based conditioning to maximize the use of frequency information. Across the WV3, QB, and GF2 datasets, the approach achieves state-of-the-art performance and behaves robustly in high-uncertainty regions, though inference speed remains a challenge because of the diffusion steps. Overall, the method offers a practical path to high-quality HRMS outputs with improved efficiency and targeted detail restoration for challenging satellite imagery.

Abstract

Conventional methods for PAN-sharpening often struggle to restore fine details due to limitations in leveraging high-frequency information. Moreover, diffusion-based approaches lack sufficient conditioning to fully utilize panchromatic (PAN) images and low-resolution multispectral (LRMS) inputs effectively. To address these challenges, we propose an uncertainty-aware knowledge distillation diffusion framework with details enhancement for PAN-sharpening, called U-Know-DiffPAN. U-Know-DiffPAN incorporates uncertainty-aware knowledge distillation for effective transfer of feature details from our teacher model to a student model. The teacher model in U-Know-DiffPAN captures frequency details through frequency-selective attention, facilitating accurate learning of the reverse process. By conditioning the encoder on compact vector representations of PAN and LRMS and the decoder on wavelet transforms, we enable rich frequency utilization. The high-capacity teacher model thus distills frequency-rich features into a lightweight student model, aided by an uncertainty map. In this way, the teacher model uses the uncertainty map to guide the student model to focus on image regions that are difficult for PAN-sharpening. Extensive experiments on diverse datasets demonstrate the robustness and superior performance of U-Know-DiffPAN over very recent state-of-the-art PAN-sharpening methods.
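
The abstract does not state the distillation objective, so the following is only an illustrative sketch of how an uncertainty map could steer a student toward difficult regions. It assumes a per-pixel L1 gap between student and teacher outputs, weighted by a min-max normalized uncertainty map. The function name `uncertainty_weighted_l1`, the `1 + u` weighting, and the toy array shapes are all hypothetical and are not taken from the paper.

```python
import numpy as np

def uncertainty_weighted_l1(student_out, teacher_out, uncertainty, eps=1e-8):
    """Hypothetical distillation loss: the L1 gap to the teacher output,
    up-weighted where the teacher's uncertainty map is high."""
    # Min-max normalize the uncertainty map to [0, 1] so the weights
    # do not depend on its absolute scale.
    u = uncertainty - uncertainty.min()
    u = u / (u.max() + eps)
    # Every pixel contributes; the most uncertain pixels count about twice.
    weights = 1.0 + u
    return float(np.mean(weights * np.abs(student_out - teacher_out)))

# Toy data: 4 spectral bands on an 8x8 grid.
rng = np.random.default_rng(0)
teacher = rng.random((4, 8, 8))
student = teacher + 0.1            # student is off by 0.1 everywhere

flat = np.ones((1, 8, 8))          # uniform uncertainty: no pixel is favored
peaked = np.zeros((1, 8, 8))
peaked[0, 2:4, 2:4] = 1.0          # one small high-uncertainty patch

loss_flat = uncertainty_weighted_l1(student, teacher, flat)
loss_peaked = uncertainty_weighted_l1(student, teacher, peaked)
```

With a uniform map the loss reduces to a plain L1 distance, while the peaked map raises the loss slightly because the same error now costs more inside the uncertain patch. Any monotone weighting would produce the same qualitative effect; the additive form is chosen here only for simplicity.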

Paper Structure

This paper contains 24 sections, 27 equations, 15 figures, 8 tables.

Figures (15)

  • Figure 1: Visual comparison of PAN-sharpening results on the full-resolution WV3 dataset. The rightmost image shows the output of our proposed U-Know-DiffPAN framework, specifically the result produced by FSA-S (the frequency-selective attention student network). The proposed framework generates more detailed and robust results, particularly in high-uncertainty regions, outperforming the state-of-the-art model CANConv [duan2024content] and the recent diffusion-based methods PanDiff [meng2023pandiff] and TMDiff [xing2024empower]. As highlighted in the red box, our approach successfully restores challenging, highly uncertain regions, such as cars, where other models fall short.
  • Figure 2: Overview of our uncertainty-aware knowledge-distillation diffusion framework with details enhancement, called U-Know-DiffPAN.
  • Figure 3: Architecture of our proposed teacher model with frequency-selective attention, denoted as FSA-T, for PAN-sharpening. FSA-T is designed to fully utilize frequency information from the PAN and LRMS inputs for details enhancement (more details in the Supplemental).
  • Figure 4: Visual comparison of PAN-sharpening results on the reduced-resolution GF2 dataset. The first row shows RGB renderings of the outputs $\hat{\textbf{I}}^\text{HR}_\text{MS}$, and the second row displays the error map, i.e., the difference between the output $\hat{\textbf{I}}^\text{HR}_\text{MS}$ and the ground truth $\textbf{I}^\text{HR}_\text{MS}$. Both FSA-T and FSA-S achieve more detailed results than state-of-the-art models.
  • Figure 5: Visualization of the uncertainty map $\hat{\bm\theta}$, the error map, and the ground truth $\textbf{I}_\text{MS}^\text{HR}$ on the reduced-resolution WV3 dataset.
  • ...and 10 more figures