Pan-LUT: Efficient Pan-sharpening via Learnable Look-Up Tables

Zhongnan Cai, Yingying Wang, Hui Zheng, Panwang Pan, ZiXu Lin, Ge Meng, Chenxin Li, Chunming He, Jiaxin Xie, Yunlong Lin, Junbin Lu, Yue Huang, Xinghao Ding

TL;DR

Pan-LUT introduces a learnable LUT-based pan-sharpening framework that replaces deep neural networks with three specialized LUTs to balance spectral fidelity, spatial detail, and adaptive channel fusion. The PAN-guided LUT, Spatial Details LUT, and Adaptive Output LUT enable efficient processing of very large remote-sensing images while delivering competitive results against traditional and some DL methods, with near real-time inference. Extensive experiments across multiple satellite datasets and ablations validate the contributions of each LUT and the effectiveness of the regularization strategy. This approach bridges high-quality pan-sharpening with practical deployment on resource-constrained platforms, enabling large-scale remote sensing applications.

Abstract

Recently, deep learning-based pan-sharpening algorithms have achieved notable advancements over traditional methods. However, deep learning-based methods incur substantial computational overhead during inference, especially with large images. This excessive computational demand limits the applicability of these methods in real-world scenarios, particularly in the absence of dedicated computing devices such as GPUs and TPUs. To address these challenges, we propose Pan-LUT, a novel learnable look-up table (LUT) framework for pan-sharpening that strikes a balance between performance and computational efficiency for large remote sensing images. Our method makes it possible to process 15K$\times$15K remote sensing images on a 24GB GPU. To finely control the spectral transformation, we devise the PAN-guided look-up table (PGLUT) for channel-wise spectral mapping. To effectively capture fine-grained spatial details, we introduce the spatial details look-up table (SDLUT). Furthermore, to adaptively aggregate channel information for generating high-resolution multispectral images, we design an adaptive output look-up table (AOLUT). Our model contains fewer than 700K parameters and processes a 9K$\times$9K image in under 1 ms using one RTX 2080 Ti GPU, demonstrating significantly faster performance compared to other methods. Experiments reveal that Pan-LUT efficiently processes large remote sensing images in a lightweight manner, bridging the gap to real-world applications. Furthermore, our model surpasses SOTA methods in full-resolution scenes under real-world conditions, highlighting its effectiveness and efficiency.
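
The abstract describes PGLUT as a channel-wise spectral mapping guided by the PAN image but does not give its exact form. As a rough illustration only, the sketch below assumes one 2D table per spectral channel, indexed by the normalized MS value and the normalized PAN value, and read with bilinear interpolation. The table layout and the names `make_identity_lut`, `lut_lookup`, and `apply_lut_per_channel` are hypothetical and are not taken from the paper.

```python
def make_identity_lut(size):
    """Build a size x size table where entry [i][j] equals the MS grid value.

    Assumption: axis 0 indexes the MS value, axis 1 indexes the PAN value,
    both normalized to [0, 1]. An identity start leaves the MS value unchanged.
    """
    step = 1.0 / (size - 1)
    return [[i * step for _ in range(size)] for i in range(size)]


def lut_lookup(lut, ms_value, pan_value):
    """Read the table at (ms_value, pan_value) with bilinear interpolation."""
    size = len(lut)
    # Clamp inputs to [0, 1] and convert them to continuous grid coordinates.
    x = min(max(ms_value, 0.0), 1.0) * (size - 1)
    y = min(max(pan_value, 0.0), 1.0) * (size - 1)
    # Lower corner of the enclosing cell, kept inside the table.
    i0 = min(int(x), size - 2)
    j0 = min(int(y), size - 2)
    dx, dy = x - i0, y - j0
    # Interpolate along the PAN axis first, then along the MS axis.
    low = lut[i0][j0] * (1 - dy) + lut[i0][j0 + 1] * dy
    high = lut[i0 + 1][j0] * (1 - dy) + lut[i0 + 1][j0 + 1] * dy
    return low * (1 - dx) + high * dx


def apply_lut_per_channel(luts, ms_pixels, pan_pixels):
    """Map every pixel channel-wise; luts[c] is the table for channel c."""
    return [
        [lut_lookup(luts[c], pixel[c], pan) for c in range(len(luts))]
        for pixel, pan in zip(ms_pixels, pan_pixels)
    ]


if __name__ == "__main__":
    luts = [make_identity_lut(5) for _ in range(4)]  # 4 bands is an assumption
    ms = [(0.1, 0.2, 0.3, 0.4)]
    pan = [0.7]
    print(apply_lut_per_channel(luts, ms, pan))
```

Because the interpolated output is a weighted sum of table entries, it is differentiable with respect to those entries, which is what lets a table of this kind be trained by gradient descent. Inference then costs only an index computation and a few multiply-adds per pixel.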

Paper Structure

This paper contains 26 sections, 38 equations, 9 figures, 5 tables.

Figures (9)

  • Figure 1: Comparisons of computational efficiency. Our method can process 9K$\times$9K and 15K$\times$15K images on GPUs with 11GB and 24GB memory, respectively. Meanwhile, we observe that (a) DNN-based methods are highly sensitive to the image size, and (b) in the absence of a GPU, they require a considerable amount of time to process images. In the CPU inference time experiments, all methods were run on a workstation equipped with an Intel(R) Xeon(R) Gold 6226R CPU.
  • Figure 2: The overall framework of our proposed Pan-LUT. PGLUT is a spectral transformation LUT designed to extract spectral information, SDLUT is a spatial detail transformation LUT for capturing texture features, and AOLUT is an adaptive output LUT used to aggregate channel information.
  • Figure 3: Visual comparison on WorldView-III dataset. The last row visualizes the MSE residuals between the pan-sharpening results and the ground truth.
  • Figure 4: Visual comparison on the real full-resolution scenes from the WorldView-II dataset. For a more detailed examination of the results, we provide zoomed-in views of specific parts of the images.
  • Figure 5: Ablation studies on different sizes of the PGLUT, SDLUT and AOLUT on the WorldView-III dataset.
  • ...and 4 more figures
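
To put the image sizes quoted in Figure 1 in perspective, the following back-of-the-envelope sketch estimates the raw memory of a single dense tensor at 15K$\times$15K resolution. The band count (8), the float32 storage, and the 32-channel feature map are illustrative assumptions, not figures reported by the paper, and the helper `tensor_gib` is hypothetical.

```python
def tensor_gib(height, width, channels, bytes_per_value=4):
    """Size in GiB of a dense height x width x channels tensor (float32 by default)."""
    return height * width * channels * bytes_per_value / 2**30


if __name__ == "__main__":
    side = 15_000
    bands = 8              # assumed number of multispectral bands
    feature_channels = 32  # hypothetical width of one DNN feature map

    print(f"output image: {tensor_gib(side, side, bands):.2f} GiB")
    print(f"one feature map: {tensor_gib(side, side, feature_channels):.2f} GiB")
```

Under these assumptions the output image alone takes roughly 6.7 GiB, while a single 32-channel full-resolution feature map takes roughly 26.8 GiB, which is more than a 24GB GPU holds. This is consistent with the caption's observation that DNN-based methods are highly sensitive to image size, whereas a per-pixel table lookup needs no such intermediate tensors.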