Taming Flow Matching with Unbalanced Optimal Transport into Fast Pansharpening

Zihan Cao, Yu Zhong, Liang-Jian Deng

TL;DR

This work introduces Optimal Transport Flow Matching (OTFM), which leverages a regularized unbalanced optimal transport (UOT) formulation within a flow-matching framework to enable one-step, diffusion-free pansharpening. By relaxing marginal constraints via UOT and incorporating task-specific regularization, the model achieves high-quality fusion from PAN and LRMS data with simulation-free training and a single inference step. The approach yields competitive or superior results compared with state-of-the-art diffusion-based and regression-based methods across multiple datasets, while delivering substantially lower latency. The combination of a mapping network and a potential network enables stable training and robust generalization, facilitating practical deployment for real-time remote sensing pansharpening tasks.

Abstract

Pansharpening, a pivotal task in remote sensing for fusing high-resolution panchromatic and multispectral imagery, has garnered significant research interest. Recent advancements employing diffusion models based on stochastic differential equations (SDEs) have demonstrated state-of-the-art performance. However, the inherent multi-step sampling process of SDEs imposes substantial computational overhead, hindering practical deployment. While existing methods adopt efficient samplers, knowledge distillation, or retraining to reduce sampling steps (e.g., from 1,000 to fewer steps), such approaches often compromise fusion quality. In this work, we propose the Optimal Transport Flow Matching (OTFM) framework, which integrates the dual formulation of unbalanced optimal transport (UOT) to achieve one-step, high-quality pansharpening. Unlike conventional OT formulations that enforce rigid distribution alignment, UOT relaxes marginal constraints to enhance modeling flexibility, accommodating the intrinsic spectral and spatial disparities in remote sensing data. Furthermore, we incorporate task-specific regularization into the UOT objective, enhancing the robustness of the flow model. The OTFM framework enables simulation-free training and single-step inference while maintaining strict adherence to pansharpening constraints. Experimental evaluations across multiple datasets demonstrate that OTFM matches or exceeds the performance of previous regression-based models and leading diffusion-based methods while requiring only one sampling step. Code is available at https://github.com/294coder/PAN-OTFM.

Paper Structure

This paper contains 23 sections, 2 theorems, 18 equations, 3 figures, 5 tables, 1 algorithm.

Key Result

Proposition 3.1

The UOT dual formulation, $C_{UOT}(\mathbb P,\mathbb Q)$, can be obtained by using the $c$-transform, where $f$ is the entropy function and $v^c$ is the $c$-transform of $v$ as defined in the paper's $c$-transform equation.
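
For orientation, the forms below are the ones commonly stated in the UOT literature for the $c$-transform and the resulting semi-dual objective. They are a sketch under assumptions, not a transcription of the paper's proposition: the cost $c$, the spaces $\mathcal{X}$ and $\mathcal{Y}$, and the convex conjugate $f^{*}$ of the entropy function $f$ are notation introduced here, and the paper may use a different entropy function for each marginal.

```latex
% Assumed standard forms from the UOT literature; not copied from the paper.
% c-transform of a potential v with respect to a cost c:
v^{c}(x) = \inf_{y \in \mathcal{Y}} \bigl[ c(x, y) - v(y) \bigr]

% Semi-dual UOT objective, with f^{*} the convex conjugate of f:
C_{UOT}(\mathbb{P}, \mathbb{Q})
  = \sup_{v} \left[
      \int_{\mathcal{X}} -f^{*}\bigl(-v^{c}(x)\bigr) \, d\mathbb{P}(x)
    + \int_{\mathcal{Y}} -f^{*}\bigl(-v(y)\bigr) \, d\mathbb{Q}(y)
    \right]
```

In this form only the single potential $v$ is optimized, which is consistent with the summary's description of one potential network paired with a mapping network.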

Figures (3)

  • Figure 1: Key distinction from previous diffusion models. Traditional diffusion models typically sample from a Gaussian distribution, requiring numerous iterative steps (e.g., 1000) to achieve results. In contrast, our OTFM harnesses the power of unbalanced optimal transport, enabling high-quality pansharpening with just one sampling step.
  • Figure 2: (a) Training and one-step sampling diagrams of the proposed OTFM. Owing to the flow-matching velocity construction (see the flow-matching loss equation), the UOT-mapped $\hat{y}_1$ can be obtained directly from the predicted velocity $s_{\theta,t}(y_t,t,\cdot)$. A potential network $v_{\varphi}(\cdot)$ is parameterized from the UOT dual formulation (see Proposition 3.1) and supports training a one-step mapping network. Note that the PAN and LRMS conditions of the mapping network are omitted for simplicity. (b) Mapping network design for pansharpening. The network adopts a U-Net (Ronneberger et al., 2015) architecture. To inject the conditions (i.e., $[p,m]$), zero-AdaLN is used to scale, shift, and gate the feature $x$.
  • Figure 3: Visual comparisons on WV3 (rows 1-2) and GaoFen-2 (rows 3-4) cases. The second and fourth rows are error maps.
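
The remark in the Figure 2 caption, that $\hat{y}_1$ follows directly from the predicted velocity, can be illustrated with a minimal sketch. It assumes the common linear flow-matching path $y_t = (1-t)\,y_0 + t\,y_1$ with target velocity $y_1 - y_0$, which the caption does not spell out; under that assumption $\hat{y}_1 = y_t + (1-t)\,s$. The names `interpolate`, `one_step_estimate`, and `oracle_velocity` are hypothetical, and the oracle is a toy stand-in for the trained mapping network.

```python
from typing import Callable, List

Vector = List[float]


def interpolate(y0: Vector, y1: Vector, t: float) -> Vector:
    """Point on the assumed linear path y_t = (1 - t) * y0 + t * y1."""
    return [(1.0 - t) * a + t * b for a, b in zip(y0, y1)]


def one_step_estimate(
    velocity: Callable[[Vector, float], Vector], y_t: Vector, t: float
) -> Vector:
    """Estimate the endpoint y_1 from a single velocity evaluation.

    Under the assumed linear path the velocity equals y1 - y0, hence
    y1 = y_t + (1 - t) * velocity(y_t, t).
    """
    s = velocity(y_t, t)
    return [y + (1.0 - t) * v for y, v in zip(y_t, s)]


# Toy data: y0 plays the role of the source sample, y1 the target image.
y0 = [0.0, 1.0, 2.0]
y1 = [1.0, 3.0, 5.0]


def oracle_velocity(y_t: Vector, t: float) -> Vector:
    """Hypothetical stand-in for the trained network: the exact velocity."""
    return [b - a for a, b in zip(y0, y1)]


t = 0.25
y_t = interpolate(y0, y1, t)
y1_hat = one_step_estimate(oracle_velocity, y_t, t)
```

With an exact velocity the estimate recovers `y1` from any `t`, including `t = 0`, which is the single-step sampling case. A learned velocity would only approximate this.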

Theorems & Definitions (4)

  • Remark 3.1: Properties of UOT
  • Proposition 3.1: Dual formulation of UOT
  • Remark 3.2: Possible choices of function $f$
  • Proposition 3.2: Saddle points of pansharpening-regularized UOT provide the OT maps
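
Remark 3.2 concerns the choice of the entropy function $f$. As an illustration only, the sketch below takes two standard choices from the UOT literature, the KL and $\chi^2$ entropy functions, together with their closed-form convex conjugates $f^{*}(y) = \sup_{x \ge 0}\,(xy - f(x))$, and compares the closed forms against a brute-force grid search. Whether the paper uses exactly these choices is an assumption, and all function names are hypothetical.

```python
import math
from typing import Callable


def kl_entropy(x: float) -> float:
    """KL entropy function f(x) = x log x - x + 1 for x >= 0, with f(0) = 1."""
    return 1.0 if x == 0.0 else x * math.log(x) - x + 1.0


def kl_conjugate(y: float) -> float:
    """Closed-form conjugate of the KL entropy: f*(y) = exp(y) - 1."""
    return math.exp(y) - 1.0


def chi2_entropy(x: float) -> float:
    """Chi-squared entropy function f(x) = (x - 1)^2 for x >= 0."""
    return (x - 1.0) ** 2


def chi2_conjugate(y: float) -> float:
    """Closed-form conjugate: y^2 / 4 + y for y >= -2, otherwise -1."""
    return y * y / 4.0 + y if y >= -2.0 else -1.0


def numeric_conjugate(
    f: Callable[[float], float], y: float, x_max: float = 10.0, steps: int = 100_000
) -> float:
    """Brute-force sup over x in [0, x_max] of x * y - f(x) on a uniform grid."""
    best = -math.inf
    for i in range(steps + 1):
        x = x_max * i / steps
        best = max(best, x * y - f(x))
    return best


# Largest gap between closed form and grid search over a few sample points.
samples = [-3.0, -1.0, 0.0, 0.5, 1.0]
kl_gap = max(abs(numeric_conjugate(kl_entropy, y) - kl_conjugate(y)) for y in samples)
chi2_gap = max(abs(numeric_conjugate(chi2_entropy, y) - chi2_conjugate(y)) for y in samples)
```

The grid search is restricted to `[0, x_max]`, so the comparison is only meaningful for values of `y` whose maximizer lies inside that interval, as is the case for the sample points used here.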