Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation

Xingguang Zhang, Nicholas Chimitt, Xijun Wang, Yu Yuan, Stanley H. Chan

TL;DR

Atmospheric turbulence severely degrades long-range video, which hampers both restoration and downstream tasks. The authors introduce MambaTM, a turbulence mitigation framework that jointly estimates a latent phase distortion (LPD) map and restores the video with Selective State Space Models (Mamba), which offer linear complexity and a global receptive field. The LPD representation is learned via a VAE and steers a phase-distortion-guided SSM; it improves degradation estimation and enables efficient end-to-end training through a ReBlurNet re-degradation pathway. Across synthetic and real turbulence benchmarks, MambaTM reports state-of-the-art restoration quality with significantly faster inference and robust generalization, offering a practical solution for real-world video turbulence mitigation.

Abstract

Atmospheric turbulence is a major source of image degradation in long-range imaging systems. Although numerous deep learning-based turbulence mitigation (TM) methods have been proposed, many are slow, memory-hungry, and do not generalize well. In the spatial domain, methods based on convolutional operators have a limited receptive field, so they cannot capture the long-range spatial dependencies that turbulence induces. In the temporal domain, methods relying on self-attention can, in theory, leverage the lucky effects of turbulence, but their quadratic complexity makes them difficult to scale to many frames. Traditional recurrent aggregation methods face parallelization challenges. In this paper, we present a new TM method based on two concepts: (1) A turbulence mitigation network based on the Selective State Space Model (MambaTM). MambaTM provides a global receptive field in each layer across spatial and temporal dimensions while maintaining linear computational complexity. (2) Learned Latent Phase Distortion (LPD). LPD guides the state space model. Unlike classical Zernike-based representations of phase distortion, the new LPD map uniquely captures the actual effects of turbulence, significantly improving the model's capability to estimate degradation by reducing the ill-posedness. Our proposed method outperforms current state-of-the-art networks on various synthetic and real-world TM benchmarks while offering significantly faster inference.
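
To make the "global receptive field with linear complexity" claim concrete, the following is a minimal NumPy sketch of a generic selective state space scan over a 1D token sequence. It is not the MambaTM implementation: the function name `selective_scan`, the projection matrices `W_delta`, `W_B`, `W_C`, and the simplified discretization are all assumptions made for illustration, and the LPD guidance and the space-first, time-first, and Hilbert scan orders are omitted.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)); keeps the step size positive.
    return np.logaddexp(0.0, x)

def selective_scan(x, A, W_delta, W_B, W_C):
    """Hypothetical selective SSM scan (illustrative, not the paper's code).

    x:       (T, D) sequence of T tokens with D channels
    A:       (D, N) negative diagonal state matrix, N states per channel
    W_delta: (D, D) projection producing the per-channel step size
    W_B:     (D, N) projection producing the input-dependent input matrix
    W_C:     (D, N) projection producing the input-dependent output matrix
    Returns y of shape (T, D).
    """
    T, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))              # hidden state carried across tokens
    y = np.empty((T, D))
    for t in range(T):                # one pass: cost grows linearly with T
        delta = softplus(x[t] @ W_delta)      # (D,) input-dependent step
        B = x[t] @ W_B                        # (N,)
        C = x[t] @ W_C                        # (N,)
        A_bar = np.exp(delta[:, None] * A)    # (D, N) discretized decay
        B_bar = delta[:, None] * B[None, :]   # (D, N) simplified Euler step
        h = A_bar * h + B_bar * x[t][:, None] # state update
        y[t] = h @ C                          # (D,) readout
    return y

rng = np.random.default_rng(0)
T, D, N = 16, 4, 8
x = rng.standard_normal((T, D))
A = -np.exp(rng.standard_normal((D, N)))      # negative entries give decay
W_delta = 0.1 * rng.standard_normal((D, D))
W_B = 0.1 * rng.standard_normal((D, N))
W_C = 0.1 * rng.standard_normal((D, N))
y = selective_scan(x, A, W_delta, W_B, W_C)
print(y.shape)
```

Each output token depends on every earlier token through the carried state `h`, which is the sense in which a single scan sees the whole prefix, while the loop visits each token once. Self-attention, by contrast, compares every pair of tokens. The explicit Python loop is for readability only; practical implementations use a parallel scan.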

Paper Structure

This paper contains 28 sections, 7 equations, 9 figures, 10 tables.

Figures (9)

  • Figure 1: The proposed MambaTM network. RDB denotes the residual dense block [Zhang_2021_a], and CAB denotes the channel attention block [Zhang_2018_a]. SFMB, TFMB, and LHMB denote the space-first, time-first, and local Hilbert Mamba blocks. The initial "G" indicates "guided". Please zoom in for a better view.
  • Figure 2: The learning scheme of the latent phase distortion representation and the ReBlur Network. Both the Zernike encoder and ReBlurNet are tiny NAFNets [chen2022simple], and the encoder part of ReBlurNet is modulated by the latent blur feature $\mathbf{b}$. Please zoom in for a better view.
  • Figure 3: Qualitative comparison on the turbulence-text dataset UG2. The input frames (a) are from the 1st and 41st sequences in UG2. Note that although (f) has stronger contrast, it still contains much more color noise, which can be observed more clearly by zooming in.
  • Figure 4: Qualitative comparison on the BRIAR dataset [cornett2023expanding]. The subject has given consent to publish the image.
  • Figure 5: Example of the blur representation in a real-world case from the BRIAR dataset [cornett2023expanding]. We can view $\boldsymbol{\mu}$ as the blur strength bias, $\boldsymbol{\sigma}$ as the uncertainty measurement, and $\widetilde{\mathbf{a}}$ as a sample of the scene-invariant turbulence profile.
  • ...and 4 more figures