DeTurb: Atmospheric Turbulence Mitigation with Deformable 3D Convolutions and 3D Swin Transformers

Zhicheng Zou, Nantheera Anantrasirichai

TL;DR

This paper proposes a new framework that combines geometric restoration with an enhancement module. It outperforms the state of the art on both synthetic and real atmospheric turbulence effects, with reasonable speed and model size.

Abstract

Atmospheric turbulence in long-range imaging significantly degrades the quality and fidelity of captured scenes due to random variations in both spatial and temporal dimensions. These distortions present a formidable challenge across various applications, from surveillance to astronomy, necessitating robust mitigation strategies. While model-based approaches achieve good results, they are very slow. Deep learning approaches show promise in image and video restoration but have struggled to address these spatiotemporal variant distortions effectively. This paper proposes a new framework that combines geometric restoration with an enhancement module. Random perturbations and geometric distortion are removed using a pyramid architecture with deformable 3D convolutions, resulting in aligned frames. These frames are then used to reconstruct a sharp, clear image via a multi-scale architecture of 3D Swin Transformers. The proposed framework demonstrates superior performance over the state of the art for both synthetic and real atmospheric turbulence effects, with reasonable speed and model size.
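
The two-stage idea in the abstract (first align the frames, then fuse the aligned frames into one clean image) can be illustrated with a toy sketch. This is not the authors' method: a brute-force search over global integer horizontal shifts stands in for the learned non-rigid registration with deformable 3D convolutions, and a per-pixel temporal mean stands in for the 3D Swin Transformer enhancement. All function names (`shift_frame`, `align`, `fuse`, `restore`) and the `max_shift` parameter are hypothetical and chosen only for illustration.

```python
# Toy align-then-fuse pipeline (assumption: NOT the DeTurb architecture).
# Frames are 2D lists of numbers; distortion is modelled as a circular
# horizontal shift, which is far simpler than real turbulence.

def shift_row(row, s):
    """Circularly shift a 1D list right by s samples (left if s < 0)."""
    n = len(row)
    return [row[(i - s) % n] for i in range(n)]

def shift_frame(frame, s):
    """Apply the same horizontal shift to every row of a frame."""
    return [shift_row(row, s) for row in frame]

def ssd(a, b):
    """Sum of squared differences between two equally sized frames."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def align(frames, reference, max_shift=2):
    """Stage 1 stand-in: pick, per frame, the shift that best matches the reference."""
    aligned = []
    for frame in frames:
        best = min(range(-max_shift, max_shift + 1),
                   key=lambda s: ssd(shift_frame(frame, s), reference))
        aligned.append(shift_frame(frame, best))
    return aligned

def fuse(frames):
    """Stage 2 stand-in: per-pixel temporal mean of the frames."""
    t = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / t for x in range(width)]
            for y in range(height)]

def restore(frames, max_shift=2):
    """Align all frames to the middle frame, then fuse them."""
    reference = frames[len(frames) // 2]
    return fuse(align(frames, reference, max_shift))

# Usage: three copies of a small image, shifted by -1, 0, and +1 pixels.
clean = [[0, 0, 1, 5, 1, 0, 0, 0],
         [0, 0, 1, 5, 1, 0, 0, 0]]
frames = [shift_frame(clean, s) for s in (-1, 0, 1)]
restored = restore(frames)
blurred = fuse(frames)  # fusing without alignment smears the peak
```

In this toy setting, fusing without alignment blurs the bright column, while aligning first recovers the original image. That is the intuition behind registering frames before enhancement; the actual framework handles spatially varying, non-rigid distortions that a single global shift cannot model.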

Paper Structure

This paper contains 19 sections, 3 equations, 5 figures, and 5 tables.

Figures (5)

  • Figure 1: Diagram of the proposed DeTurb. Top row: end-to-end framework comprising the non-rigid registration and the feature fusion modules. Bottom row: block diagrams of the 3D Swin Transformer encoder and decoder blocks.
  • Figure 2: Subjective results of a synthetic scene. The bottom of each picture shows a magnified version of the straight lines.
  • Figure 3: Subjective results of a real distorted scene. The left column shows a static scene, while the other columns depict dynamic scenes. From top to bottom, the rows display the distorted input, BasicVSR++ results, TMT results, our DeTurb results, and CLEAR results.
  • Figure 4: Subjective results (similar to Fig. 11 in the DATUM paper).
  • Figure 5: Example $y$-$t$ planes of the static 'Man' scene restored without and with the non-rigid registration module.