Dynamic Background Reconstruction via MAE for Infrared Small Target Detection

Jingchao Peng, Haitao Zhao, Kaijie Zhao, Zhongze Wang, Lujian Yao

TL;DR

This work tackles infrared small-target detection in cluttered backgrounds by reconstructing a clean background and comparing it to the original image. The proposed Dynamic Background Reconstruction (DBR) framework combines a Dynamic Shift Window that prevents target fragmentation, an MAE-based Background Reconstruction module with grid masking, and a densely connected Transformer Detection Head guided by a Weighted Dice Loss. The approach achieves state-of-the-art F1-scores on MFIRST and SIRST (64.10% and 75.01%, respectively) at competitive speed and is robust to background complexity. The results suggest that integrating background reconstruction as a core ISTD module can significantly improve detection accuracy and reliability in challenging environments.

Abstract

Infrared small target detection (ISTD) under complex backgrounds is difficult because targets are hard to distinguish from the background. Background reconstruction is one way to address this problem. This paper proposes an ISTD method based on background reconstruction, called Dynamic Background Reconstruction (DBR). DBR consists of three modules: a dynamic shift window module (DSW), a background reconstruction module (BR), and a detection head (DH). BR exploits the ability of Vision Transformers to reconstruct missing patches and adopts a grid masking strategy with a masking ratio of 50% to reconstruct clean, target-free backgrounds. Because splitting one target across two neighboring patches causes reconstruction to fail, DSW is applied before input embedding: it calculates offsets and dynamically shifts the infrared image accordingly. To reduce false positive (FP) cases caused by mistaking reconstruction errors for targets, DH uses a densely connected Transformer structure to further improve detection performance. Experimental results show that DBR achieves the best F1-score on two ISTD datasets, MFIRST (64.10%) and SIRST (75.01%).
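
The two ideas in the abstract that lend themselves to a small illustration are the 50% grid mask and the reason for shifting the image before it is split into patches. The following is a minimal sketch, not the authors' implementation. It assumes the grid mask is a checkerboard over the patch grid, since the abstract only states "grid masking" at a 50% ratio. It also assumes a patch size of 16 pixels and a hand-picked offset of 4 pixels. The helper names `grid_mask`, `masking_ratio`, and `patches_touched` are hypothetical, and how DSW actually computes its offsets is not described in the abstract.

```python
def grid_mask(rows, cols):
    """Checkerboard mask over a rows x cols patch grid (True = masked).

    Assumption: the "grid" layout is a checkerboard; the abstract only
    states a grid masking strategy with a 50% masking ratio.
    """
    return [[(r + c) % 2 == 1 for c in range(cols)] for r in range(rows)]


def masking_ratio(mask):
    """Fraction of patches that are masked."""
    flat = [cell for row in mask for cell in row]
    return sum(flat) / len(flat)


def patches_touched(x0, x1, patch, offset=0):
    """Count the patches along one axis covered by pixels x0..x1 (inclusive)
    after the image has been shifted by `offset` pixels."""
    first = (x0 + offset) // patch
    last = (x1 + offset) // patch
    return last - first + 1


mask = grid_mask(4, 4)
ratio = masking_ratio(mask)

# A 3-pixel-wide target at columns 15..17 straddles the boundary between
# two 16-pixel patches (assumed patch size). A hypothetical offset of
# 4 pixels moves the whole target inside a single patch.
split_before = patches_touched(15, 17, patch=16)
split_after = patches_touched(15, 17, patch=16, offset=4)
```

In this toy setting the unshifted target spans two patches, so no single masked patch contains it in full. After the shift it fits inside one patch, which is the situation the abstract says DSW is designed to produce.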

Paper Structure (18 sections, 10 equations, 11 figures, 2 tables, 3 algorithms)

Figures (11)

  • Figure 1: An ISTD method based on background reconstruction and its shortcoming.
  • Figure 2: The architecture of the Dynamic Background Reconstruction method (DBR).
  • Figure 3: The structure of the dynamic shift window module (DSW). The blue circle represents the target, and the sketch represents the background.
  • Figure 4: The difference between one-hot encoding, label smoothing, and ours.
  • Figure 5: The difference between DSW for a single target and multiple targets.
  • ...and 6 more figures