Motion-Enhanced Nonlocal Similarity Implicit Neural Representation for Infrared Dim and Small Target Detection
Pei Liu, Yisi Luo, Wenzhen Wang, Xiangyong Cao
TL;DR
The paper tackles infrared dim and small target detection in dynamic multi-frame scenes by proposing an unsupervised motion-enhanced nonlocal similarity implicit neural representation (optNL-INR). The approach combines optical-flow motion estimation, dynamic multi-frame fusion, nonlocal patch grouping, and a Tucker-decomposed INR background model whose factors are produced by SIREN networks, optimized with a 3D total variation (3DTV)-regularized alternating direction method of multipliers (ADMM). The authors give theoretical results on the existence and spatial-temporal smoothness of the nonlocal INR representation and on the convergence of the ADMM scheme, and report state-of-the-art performance on the ATR and Anti-UAV datasets, with strong robustness and unsupervised generalization. The improved target-background separation is relevant to surveillance and search-and-rescue applications.
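To make the Tucker-decomposed INR idea concrete, here is a minimal NumPy sketch of a tensor whose three factor matrices come from small sine-activated (SIREN-style) networks evaluated on coordinate grids. It is an illustration under stated assumptions, not the authors' implementation: the names `init_siren`, `siren_factor`, and `tucker_inr`, the layer sizes, the ranks, and `omega0 = 30` are all hypothetical, and the weights are random rather than fitted by the ADMM procedure described in the paper.

```python
import numpy as np

def init_siren(rng, hidden, rank, omega0=30.0):
    """Random weights for a 1 -> hidden -> hidden -> rank sine MLP (SIREN-style init)."""
    sizes = [1, hidden, hidden, rank]
    layers = []
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        # First layer uses a wider uniform range; later layers are scaled by omega0.
        bound = 1.0 / n_in if i == 0 else np.sqrt(6.0 / n_in) / omega0
        W = rng.uniform(-bound, bound, size=(n_in, n_out))
        layers.append((W, np.zeros(n_out)))
    return layers

def siren_factor(coords, layers, omega0=30.0):
    """Map 1-D coordinates to factor rows: one row of length `rank` per coordinate."""
    h = np.asarray(coords, dtype=float).reshape(-1, 1)
    for W, b in layers[:-1]:
        h = np.sin(omega0 * (h @ W + b))  # sine activation on hidden layers
    W, b = layers[-1]
    return h @ W + b                      # final layer is linear

def tucker_inr(core, factor_nets, grids):
    """Evaluate X[i,j,k] = sum_abc core[a,b,c] * U[i,a] * V[j,b] * T[k,c]."""
    U, V, T = (siren_factor(g, net) for g, net in zip(grids, factor_nets))
    return np.einsum("abc,ia,jb,kc->ijk", core, U, V, T)

rng = np.random.default_rng(0)
ranks = (3, 3, 2)                          # assumed Tucker ranks, for illustration only
core = rng.standard_normal(ranks)
nets = [init_siren(rng, hidden=16, rank=r) for r in ranks]

# The same networks can be sampled on any grid, which is what makes the model continuous.
coarse = tucker_inr(core, nets, [np.linspace(-1, 1, n) for n in (8, 8, 5)])
fine = tucker_inr(core, nets, [np.linspace(-1, 1, n) for n in (16, 16, 5)])
print(coarse.shape, fine.shape)
```

Because each factor is a function of a continuous coordinate, the low Tucker rank is built into the representation and the resolution is chosen only at evaluation time. In the paper's setting the core and network weights would be fitted to the nonlocal patch tensor rather than drawn at random.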
Abstract
Infrared dim and small target detection is challenging because of dynamic multi-frame scenarios and weak target signatures in the infrared modality. Traditional low-rank plus sparse models often fail to capture dynamic backgrounds and global spatial-temporal correlations, which results in background leakage or target loss. In this paper, we propose a novel motion-enhanced nonlocal similarity implicit neural representation (INR) framework to address these challenges. First, we integrate motion estimation via optical flow to capture subtle target movements, and propose multi-frame fusion to enhance motion saliency. Second, we leverage nonlocal similarity to construct patch tensors with strong low-rank properties, and propose a tensor decomposition-based INR model to represent the nonlocal patch tensor, encoding both the nonlocal low-rankness and the spatial-temporal correlations of the background through continuous neural representations. We develop an alternating direction method of multipliers (ADMM) algorithm for the nonlocal INR model, with a theoretical fixed-point convergence guarantee. Experimental results show that our approach robustly separates dim targets from complex infrared backgrounds, outperforming state-of-the-art methods in detection accuracy and robustness.
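The nonlocal similarity step can be illustrated with a short NumPy sketch that groups the patches most similar to a reference patch and checks that the stacked group is close to low rank. This is a simplified sketch under stated assumptions, not the paper's procedure: the helper names `extract_patches` and `group_similar`, the squared Euclidean similarity measure, the patch size, the stride, the group size `k`, and the synthetic gradient background are all hypothetical choices made for illustration.

```python
import numpy as np

def extract_patches(frame, size, stride):
    """Return sliding-window patches of a 2-D frame and their top-left positions."""
    H, W = frame.shape
    pos = [(r, c)
           for r in range(0, H - size + 1, stride)
           for c in range(0, W - size + 1, stride)]
    patches = np.stack([frame[r:r + size, c:c + size] for r, c in pos])
    return patches, pos

def group_similar(patches, ref_idx, k):
    """Indices of the k patches closest (squared Euclidean distance) to the reference."""
    flat = patches.reshape(len(patches), -1)
    dist = np.sum((flat - flat[ref_idx]) ** 2, axis=1)
    return np.argsort(dist, kind="stable")[:k]

rng = np.random.default_rng(0)
# Synthetic background (assumption): horizontal intensity gradient plus mild noise.
frame = np.tile(np.linspace(0.0, 1.0, 32), (32, 1)) + 0.01 * rng.standard_normal((32, 32))

patches, pos = extract_patches(frame, size=8, stride=4)
idx = group_similar(patches, ref_idx=0, k=6)
group = patches[idx]                       # nonlocal patch tensor of shape (k, 8, 8)

# Fraction of energy in the leading singular value of the unfolded group;
# a value near 1 indicates the group is approximately rank one.
s = np.linalg.svd(group.reshape(len(idx), -1), compute_uv=False)
energy = s[0] ** 2 / np.sum(s ** 2)
print(group.shape, energy)
```

Grouping similar patches concentrates the background energy in a few singular components, which is the property a low-rank background model relies on, while a small target occupying a few pixels does not fit that structure.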
