SOMA: Feature Gradient Enhanced Affine-Flow Matching for SAR-Optical Registration
Haodong Wang, Tao Zhuo, Xiuwei Zhang, Hanlin Yin, Wencong Wu, Yanning Zhang
TL;DR
This work targets pixel-level registration between SAR and optical imagery, a challenging cross-modal task because the two sensors rely on fundamentally different imaging mechanisms. It introduces SOMA, a dense registration framework built on a frozen DINOv2 backbone that pairs a Feature Gradient Enhancer (FGE), which injects gradient-informed structure into deep features, with a Global-Local Affine-Flow Matcher (GLAM), which aligns images coarse-to-fine. The key contributions are the gradient-based feature enhancement, the coupled affine-flow matcher, and extensive ablations, with CMR@1px gains of 12.29% on SEN1-2 and 18.50% on GFGE_SO, along with solid generalization and efficient runtime. The approach enables robust, precise SAR-Optical registration across varying scenes and resolutions, which makes it suitable for multi-source data fusion.
Abstract
Achieving pixel-level registration between SAR and optical images remains challenging because of their fundamentally different imaging mechanisms and visual characteristics. Although deep learning has achieved great success in many cross-modal tasks, its performance on SAR-Optical registration is still unsatisfactory. Gradient-based information has traditionally played a crucial role in handcrafted descriptors by highlighting structural differences. However, such gradient cues have not been effectively leveraged in deep learning frameworks for SAR-Optical image matching. To address this gap, we propose SOMA, a dense registration framework that integrates structural gradient priors into deep features and refines alignment through a hybrid matching strategy. Specifically, we introduce the Feature Gradient Enhancer (FGE), which embeds multi-scale, multi-directional gradient filters into the feature space using attention and reconstruction mechanisms to boost feature distinctiveness. Furthermore, we propose the Global-Local Affine-Flow Matcher (GLAM), which combines affine transformation and flow-based refinement within a coarse-to-fine architecture to ensure both structural consistency and local accuracy. Experimental results demonstrate that SOMA significantly improves registration precision, increasing CMR@1px by 12.29% on the SEN1-2 dataset and by 18.50% on the GFGE_SO dataset. In addition, SOMA exhibits strong robustness and generalizes well across diverse scenes and resolutions.
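As a rough illustration of the coarse-to-fine idea behind GLAM, the following minimal sketch composes a global affine mapping with a per-pixel residual flow and then scores the result with a thresholded match rate. It is not the authors' implementation: the function names (`affine_grid`, `compose_affine_flow`, `match_rate`), the 2x3 affine parameterization, and the reading of CMR@1px as the fraction of points whose error is at most 1 pixel are all assumptions made for illustration.

```python
import numpy as np


def affine_grid(A, h, w):
    """Map every pixel (x, y) of an h x w image through a 2x3 affine matrix A.

    Returns an (h, w, 2) array of target coordinates in (x, y) order.
    """
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # homogeneous (h, w, 3)
    return pts @ np.asarray(A, dtype=np.float64).T        # (h, w, 2)


def compose_affine_flow(A, flow):
    """Coarse global affine mapping plus a fine per-pixel residual flow.

    flow has shape (h, w, 2) and holds (dx, dy) offsets added on top of
    the affine prediction. This additive composition is an assumption.
    """
    h, w = flow.shape[:2]
    return affine_grid(A, h, w) + flow


def match_rate(pred, gt, threshold=1.0):
    """Fraction of points whose Euclidean error is within `threshold` pixels.

    Assumed stand-in for CMR@1px when threshold=1.0; the paper's exact
    definition may differ.
    """
    err = np.linalg.norm(pred - gt, axis=-1)
    return float(np.mean(err <= threshold))


if __name__ == "__main__":
    h, w = 4, 5
    A = [[1.0, 0.0, 2.0],    # pure translation by (+2, -1), hypothetical values
         [0.0, 1.0, -1.0]]
    flow = np.zeros((h, w, 2))
    coords = compose_affine_flow(A, flow)

    pred = coords.copy()
    pred[:2] += 3.0          # corrupt the top half of the field by 3 px per axis
    print(match_rate(pred, coords, threshold=1.0))
```

In this toy setup the affine term carries the global structure while the flow term is free to correct local deviations, which mirrors the division of labor the abstract describes between structural consistency and local accuracy.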
