3MOS: Multi-sources, Multi-resolutions, and Multi-scenes dataset for Optical-SAR image matching

Yibin Ye, Xichao Teng, Shuo Chen, Yijie Bian, Tao Tan, Zhang Li

TL;DR

3MOS tackles the lack of diverse, large-scale Optical-SAR matching data by compiling 155K image pairs from six SAR sensors across varied resolutions (1.25–12.5 m) and eight ground scenes. It introduces a baseline multi-scale feature network (MFN) and conducts extensive cross-source, cross-resolution, and cross-scene template-matching evaluations to reveal that no single method consistently dominates, highlighting a significant domain adaptation challenge. The dataset construction pipeline includes careful data collection, manual registration, scene classification, and rigorous quality control, with data split across satellites and scenes to stress-test generalization. The work underscores the practical importance for multimodal fusion and visual navigation, and provides public data and code to spur development of robust, domain-agnostic optical-SAR matching approaches. A key implication is the need for domain adaptation strategies to generalize across sensors, resolutions, and scene types in real-world remote sensing tasks.

Abstract

Optical-SAR image matching is a fundamental task for image fusion and visual navigation. However, all large-scale open SAR datasets for method development are collected from a single platform, resulting in limited satellite types and spatial resolutions. Since images captured by different sensors vary significantly in both geometric and radiometric appearance, existing methods may fail to match corresponding regions containing the same content. Moreover, most existing datasets have not been categorized based on the characteristics of different scenes. To encourage the design of more general multi-modal image matching methods, we introduce a large-scale Multi-sources, Multi-resolutions, and Multi-scenes dataset for Optical-SAR image matching (3MOS). It consists of 155K optical-SAR image pairs, including SAR data from six commercial satellites, with resolutions ranging from 1.25 m to 12.5 m. The data has been classified into eight scenes: urban, rural, plains, hills, mountains, water, desert, and frozen earth. Extensive experiments show that none of the state-of-the-art methods achieves consistently superior performance across different sources, resolutions, and scenes. In addition, the distribution of data has a substantial impact on the matching capability of deep learning models, which poses a domain adaptation challenge in optical-SAR image matching. Our data and code will be available at: https://github.com/3M-OS/3MOS.

Paper Structure (19 sections, 16 figures, 6 tables)

This paper contains 19 sections, 16 figures, 6 tables.

Figures (16)

  • Figure 1: Overview of 3MOS dataset. 3MOS contains 155K optical-SAR image pairs, including SAR data from 6 commercial satellites. The data is registered and classified into 8 scenes including urban, rural, plains, hills, mountains, water, desert, and frozen earth.
  • Figure 2: Workflow of 3MOS dataset construction procedure.
  • Figure 3: Selecting control points for image registration and manually inspecting the registration error.
  • Figure 4: The flowchart of image scene classification.
  • Figure 5: Examples of useless images that have been deleted.
  • ...and 11 more figures