LRSAA: Large-scale Remote Sensing Image Target Recognition and Automatic Annotation

Wuzheng Dong, Yujuan Zhu, Sheng Zhang

TL;DR

LRSAA addresses target recognition and automatic annotation in large-scale remote sensing images by combining an ensemble of two detectors (YOLOv11 and MobileNetV3-SSD) with Poisson disk sampling segmentation and non-maximum suppression (NMS) based on Enhanced IoU (EIoU), with the aim of improving both accuracy and efficiency. The method partitions large images via Poisson disk sampling, trains the detectors on the resulting smaller segments, then maps detections back to the original image and augments the training data with synthetically generated samples; the process includes EIoU-driven bounding box refinement. Evaluations on XView and on the Tianjin, Xiamen, and Shanghai datasets show that LRSAA outperforms standard detectors; the gains are larger with 640×640 or 320×320 crops, and synthetic data augmentation further improves accuracy and mAP. The work provides a practical pipeline for scalable remote sensing analysis and automatic annotation, with publicly available code and potential applications in GIS and disaster monitoring.

Abstract

This paper presents a method for object recognition and automatic labeling in large-area remote sensing images called LRSAA. The method integrates the YOLOv11 and MobileNetV3-SSD object detection algorithms through ensemble learning to enhance model performance. Furthermore, it employs Poisson disk sampling segmentation techniques and the EIoU metric to optimize the training and inference processes on segmented images, followed by the integration of results. This approach not only reduces the demand for computational resources but also achieves a good balance between accuracy and speed. The source code for this project has been made publicly available at https://github.com/anaerovane/LRSAA.
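
To make the role of an EIoU-based overlap score in merging detections more concrete, here is a minimal sketch of greedy NMS driven by EIoU. It assumes the commonly published EIoU formulation (IoU minus penalties for center distance, width difference, and height difference, each normalized by the smallest enclosing box); the exact definition used in the paper is not reproduced in this section. The function names `eiou` and `nms_eiou`, the `(box, score)` tuple layout, and the 0.5 threshold are illustrative assumptions, not part of the LRSAA code.

```python
def eiou(a, b):
    """EIoU-style overlap score for two boxes given as (x0, y0, x1, y1).

    Assumed formulation: IoU minus center-distance, width, and height
    penalties normalized by the smallest enclosing box.
    """
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b

    # Plain IoU.
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    iou = inter / union if union > 0 else 0.0

    # Width and height of the smallest box enclosing both inputs.
    cw = max(ax1, bx1) - min(ax0, bx0)
    ch = max(ay1, by1) - min(ay0, by0)
    if cw <= 0 or ch <= 0:
        return iou

    # Penalties: squared center distance, width gap, height gap.
    dx = ((ax0 + ax1) - (bx0 + bx1)) / 2.0
    dy = ((ay0 + ay1) - (by0 + by1)) / 2.0
    center_pen = (dx * dx + dy * dy) / (cw * cw + ch * ch)
    width_pen = ((ax1 - ax0) - (bx1 - bx0)) ** 2 / (cw * cw)
    height_pen = ((ay1 - ay0) - (by1 - by0)) ** 2 / (ch * ch)
    return iou - center_pen - width_pen - height_pen


def nms_eiou(detections, threshold=0.5):
    """Greedy NMS over (box, score) pairs, using EIoU as the overlap test."""
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep a detection only if it does not overlap any kept box too much.
        if all(eiou(det[0], k[0]) <= threshold for k in kept):
            kept.append(det)
    return kept
```

In an ensemble setting, one plausible use is to pool the detections from both detectors into a single list and pass it through `nms_eiou`, so that boxes describing the same object collapse to the highest-scoring one.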

Paper Structure

This paper contains 15 sections, 8 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Practical application of the LRSAA model: detailed annotation results on Tianjin urban remote sensing images with 0.6 m precision.
  • Figure 2: Method overview. The methodology comprises four steps, A to D. In Step A, Poisson disk sampling and segmentation partition the large-scale image into smaller segments. In Step B, YOLOv11 and MobileNetV3-SSD models are trained on these smaller images to detect objects. In Step C, the detection results from the smaller images are mapped back onto the original large-scale image to maintain spatial consistency. Finally, in Step D, the original dataset is augmented with synthetic data generated through this process, so that the models can be retrained on an enriched dataset containing a proportional share of synthetic samples. This allows the models to keep improving and adapting to diverse scenarios.
  • Figure 3: Poisson disk sampling. The red points are the sampling points generated with the Poisson disk method, and the pink box indicates the area extracted based on these sampling points.
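
Steps A and C in the figure captions above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses naive dart-throwing to produce Poisson disk samples (rather than a faster grid-based algorithm), centers a square crop on each sample, and translates a box detected inside a crop back to full-image coordinates. The function names, the `min_dist` and `max_attempts` parameters, and the image size in the usage note are all hypothetical.

```python
import math
import random


def poisson_disk_points(width, height, min_dist, max_attempts=2000, seed=0):
    """Naive dart-throwing Poisson disk sampling.

    A random candidate is accepted only if it lies at least `min_dist`
    from every accepted point. Sampling stops after `max_attempts`
    consecutive rejections.
    """
    rng = random.Random(seed)
    points = []
    misses = 0
    while misses < max_attempts:
        candidate = (rng.uniform(0, width), rng.uniform(0, height))
        if all(math.dist(candidate, p) >= min_dist for p in points):
            points.append(candidate)
            misses = 0
        else:
            misses += 1
    return points


def crop_window(center, crop, width, height):
    """Return (x0, y0, x1, y1) for a crop-by-crop window around `center`.

    The window is shifted where needed so it stays inside the image;
    this assumes the image is at least `crop` pixels in each dimension.
    """
    cx, cy = center
    x0 = int(min(max(cx - crop / 2, 0), width - crop))
    y0 = int(min(max(cy - crop / 2, 0), height - crop))
    return (x0, y0, x0 + crop, y0 + crop)


def to_global(box, window):
    """Translate a box from crop coordinates to full-image coordinates."""
    x0, y0 = window[0], window[1]
    bx0, by0, bx1, by1 = box
    return (bx0 + x0, by0 + y0, bx1 + x0, by1 + y0)
```

For example, for a hypothetical 4000×3000 image, `poisson_disk_points(4000, 3000, 500)` yields well-spread centers, `crop_window(p, 640, 4000, 3000)` gives a 640×640 window per center, and `to_global` shifts each crop-level detection by that window's top-left corner. Because neighboring windows can overlap, the same object may be detected more than once after mapping, which is where a suppression step over the merged boxes becomes necessary.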