Detecting Omissions in Geographic Maps through Computer Vision

Phuc D. A. Nguyen, Anh Do, Minh Hoai

TL;DR

This work tackles the problem of detecting omissions in geographic maps, specifically identifying maps of Vietnam that exclude the Hoang Sa and Truong Sa islands. It proposes a four-stage computer vision pipeline (map classification, text detection, text recognition, and vocabulary matching) built on pre-trained architectures (EfficientNet-B4, DBNet, VietOCR) and fine-tuned with domain-specific data, including the VinText and VinMap box annotations. The authors introduce the VinMap dataset (6,858 images) to train and evaluate the approach, which reaches a final F1-score of 85.51% on the task of identifying Vietnam maps that exclude both islands. Module ablations show the contribution of pretraining, targeted text detection, and strict vocabulary matching. The resulting framework for automated map analysis has potential applications in archival research, geopolitical studies, and real-time monitoring; handling diacritics and varied map styles remains a challenge.

Abstract

This paper explores the application of computer vision technologies to the analysis of maps, an area with substantial historical, cultural, and political significance. Our focus is on developing and evaluating a method for automatically identifying maps that depict specific regions and feature landmarks with designated names, a task that involves complex challenges due to the diverse styles and methods used in map creation. We address three main subtasks: differentiating maps from non-maps, verifying the accuracy of the region depicted, and confirming the presence or absence of particular landmark names through advanced text recognition techniques. Our approach uses a Convolutional Neural Network and transfer learning to differentiate maps from non-maps, verify the accuracy of depicted regions, and confirm landmark names through advanced text recognition. We also introduce the VinMap dataset, containing annotated map images of Vietnam, to train and test our method. Experiments on this dataset demonstrate that our technique achieves an F1-score of 85.51% for identifying maps excluding specific territorial landmarks. This result suggests practical utility and indicates areas for future improvement.

Paper Structure (13 sections, 4 figures, 4 tables)

This paper contains 13 sections, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Our proposed model, pre-trained on the high-quality VinMap dataset, recognizes maps of Vietnam and determines whether they include the Hoang Sa or Truong Sa regions, given input map images at multiple resolutions.
  • Figure 2: The VinMap dataset comprises high-quality images in both English and Vietnamese. Tailored specifically for Vietnam, the Vietnam map set encompasses maps depicting various contexts of Vietnam, whereas the Not Vietnam map set comprises map images from diverse countries and regions.
  • Figure 3: Annotation visualization of the Vietnam map image and our provided box annotation for regions of interest.
  • Figure 4: The proposed pipeline for VinMap comprises four stages. First, the Map Classification module determines whether the input image is a Vietnam map. If it is, the Map Text Detection module locates all text regions within the map. Next, the Map Text Recognition module reads each detected region with an OCR model and predicts its text. Finally, the Map Vocab Matching module compares the predicted texts against a predefined policy using Levenshtein distance to determine whether the map contains the key regions.
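
The four stages in the Figure 4 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The vocabulary in `TARGET_VOCAB`, the diacritic-stripping `normalize` step, the word-window comparison, the `max_distance` threshold, and the three callables passed to `analyze_map` (stand-ins for the classifier, text detector, and OCR model) are all assumptions made for illustration; the summary does not specify the actual matching policy.

```python
import unicodedata

# Hypothetical vocabulary: region -> accepted spellings (already normalized).
# The paper's actual policy list is not given in this summary.
TARGET_VOCAB = {
    "Hoang Sa": ["hoang sa", "paracel"],
    "Truong Sa": ["truong sa", "spratly"],
}


def normalize(text):
    """Lowercase, strip combining diacritics, collapse whitespace (an assumption)."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute cost 1)."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def matches(text, variant, max_distance):
    """Compare each k-word window of the text with a k-word vocabulary entry."""
    words = normalize(text).split()
    k = len(variant.split())
    windows = [" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))]
    return any(levenshtein(w, variant) <= max_distance for w in windows)


def regions_present(recognized_texts, max_distance=1):
    """Stage 4: report which target regions appear among the recognized texts."""
    found = {region: False for region in TARGET_VOCAB}
    for text in recognized_texts:
        for region, variants in TARGET_VOCAB.items():
            if any(matches(text, v, max_distance) for v in variants):
                found[region] = True
    return found


def analyze_map(image, is_vietnam_map, detect_text, recognize_text, max_distance=1):
    """Run the four stages; the three callables are hypothetical stand-ins."""
    if not is_vietnam_map(image):          # stage 1: map classification
        return None
    boxes = detect_text(image)             # stage 2: text detection
    texts = [recognize_text(image, box) for box in boxes]  # stage 3: recognition
    return regions_present(texts, max_distance)            # stage 4: matching
```

Setting `max_distance=0` makes the comparison exact after normalization, which is one way to read the "strict vocabulary matching" mentioned in the TL;DR; a larger value tolerates more OCR errors at the cost of more false matches.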