Detecting Omissions in Geographic Maps through Computer Vision
Phuc D. A. Nguyen, Anh Do, Minh Hoai
TL;DR
This work tackles the problem of detecting omissions in geographic maps, specifically identifying Vietnam-focused maps that exclude the Hoang Sa and Truong Sa islands. It proposes a four-stage computer vision pipeline (map classification, text detection, text recognition, and vocabulary matching) built on pre-trained architectures (EfficientNet-B4, DBNet, VietOCR) and fine-tuned with domain-specific data, including the VinText and VinMap box annotations. The authors introduce the VinMap dataset (6,858 images) to train and evaluate the approach. The pipeline achieves a final F1-score of 85.51% on the task of identifying Vietnam maps that exclude both islands, and module ablations show the contribution of pretraining, targeted text detection, and strict vocabulary matching. The work provides a practical, scalable framework for automated map analysis with potential applications in archival research, geopolitical studies, and real-time monitoring, while acknowledging challenges in handling diacritics and varied map styles.
Abstract
This paper explores the application of computer vision technologies to the analysis of maps, an area with substantial historical, cultural, and political significance. Our focus is on developing and evaluating a method for automatically identifying maps that depict specific regions and feature landmarks with designated names, a task made challenging by the diverse styles and methods used in map creation. We address three main subtasks: differentiating maps from non-maps, verifying the region depicted, and confirming the presence or absence of particular landmark names. Our approach combines a Convolutional Neural Network, transfer learning, and text recognition techniques to address these subtasks. We also introduce the VinMap dataset, containing annotated map images of Vietnam, to train and test our method. Experiments on this dataset demonstrate that our technique achieves an F1-score of 85.51% for identifying maps that exclude specific territorial landmarks. This result suggests practical utility and indicates areas for future improvement.
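
The last subtask, confirming the presence or absence of landmark names, can be illustrated with a minimal Python sketch that operates on strings already produced by a text recognizer. This is not the authors' implementation. The vocabulary entries, the function names, and the reading of "strict" matching as exact equality after Unicode normalization are all assumptions made for illustration; the `keep_diacritics` switch shows why diacritics matter for Vietnamese text.

```python
import unicodedata

# Hypothetical vocabulary: the two island groups are named in the text,
# but the exact strings used for matching are an assumption.
ISLAND_VOCAB = {
    "hoang_sa": {"hoàng sa", "quần đảo hoàng sa"},
    "truong_sa": {"trường sa", "quần đảo trường sa"},
}


def normalize(text, keep_diacritics=True):
    """Compose to NFC, collapse whitespace, and casefold.

    With keep_diacritics=False, tone and vowel marks are also removed.
    """
    text = unicodedata.normalize("NFC", text)
    text = " ".join(text.split()).casefold()
    if not keep_diacritics:
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
        # U+0111 (đ) has no canonical decomposition, so map it by hand.
        text = text.replace("đ", "d")
    return text


def islands_found(recognized_texts, keep_diacritics=True):
    """Return the island keys whose names exactly equal a recognized string."""
    seen = {normalize(t, keep_diacritics) for t in recognized_texts}
    found = set()
    for island, names in ISLAND_VOCAB.items():
        vocab = {normalize(n, keep_diacritics) for n in names}
        if seen & vocab:
            found.add(island)
    return found


def excludes_both_islands(recognized_texts, keep_diacritics=True):
    """True when neither island group is named among the recognized strings."""
    return not islands_found(recognized_texts, keep_diacritics)
```

With diacritics kept, a recognizer output of "Truong Sa" does not match "Trường Sa", so recognition errors on diacritics can cause a map to be flagged as omitting an island. Stripping diacritics tolerates such errors at the cost of a looser match.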
