CME Dataset: Advancing China Map Detection and Standardization with Digital Image Resources
Yan Xu, Zhenqiang Zhang, Zhiwei Zhou, Liting Geng, Yue Li, Jintao Li
TL;DR
This work tackles the lack of public datasets focusing on problematic maps in China by introducing the CME dataset, which targets five key error types affecting boundary representations and map compliance. The authors assemble 1,455 high-resolution map images from diverse sources spanning 2015–present, apply standardized preprocessing and data augmentation, and annotate the data in both YOLO and COCO formats to support multiple detection pipelines. They conduct thorough evaluation of annotation quality and broad target-detection experiments across six modern detectors, demonstrating the dataset’s effectiveness for boundary localization and small-target detection in complex map images. The dataset is openly released for academic use, with plans to expand map types and regions, aiming to advance automated map auditing, GIS data processing, and broader research on problematic-map detection technologies.
Abstract
Digital images of China's maps play a crucial role in map detection, particularly in safeguarding national sovereignty, territorial integrity, and map compliance. However, there is currently no publicly available dataset specifically dedicated to problematic maps. Existing datasets primarily focus on general map data and are insufficient for effectively identifying complex issues such as national boundary misrepresentations, missing elements, and blurred boundaries. Therefore, this study creates a problematic-map dataset, the CME dataset, that covers five key problem areas. It aims to provide diverse samples for problematic-map detection technologies, support high-precision map compliance detection, and enhance map data quality and timeliness. The dataset not only provides essential resources for map compliance, national security monitoring, and map updates, but also fosters innovation in, and application of, related technologies.
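The TL;DR notes that annotations are released in both YOLO and COCO formats. As a minimal sketch of how the two bounding-box conventions relate, the snippet below converts one YOLO label line (class id followed by a normalized center x, center y, width, and height) into a COCO-style box (`[x_min, y_min, width, height]` in pixels). The function names, the example class id, and the image size are illustrative assumptions and are not taken from the dataset's own tooling.

```python
def parse_yolo_line(line):
    """Split one YOLO label line into (class_id, (cx, cy, w, h)).

    Assumes the common YOLO text layout: "class cx cy w h", with the four
    box values normalized to the range [0, 1].
    """
    parts = line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(p) for p in parts[1:5])
    return class_id, (cx, cy, w, h)


def yolo_to_coco_bbox(cx, cy, w, h, img_w, img_h):
    """Convert a normalized YOLO box to a COCO box in pixel units.

    YOLO stores the box center and size relative to the image dimensions;
    COCO stores the top-left corner and size in absolute pixels.
    """
    box_w = w * img_w
    box_h = h * img_h
    x_min = cx * img_w - box_w / 2
    y_min = cy * img_h - box_h / 2
    return [x_min, y_min, box_w, box_h]


# Hypothetical label line and image size, for illustration only.
class_id, box = parse_yolo_line("2 0.5 0.5 0.25 0.5")
coco_box = yolo_to_coco_bbox(*box, img_w=1000, img_h=800)
print(class_id, coco_box)
```

Keeping both formats in sync this way lets the same annotations feed YOLO-family detectors and COCO-based pipelines without manual relabeling.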
