Detecting Legend Items on Historical Maps Using GPT-4o with In-Context Learning
Sofia Kirsanova, Yao-Yi Chiang, Weiwei Duan
TL;DR
This paper tackles automatic extraction of historical map legends, enabling structured, searchable metadata from heterogeneous layouts. It combines a layout-aware legend cropping stage with GPT-4o in-context learning, using structured JSON prompts to output bounding boxes that link legend symbols to their descriptions without additional training. On 40 USGS maps, the method achieves about 88% F1 and 85% IoU, with performance improving as more in-context examples are provided and peaking at 15 examples. The approach facilitates scalable, layout-aware geospatial search and indexing, though it struggles with densely packed multi-column legends, which points to further work on prompt design and layout handling.
Abstract
Historical map legends are critical for interpreting cartographic symbols, but their inconsistent layouts and unstructured formats make automatic extraction challenging. Prior work focuses primarily on segmentation or general optical character recognition (OCR), with few methods effectively matching legend symbols to their corresponding descriptions in a structured manner. We present a method that combines LayoutLMv3 for layout detection with GPT-4o using in-context learning to detect and link legend items and their descriptions via bounding box predictions. Our experiments show that GPT-4o with structured JSON prompts outperforms the baseline, achieving 88% F1 and 85% IoU, and reveal how prompt design, example counts, and layout alignment affect performance. This approach supports scalable, layout-aware legend parsing and improves the indexing and searchability of historical maps across various visual styles.
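The two reported metrics can be illustrated with a minimal sketch of bounding-box evaluation. This is not the authors' evaluation code. The box format `(x_min, y_min, x_max, y_max)`, the greedy one-to-one matching, and the 0.5 IoU threshold for counting a prediction as correct are all assumptions made for illustration, and the function names `iou` and `f1_at_threshold` are hypothetical.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_at_threshold(preds: List[Box], truths: List[Box],
                    threshold: float = 0.5) -> float:
    """F1 score, counting a prediction as a true positive when it overlaps
    a not-yet-matched ground-truth box with IoU >= threshold.
    The threshold value is an assumption, not taken from the paper."""
    unmatched = list(truths)
    true_pos = 0
    for pred in preds:
        # Greedy matching: pick the remaining ground-truth box with highest IoU.
        best = max(unmatched, key=lambda t: iou(pred, t), default=None)
        if best is not None and iou(pred, best) >= threshold:
            true_pos += 1
            unmatched.remove(best)
    false_pos = len(preds) - true_pos
    false_neg = len(truths) - true_pos
    denom = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denom if denom else 0.0


if __name__ == "__main__":
    predicted = [(0, 0, 10, 10), (50, 50, 60, 60)]
    ground_truth = [(0, 0, 10, 10), (100, 100, 110, 110)]
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(f1_at_threshold(predicted, ground_truth))
```

In this sketch, one of the two predicted boxes matches a ground-truth box exactly and the other matches nothing, so precision and recall are both one half. A stricter or looser threshold would change which predictions count as correct, which is why the assumed value should be replaced with whatever the evaluation protocol specifies.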
