ACPV-Net: All-Class Polygonal Vectorization for Seamless Vector Map Generation from Aerial Imagery

Weiqin Jiao, Hao Cheng, George Vosselman, Claudio Persello

Abstract

We tackle the problem of generating a complete vector map representation from aerial imagery in a single run: producing polygons for all land-cover classes with shared boundaries and without gaps or overlaps. Existing polygonization methods are typically class-specific; extending them to multiple classes via per-class runs commonly leads to topological inconsistencies, such as duplicated edges, gaps, and overlaps. We formalize this new task as All-Class Polygonal Vectorization (ACPV) and release the first public benchmark, Deventer-512, with standardized metrics jointly evaluating semantic fidelity, geometric accuracy, vertex efficiency, per-class topological fidelity and global topological consistency. To realize ACPV, we propose ACPV-Net, a unified framework introducing a novel Semantically Supervised Conditioning (SSC) mechanism coupling semantic perception with geometric primitive generation, along with a topological reconstruction that enforces shared-edge consistency by design. While enforcing such strict topological constraints, ACPV-Net surpasses all class-specific baselines in polygon quality across classes on Deventer-512. It also applies to single-class polygonal vectorization without any architectural modification, achieving the best-reported results on WHU-Building. Data, code, and models will be released at: https://github.com/HeinzJiao/ACPV-Net.

Paper Structure

This paper contains 15 sections, 1 theorem, 3 equations, 4 figures, 7 tables.

Key Result

Proposition 1

Let $G=(V,E)$ be a planar straight-line graph embedded in $\mathbb{R}^2$. Suppose that: Then the polygonal partition obtained by tracing all face boundaries on $G$ and assigning face labels according to $\hat{M}$ satisfies all ACPV constraints (a)–(f). A complete proof is provided in the supplementary material.
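
The face-tracing step that Proposition 1 refers to can be illustrated with a short sketch. This is not the authors' implementation: it assumes a connected planar straight-line graph given as a vertex list plus index pairs, the names `trace_faces` and `signed_area` are hypothetical, and the assignment of face labels from $\hat{M}$ is left out.

```python
import math
from collections import defaultdict


def trace_faces(vertices, edges):
    """Trace all faces of a connected planar straight-line graph.

    vertices: list of (x, y) tuples; edges: list of (i, j) index pairs.
    Returns a list of faces, each a list of vertex indices.
    """
    # Neighbours of every vertex, sorted counter-clockwise by angle.
    nbrs = defaultdict(list)
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    for v, ring in nbrs.items():
        vx, vy = vertices[v]
        ring.sort(key=lambda n: math.atan2(vertices[n][1] - vy,
                                           vertices[n][0] - vx))

    visited = set()  # directed half-edges already assigned to a face
    faces = []
    for u in nbrs:
        for v in nbrs[u]:
            if (u, v) in visited:
                continue
            face = []
            a, b = u, v
            while (a, b) not in visited:
                visited.add((a, b))
                face.append(a)
                # At b, continue with the neighbour that is next clockwise
                # from a; this keeps the traced face on the left-hand side.
                ring = nbrs[b]
                a, b = b, ring[ring.index(a) - 1]
            faces.append(face)
    return faces


def signed_area(vertices, face):
    """Shoelace formula: positive for counter-clockwise faces."""
    total = 0.0
    for a, b in zip(face, face[1:] + face[:1]):
        (x1, y1), (x2, y2) = vertices[a], vertices[b]
        total += x1 * y2 - x2 * y1
    return total / 2.0


# Unit square split by one diagonal: two bounded faces plus the outer face.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
segments = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
faces = trace_faces(square, segments)
bounded = [f for f in faces if signed_area(square, f) > 0]
```

In this sketch every directed half-edge is assigned to exactly one face, so two neighbouring polygons always reuse the same segment in opposite directions. Bounded faces come out counter-clockwise (positive area), and the single clockwise face is the unbounded exterior.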

Figures (4)

  • Figure 1: Overview of ACPV-Net. It unifies semantically supervised conditioning and proposition-driven topological reconstruction: the former produces coherent semantic–geometric evidence through diffusion-based vertex generation under semantic supervision, the latter deterministically reconstructs a topology-consistent vector basemap via overdense PSLG construction and vertex-guided subset selection.
  • Figure 2: Qualitative comparison on Deventer-512. The three rows show representative urban, suburban, and rural scenes, respectively. From left to right: aerial imagery, ground truth, TopDiG (Yang et al., 2023), HiSup (Xu et al., 2023), and Ours. Land-cover classes are color-coded; polygon outlines are drawn in black, vertices are highlighted with orange dots, and inter-class overlaps and gaps are marked in red and black, respectively.
  • Figure 3: Vertex activations under weak/ambiguous visual cues and along smooth boundaries (cartographic convention cases). From left to right: aerial image, pure discriminative decoding, without semantic supervision (No-SSC), ours, and ground truth.
  • Figure 4: Vertex peak-shape comparison between the pure discriminative baseline and our distributional reconstruction. Lower values of FWHM and Area@0.5 indicate sharper and more compact peaks, while higher Sharpness reflects stronger local contrast. The blue and red bars denote the median and the 90th percentile, respectively.
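
As a reference for the FWHM (full width at half maximum) measure named in Figure 4, the sketch below computes it on a 1D profile sampled through a single vertex peak. The measurement protocol used in the paper is not given here, so the profile sampling, the linear interpolation at the half-maximum crossings, and the function name `fwhm` are all assumptions. Area@0.5 and Sharpness are not covered.

```python
def fwhm(profile):
    """Full width at half maximum of a single-peaked 1D profile, in samples."""
    peak = max(profile)
    i_peak = profile.index(peak)
    half = peak / 2.0

    def crossing(step):
        # Walk away from the peak while the profile stays at or above half
        # maximum, then linearly interpolate the sub-sample crossing point.
        i = i_peak
        while 0 <= i + step < len(profile) and profile[i + step] >= half:
            i += step
        j = i + step
        if not 0 <= j < len(profile):
            return float(i)  # peak is cut off by the profile border
        frac = (profile[i] - half) / (profile[i] - profile[j])
        return i + step * frac

    return crossing(+1) - crossing(-1)


# Hypothetical activation profiles: a broad peak and a sharp peak.
broad = [0.0, 0.5, 1.0, 0.5, 0.0]
sharp = [0.0, 0.0, 1.0, 0.0, 0.0]
broad_width = fwhm(broad)
sharp_width = fwhm(sharp)
```

A narrower peak yields a smaller width, which matches the caption's reading that lower FWHM means a sharper vertex response.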

Theorems & Definitions (1)

  • Proposition 1: Sufficient condition for ACPV compliance