DOGE: Differentiable Bezier Graph Optimization for Road Network Extraction

Jiahui Sun, Junran Lu, Jinhui Yin, Yishuo Xu, Yuanqi Li, Yanwen Guo

TL;DR

This work tackles automatic road-network extraction from aerial imagery, where polylines fail to capture curvilinear road geometry. It introduces DOGE, a GT-free framework that represents roads as a differentiable Bézier Graph and optimizes geometry via differentiable rendering (DiffAlign) while refining topology with discrete operators (TopoAdapt). The method hinges on a parametrized cubic Bézier edge with endpoints anchored to nodes and intermediate control points governed by α and d, and it optimizes a composite loss combining data fidelity and geometric priors. DOGE achieves state-of-the-art results on SpaceNet and CityScale, producing accurate, smooth, and compact road graphs, and demonstrates the potential of GT-free vector reconstruction tasks in remote sensing.

Abstract

Automatic extraction of road networks from aerial imagery is a fundamental task, yet prevailing methods rely on polylines that struggle to model curvilinear geometry. We maintain that road geometry is inherently curve-based and introduce the Bézier Graph, a differentiable parametric curve-based representation. The primary obstacle to adopting this representation is obtaining vector ground truth (GT), which is difficult to construct. We sidestep this bottleneck by reframing the task as a global optimization problem over the Bézier Graph. Our framework, DOGE, operationalizes this paradigm by learning a parametric Bézier Graph directly from segmentation masks, eliminating the need for curve GT. DOGE holistically optimizes the graph by alternating between two complementary modules: DiffAlign continuously optimizes geometry via differentiable rendering, while TopoAdapt uses discrete operators to refine its topology. Our method sets a new state of the art on the large-scale SpaceNet and CityScale benchmarks, presenting a new paradigm for generating high-fidelity vector maps of road networks. We will release our code and related data.

Paper Structure

This paper contains 20 sections, 10 equations, 7 figures, 3 tables, 1 algorithm.

Figures (7)

  • Figure 1: Polyline versus curve-based road representations. (a) A polyline approximates a curve with discrete line segments. (b) A parametric Bézier curve representation: a road segment is shown with the four control points (red) that define its geometry.
  • Figure 2: Parametric definition of a Bézier Graph edge $e_k$. The edge's geometry is defined by a cubic Bézier curve with four control points, $\{\boldsymbol{P}_{k,r}\}_{r=0}^3$, and an optimizable width, $w_k$. The endpoints $\boldsymbol{P}_{k,0}$ and $\boldsymbol{P}_{k,3}$ are anchored to the node positions, while the intermediate points $\boldsymbol{P}_{k,1}$ and $\boldsymbol{P}_{k,2}$ control the curvature.
  • Figure 3: Overview of the DOGE framework. Given a satellite image, a fine-tuned SAM2 provides a target road segmentation $\mathcal{S}$. DOGE reconstructs the road network by iteratively optimizing a Bézier Graph $\mathcal{G}$ (\ref{subsec:bezier_graph}). The optimization loop alternates between two complementary modules: DiffAlign, which continuously refines the graph's geometry by aligning a differentiable rendering of the graph with $\mathcal{S}$ (\ref{subsec:diffalign}), and TopoAdapt, which discretely evolves the graph's topology (\ref{subsec:topoadapt}).
  • Figure 4: Optimization dynamics of the Bézier Graph. This figure illustrates the interplay between DiffAlign and TopoAdapt. Key operations are highlighted: graph initialization (iter 0); geometric optimization towards the target (iter 10); overlap separation driven by $\mathcal{L}_{\text{overlap}}$ (iter 20); road addition (iter 30); node merging (iter 40); T-junction creation (iter 50); and collinear edge merging (iter 60).
  • Figure 5: Qualitative comparison on SpaceNet (top two rows) and CityScale (bottom two rows). Our method produces geometrically precise, smooth, and topologically correct road graphs, outperforming prior methods across different scales. Notably, our approach uses a more compact graph representation with fewer nodes.
  • ...and 2 more figures
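
The edge geometry described in the Figure 2 caption (a cubic Bézier curve whose endpoints are anchored to two graph nodes and whose two intermediate control points shape the curvature) can be illustrated with a small sketch. This is not the authors' implementation: the function names `cubic_bezier`, `sample_edge`, and `polyline_length` are hypothetical, the code uses the standard Bernstein form of a cubic Bézier curve, and it ignores the optimizable width $w_k$ as well as the α and d parametrization of the intermediate control points, whose exact definitions are not given in this excerpt.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate a cubic Bezier curve at parameter t in [0, 1] (Bernstein form)."""
    s = 1.0 - t
    b0 = s * s * s          # weight of the start point P0
    b1 = 3.0 * s * s * t    # weight of the first intermediate control point P1
    b2 = 3.0 * s * t * t    # weight of the second intermediate control point P2
    b3 = t * t * t          # weight of the end point P3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def sample_edge(node_a: Point, node_b: Point,
                ctrl_1: Point, ctrl_2: Point, n: int = 16) -> List[Point]:
    """Sample n points along an edge whose endpoints are anchored to two nodes.

    node_a and node_b play the role of P0 and P3; ctrl_1 and ctrl_2 play the
    role of the intermediate control points P1 and P2 (hypothetical names).
    """
    if n < 2:
        raise ValueError("n must be at least 2 to include both endpoints")
    return [cubic_bezier(node_a, ctrl_1, ctrl_2, node_b, i / (n - 1))
            for i in range(n)]


def polyline_length(points: List[Point]) -> float:
    """Approximate the curve length by summing distances between samples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


if __name__ == "__main__":
    # A curved road segment between two nodes; the control points are made up.
    samples = sample_edge((0.0, 0.0), (10.0, 0.0), (3.0, 4.0), (7.0, 4.0))
    print(samples[0], samples[-1], round(polyline_length(samples), 3))
```

Because the first and last samples coincide with the node positions, moving a node moves every incident edge with it, which is one plausible reading of why the caption describes the endpoints as anchored. Every sampled coordinate is also a polynomial in the control points, which is the property a differentiable renderer would rely on.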