SceneEdited: A City-Scale Benchmark for 3D HD Map Updating via Image-Guided Change Detection

Chun-Jung Lin, Tat-Jun Chin, Sourav Garg, Feras Dayoub

TL;DR

SceneEdited addresses the problem of keeping city-scale HD maps current by formalizing image-guided 3D map updating through the MapUpdate operator, which fuses an outdated PCM $P_{out}$ with current observations $\mathcal{I}_{curr}$ to produce an updated map $P_{upd}$ that approximates the ground truth $P^*_{upd}$ (optionally using a change mask $C_{curr}$ from ChangeDetect). The dataset is city-scale and contains over 800 up-to-date scenes and ~2K outdated variants with more than 23K changed objects, along with aligned RGB images, LiDAR scans, dense change masks, and a scalable automatic editing toolkit for reproducible experiments. The paper defines robust evaluation metrics combining $D_C$, $D_H$, $D_{MH}$, and $D_{MP}$ to measure geometric update quality, and it analyzes image-based PCM updating via point addition and deletion under controlled conditions, using ground-truth change maps to isolate geometry and registration effects. By releasing both the dataset and toolkit publicly, it enables reproducible benchmarking of 3D map maintenance and highlights key challenges in integrating image-derived changes into a PCM while maintaining global geometric integrity. Overall, SceneEdited provides a practical, scalable foundation for advancing HD map maintenance with image-guided 3D updates in real urban environments.
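
To make the update-then-evaluate idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes that $D_C$ denotes a Chamfer distance and $D_H$ a Hausdorff distance (this summary does not define them), and it uses one common unsquared, two-sided Chamfer convention. The function names `map_update`, `chamfer_distance`, and `hausdorff_distance` are hypothetical, and the deletion mask and added points are supplied by hand rather than predicted from images.

```python
import numpy as np


def nearest_distances(src, dst):
    """Distance from each point in src to its nearest neighbour in dst (brute force)."""
    diff = src[:, None, :] - dst[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def chamfer_distance(p, q):
    """Assumed convention: sum of the two mean nearest-neighbour distances."""
    return nearest_distances(p, q).mean() + nearest_distances(q, p).mean()


def hausdorff_distance(p, q):
    """Symmetric Hausdorff distance: the worst nearest-neighbour distance in either direction."""
    return max(nearest_distances(p, q).max(), nearest_distances(q, p).max())


def map_update(p_out, delete_mask, p_add):
    """Hypothetical update step: drop points flagged as obsolete, then append new points."""
    return np.vstack([p_out[~delete_mask], p_add])


# Toy scene: the outdated map holds one obsolete point and misses one new point.
p_out = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
delete_mask = np.array([False, False, True])
p_add = np.array([[0.0, 1.0, 0.0]])
p_gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

p_upd = map_update(p_out, delete_mask, p_add)
print("Chamfer before/after:", chamfer_distance(p_out, p_gt), chamfer_distance(p_upd, p_gt))
print("Hausdorff before/after:", hausdorff_distance(p_out, p_gt), hausdorff_distance(p_upd, p_gt))
```

The brute-force nearest-neighbour search is quadratic in the number of points, so a real city-scale PCM would need a spatial index such as a k-d tree; the toy arrays here only illustrate how a single obsolete point dominates the Hausdorff value while the Chamfer value averages it out.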

Abstract

Accurate, up-to-date High-Definition (HD) maps are critical for urban planning, infrastructure monitoring, and autonomous navigation. However, these maps quickly become outdated as environments evolve, creating a need for robust methods that not only detect changes but also incorporate them into updated 3D representations. While change detection techniques have advanced significantly, there remains a clear gap between detecting changes and actually updating 3D maps, particularly when relying on 2D image-based change detection. To address this gap, we introduce SceneEdited, the first city-scale dataset explicitly designed to support research on HD map maintenance through 3D point cloud updating. SceneEdited contains over 800 up-to-date scenes covering 73 km of driving and approximately 3 $\text{km}^2$ of urban area, with more than 23,000 synthesized object changes created both manually and automatically across 2000+ out-of-date versions, simulating realistic urban modifications such as missing roadside infrastructure, buildings, overpasses, and utility poles. Each scene includes calibrated RGB images, LiDAR scans, and detailed change masks for training and evaluation. We also provide baseline methods using a foundational image-based structure-from-motion pipeline for updating outdated scenes, as well as a comprehensive toolkit supporting scalability, trackability, and portability for future dataset expansion and unification of out-of-date object annotations. Both the dataset and the toolkit are publicly available at https://github.com/ChadLin9596/ScenePoint-ETK, establishing a standardized benchmark for 3D map updating research.

Paper Structure

This paper contains 19 sections, 7 equations, 11 figures, 4 tables.

Figures (11)

  • Figure 1: SceneEdited task overview. RGB images show the current urban scene, whereas the stored 3-D HD map has been synthetically altered by adding and removing objects. Our benchmark challenges methods to detect these changes from the images and automatically update the 3-D map.
  • Figure 2: SceneEdited – Outdated and Updated Scene PCM Visualization: We compare the outdated PCM $P_{\text{out}}$ and the ground truth updated PCM $P^*_{\text{upd}}$ using both top-down and third-person views. The top-down view is rendered with orthographic projection, while the third-person view uses perspective projection. In the top-down view, point clouds are colored by altitude to highlight building structures such as walls. We also demonstrate the PCM's geometric accuracy by measuring wall thickness. In the third-person view, point clouds are rendered using LiDAR intensity, where road markings typically appear with higher intensity values. This makes lane markings clearly distinguishable in the visualization.
  • Figure 3: Overview of the processing pipeline from raw LiDAR scans to $P^*_{\text{upd}}$ and $P_{\text{out}}$. (a) Raw LiDAR points with dynamic cuboids (blue) and static cuboids (orange). (b) LiDAR points after removing dynamic objects. (c) Voxelized static points ($P^*_{\text{upd}}$). (d--f) Examples of outdated scenes ($P_{\text{out}}$): removing static objects, inserting new objects (green), and manually removing buildings, respectively. Note that the orientation of bounding boxes changes between (b) and (c) because we recompute their orientation using the principal eigenvector of the points inside each box. These bounding boxes are used solely for quantization and visualization purposes.
  • Figure 4: SceneEdited – Current Urban Scene and Depth Images: We visualize the current urban scene image $\mathcal{I}_c$ alongside the corresponding depth images generated from the current LiDAR $\mathcal{L}_c$, the outdated PCM $P_{\text{out}}$, and the ground truth updated PCM $P^*_{\text{upd}}$. To enhance the visibility of the sparse depth images, we apply color rendering within a 25-meter range. Bounding boxes are overlaid on the depth images of $P_{\text{out}}$ and $P^*_{\text{upd}}$ to indicate obsolete and missing objects with respect to $P_{\text{out}}$. The 2D pixel-level change map $C_{\text{curr}}$ is also shown on the image.
  • Figure 5: Overview of the point-addition pipeline based on an image-to-3D predictor, from querying a scene to producing $P_{\text{add}}$.
  • ...and 6 more figures
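
Two steps named in the Figure 3 caption, voxelizing the static points and recomputing a box orientation from the principal eigenvector of the enclosed points, can be sketched as below. This is an illustrative sketch under stated assumptions, not the toolkit's code: the helper names `voxel_downsample` and `principal_yaw` are hypothetical, each occupied voxel is reduced to the centroid of its points, and the orientation is taken as a yaw angle in the xy-plane (the caption does not say whether the eigen-analysis is 2D or 3D).

```python
import numpy as np


def voxel_downsample(points, voxel_size):
    """Replace all points falling in the same voxel by their centroid (assumed reduction)."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, inverse, counts = np.unique(keys, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.reshape(-1)  # keep a flat index across NumPy versions
    sums = np.zeros((counts.size, points.shape[1]))
    np.add.at(sums, inverse, points)  # accumulate point coordinates per voxel
    return sums / counts[:, None]


def principal_yaw(points):
    """Yaw (radians) of the principal eigenvector of the xy covariance.

    The sign of an eigenvector is arbitrary, so the angle is only defined modulo pi.
    """
    xy = points[:, :2] - points[:, :2].mean(axis=0)
    cov = xy.T @ xy / len(xy)
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues are returned in ascending order
    major = eigvecs[:, -1]            # eigenvector of the largest eigenvalue
    return np.arctan2(major[1], major[0])


# Toy data: two points share a voxel, a third sits in a neighbouring voxel.
raw = np.array([[0.1, 0.1, 0.1], [0.3, 0.3, 0.3], [1.2, 0.1, 0.1]])
voxels = voxel_downsample(raw, voxel_size=1.0)

# Points along the diagonal y = x should give a yaw of pi/4 (modulo pi).
diagonal = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [3.0, 3.0, 0.0]])
yaw = principal_yaw(diagonal) % np.pi
print(voxels, yaw)
```

Because the principal axis follows the point distribution rather than the annotated box, an orientation recomputed this way can differ from the original cuboid heading, which is consistent with the caption's note that box orientations change between panels (b) and (c).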