Multi-View Pose-Agnostic Change Localization with Zero Labels

Chamuditha Jayanga Galappaththige, Jason Lai, Lloyd Windrim, Donald Dansereau, Niko Suenderhauf, Dimity Miller

TL;DR

This work tackles label-free, pose-agnostic change localization in unconstrained 3D environments by introducing Change-3DGS, a multi-view 3D Gaussian Splatting representation that explicitly encodes change at the 3D level. The approach builds a reference 3DGS from pre-change views, derives feature- and structure-aware per-view change cues (via DINOv2 features and SSIM), and learns per-Gaussian change channels to generate multi-view change masks for unseen viewpoints. It further uses data augmentation and initialization strategies to leverage prior scene structure, achieving state-of-the-art results on multi-object scenes and across challenging datasets (MAD-Real, ChangeSim, PASLCD), while introducing PASLCD as a real-world benchmark with lighting variations. The method demonstrates robust performance with limited inference views and provides a versatile, multi-view extension that can enhance existing change-detection pipelines by enforcing cross-view consistency in a 3D scene representation.
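The per-view change cues mentioned above can be illustrated with a minimal NumPy sketch. It assumes the reference and inference views have already been rendered and encoded into aligned feature maps; no DINOv2 model is loaded here. The function names, the box-window SSIM, the thresholds, and the logical-AND combination are all illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def feature_change_cue(feat_ref, feat_inf, eps=1e-8):
    """Per-pixel cosine distance between two (H, W, C) feature maps."""
    dot = np.sum(feat_ref * feat_inf, axis=-1)
    norm = np.linalg.norm(feat_ref, axis=-1) * np.linalg.norm(feat_inf, axis=-1)
    return 1.0 - dot / (norm + eps)

def box_mean(img, k):
    """Mean over a k x k window (k odd) with edge padding."""
    pad = k // 2
    padded = np.pad(np.asarray(img, dtype=np.float64), pad, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def structure_change_cue(img_ref, img_inf, k=7, c1=0.01 ** 2, c2=0.03 ** 2):
    """Per-pixel (1 - SSIM) for grayscale images in [0, 1].

    Assumption: a uniform box window replaces the Gaussian window
    used in the standard SSIM definition.
    """
    mu_x, mu_y = box_mean(img_ref, k), box_mean(img_inf, k)
    var_x = box_mean(img_ref * img_ref, k) - mu_x * mu_x
    var_y = box_mean(img_inf * img_inf, k) - mu_y * mu_y
    cov = box_mean(img_ref * img_inf, k) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    )
    return 1.0 - ssim

def candidate_mask(feat_cue, struct_cue, t_feat=0.5, t_struct=0.5):
    """Hypothetical combination: flag pixels where both cues exceed a threshold."""
    return (feat_cue > t_feat) & (struct_cue > t_struct)
```

Requiring both cues to agree is one simple way to suppress pixels that differ only in appearance (for example under lighting variation) but not in structure; the thresholds here are placeholders that would need tuning per dataset.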

Abstract

Autonomous agents often require accurate methods for detecting and localizing changes in their environment, particularly when observations are captured from unconstrained and inconsistent viewpoints. We propose a novel label-free, pose-agnostic change detection method that integrates information from multiple viewpoints to construct a change-aware 3D Gaussian Splatting (3DGS) representation of the scene. With as few as 5 images of the post-change scene, our approach can learn an additional change channel in a 3DGS and produce change masks that outperform single-view techniques. Our change-aware 3D scene representation additionally enables the generation of accurate change masks for unseen viewpoints. Experimental results demonstrate state-of-the-art performance in complex multi-object scenes, achieving a 1.7x and 1.5x improvement in Mean Intersection Over Union and F1 score respectively over other baselines. We also contribute a new real-world dataset to benchmark change detection in diverse challenging scenes in the presence of lighting variations.
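For readers unfamiliar with the two reported metrics, the following sketch computes Intersection over Union and F1 for a single pair of binary change masks. The function name and the convention of scoring two empty masks as 1.0 are assumptions for illustration; how the paper averages scores across views and scenes is not stated in this summary.

```python
import numpy as np

def iou_and_f1(pred, gt):
    """Return (IoU, F1) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    tp = int(np.logical_and(pred, gt).sum())    # changed pixels correctly flagged
    fp = int(np.logical_and(pred, ~gt).sum())   # unchanged pixels flagged as changed
    fn = int(np.logical_and(~pred, gt).sum())   # changed pixels that were missed
    union = tp + fp + fn
    if union == 0:
        # Assumption: both masks empty counts as a perfect match.
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1

# One true positive, one false positive, one false negative.
pred = [[1, 1, 0], [0, 0, 0]]
gt = [[1, 0, 0], [1, 0, 0]]
print(iou_and_f1(pred, gt))
```

In this example IoU is 1/3 and F1 is 1/2. A mean IoU would then average the per-mask IoU values over whichever set of views or scenes the evaluation protocol defines.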

Paper Structure

This paper contains 33 sections, 4 equations, 19 figures, 9 tables.

Figures (19)

  • Figure 1: Our multi-view approach to visual change detection (second row from bottom) enforces consistency of the predicted changes across multiple viewpoints by embedding change information in a 3D Gaussian Splatting model of the scene. This effectively suppresses many of the false-positive detections exhibited by current single-view methods (middle row).
  • Figure 2: An overview of our proposed approach for multi-view pose-agnostic change detection. We leverage a 3DGS representation of the pre-change (reference) scene to build feature and structure-aware change masks given images of the post-change (inference) scene. We embed this information as additional change channels into the representation, which can be used to render multi-view change masks.
  • Figure 3: Qualitative results of each approach on our PASLCD dataset. See Supp. Material for additional visualizations. Our generated change masks consistently agree more closely with the ground truth compared to the baselines.
  • Figure 4: Performance with varying numbers of inference views.
  • Figure 5: An overview of our data augmentation method. We concatenate the candidate masks $(\mathcal{M}_{\text{F,S}})_{\text{inf}}$ generated following Fig. 2 with candidate masks $(\mathcal{M}_{\text{F,S}})_{\text{ref}}$ obtained by considering the inference scene's representation viewed from the reference scene's poses.
  • ...and 14 more figures