BADGR: Bundle Adjustment Diffusion Conditioned by GRadients for Wide-Baseline Floor Plan Reconstruction

Yuguang Li, Ivaylo Boyadzhiev, Zixuan Liu, Linda Shapiro, Alex Colburn

TL;DR

BADGR introduces a diffusion-conditioned bundle adjustment framework that jointly refines camera poses and floor-plan layouts from sparse 360° panoramas. It integrates a planar BA layer, a Column Geometry Encoder, and a Transformer-based denoiser within a non-Markovian DDPM, guided by learned layout-structural constraints such as wall adjacency and collinearity. Trained exclusively on 2D floor plans, BADGR achieves state-of-the-art accuracy in wide-baseline indoor reconstruction, generalizing across input densities and datasets (FloorPlan-60K, ZInD, and RPLAN) and remaining resilient to boundary noise. The approach enables scalable, view-consistent floor-plan reconstruction, with practical implications for virtual tours, interior design, and navigation, and supports data augmentation through synthetic poses and layouts.

Abstract

Reconstructing precise camera poses and floor plan layouts from wide-baseline RGB panoramas is a difficult and unsolved problem. We introduce BADGR, a novel diffusion model that jointly performs reconstruction and bundle adjustment (BA) to refine poses and layouts from a coarse state, using 1D floor boundary predictions from dozens of images of varying input densities. Unlike a guided diffusion model, BADGR is conditioned on dense per-entity outputs from a single-step Levenberg-Marquardt (LM) optimizer and is trained to predict camera and wall positions while minimizing reprojection errors for view consistency. The layout-generation objective of the denoising diffusion process complements BA optimization by providing additional learned layout-structural constraints on top of the co-visible features across images. These constraints help BADGR make plausible guesses about spatial relations that help constrain the pose graph, such as wall adjacency and collinearity, and learn to mitigate errors from dense boundary observations using global context. BADGR trains exclusively on 2D floor plans, which simplifies data acquisition, enables robust augmentation, and supports a variety of input densities. Our experiments and analysis validate our method, which significantly outperforms state-of-the-art pose and floor plan layout reconstruction methods across different input densities.
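
For readers unfamiliar with the single-step Levenberg-Marquardt update mentioned above, the following is a minimal generic sketch of one damped Gauss-Newton step for a least-squares problem. It is not BADGR's BA layer: the function names (`lm_step`, `residual_fn`, `jacobian_fn`), the fixed `damping` value, and the toy linear problem are all assumptions made for illustration.

```python
import numpy as np

def lm_step(residual_fn, jacobian_fn, x, damping=1e-3):
    """One Levenberg-Marquardt update for min_x ||r(x)||^2.

    Solves (J^T J + damping * I) delta = -J^T r and returns (x + delta, delta).
    The names and the fixed damping are illustrative assumptions.
    """
    r = residual_fn(x)                       # residual vector, shape (m,)
    J = jacobian_fn(x)                       # Jacobian dr/dx, shape (m, n)
    H = J.T @ J + damping * np.eye(x.size)   # damped normal-equation matrix
    delta = np.linalg.solve(H, -J.T @ r)     # update direction and magnitude
    return x + delta, delta

# Toy linear problem (hypothetical data): r(x) = A x - b
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

x0 = np.zeros(2)
x1, delta = lm_step(lambda x: A @ x - b, lambda x: A, x0, damping=1e-3)
print("update:", delta)
print("residual norm before:", np.linalg.norm(A @ x0 - b))
print("residual norm after: ", np.linalg.norm(A @ x1 - b))
```

In a conditioning setup like the one the abstract describes, it is the update `delta` (or a per-entity form of it), not the updated state, that would be handed to the denoiser; how BADGR packages these outputs is not specified here.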

Paper Structure

This paper contains 22 sections, 3 equations, 12 figures, 7 tables, 1 algorithm.

Figures (12)

  • Figure 1: Overview of BADGR, a diffusion-based bundle adjustment (BA) model for generating precise, view-consistent camera poses and floor plan layouts. BADGR uses per-image floor boundaries and image column-to-wall assignments (upper left) as coarse input, refining poses and layouts through a gradient-conditioned denoising process (upper right). The bottom right shows view consistency by projecting the output layouts with the estimated poses.
  • Figure 2: Architecture of BADGR. The forward process takes a ground-truth scene, i.e., layouts and poses, and adds noise to obtain the sample at step $t$. The inference process uses a transformer conditioned on dense per-column adjustments generated by the planar BA layer and compressed by the Column Geometry Encoder.
  • Figure 3: Column-wise planar BA module. Positional adjustments for walls and cameras are computed for each image column. At each column, the associated wall $l_{m,k}$ is projected with the current camera pose $E^i$. The adjustments are computed by comparing the projected point to the floor boundary value $\tilde{\mathcal{B}}^{i,c}$. The dense per-column adjustments are estimated in parallel with our BA layer implementation.
  • Figure 4: Qualitative results: top-down layouts and poses before (left), after BADGR optimization (middle), and GT (right). The reprojected geometry, before and after optimization, is shown in several images, highlighting the improved view consistency; border colors indicate the capture positions. Example areas with significant improvements are highlighted and zoomed in. More examples are in the Supplementary.
  • Figure 5: Overview of coarse scene initialization.
  • ...and 7 more figures
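
The column-wise comparison described in the Figure 3 caption can be illustrated with a small top-down geometry sketch. This is an assumption-laden illustration, not the paper's implementation: it treats each panorama column as a 2D viewing ray, treats the wall as an infinite line through two endpoints, and assumes the observed floor boundary has already been converted to a metric depth along that ray. The names `ray_wall_depth` and `column_residual` are hypothetical.

```python
import numpy as np

def ray_wall_depth(cam_xy, cam_yaw, col_angle, wall_p0, wall_p1):
    """Distance along a column's viewing ray to the line through a wall.

    cam_xy: camera position (2,); cam_yaw: camera heading in radians;
    col_angle: column bearing relative to the heading, in radians;
    wall_p0, wall_p1: two points on the wall (2,).
    """
    theta = cam_yaw + col_angle
    d = np.array([np.cos(theta), np.sin(theta)])   # ray direction
    e = wall_p1 - wall_p0                          # wall direction
    n = np.array([-e[1], e[0]])                    # wall normal (unnormalized)
    denom = n @ d
    if abs(denom) < 1e-9:
        return np.inf                              # ray parallel to the wall
    # Solve n . (cam_xy + s * d - wall_p0) = 0 for s
    return (n @ (wall_p0 - cam_xy)) / denom

def column_residual(cam_xy, cam_yaw, col_angle, wall_p0, wall_p1, observed_depth):
    """Projected wall depth minus observed boundary depth for one column."""
    return ray_wall_depth(cam_xy, cam_yaw, col_angle, wall_p0, wall_p1) - observed_depth

# Hypothetical scene: camera at the origin facing +x, wall on the line x = 2
cam = np.zeros(2)
p0, p1 = np.array([2.0, -1.0]), np.array([2.0, 1.0])
for angle, observed in [(0.0, 2.1), (np.pi / 8, 2.0)]:
    print(angle, column_residual(cam, 0.0, angle, p0, p1, observed))
```

Because each column's residual depends only on its own ray, its assigned wall, and the camera pose, all columns can be evaluated independently, which is consistent with the caption's note that the per-column adjustments are estimated in parallel.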