ReMatching Dynamic Reconstruction Flow

Sara Oblak, Despoina Paschalidou, Sanja Fidler, Matan Atzmon

TL;DR

The paper introduces ReMatching, a framework that integrates deformation priors into dynamic scene reconstruction through velocity-field priors. It formulates a flow-matching objective that projects the time-varying reconstruction onto a prior class via a continuity-equation-based loss; the resulting ReMatching loss is optimized jointly with the standard reconstruction losses. The method supports several prior classes (directional, rigid, volume-preserving) and can combine them adaptively, including with learnable part weights, while staying computationally efficient because the projections have linear-algebra solutions. Evaluations on synthetic and real datasets show consistent improvements in reconstruction fidelity and temporal coherence, indicating better generalization to unseen viewpoints and timestamps. The framework applies across diverse dynamic representations, with practical implications for 3D motion capture and rendering pipelines.
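
For reference, the continuity equation named in the TL;DR has the following standard form for a density $\rho_t$ transported by a velocity field $v_t$. The symbols $\rho_t$ and $v_t$ are generic notation assumed here for illustration and may differ from the paper's own:

$$\frac{\partial \rho_t}{\partial t} + \nabla \cdot \left(\rho_t\, v_t\right) = 0$$

Read this way, a velocity-field prior restricts which $v_t$ are allowed to transport the reconstruction, and a loss built on this equation penalizes reconstructions whose change over time cannot be explained by any velocity field in the chosen prior class.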

Abstract

Reconstructing a dynamic scene from image inputs is a fundamental computer vision task with many downstream applications. Despite recent advancements, existing approaches still struggle to achieve high-quality reconstructions from unseen viewpoints and timestamps. This work introduces the ReMatching framework, designed to improve reconstruction quality by incorporating deformation priors into dynamic reconstruction models. Our approach advocates for velocity-field based priors, for which we suggest a matching procedure that can seamlessly supplement existing dynamic reconstruction pipelines. The framework is highly adaptable and can be applied to various dynamic representations. Moreover, it supports integrating multiple types of model priors and enables combining simpler ones to create more complex classes. Our evaluations on popular benchmarks involving both synthetic and real-world dynamic scenes demonstrate that augmenting current state-of-the-art methods with our approach leads to a clear improvement in reconstruction accuracy.

Paper Structure

This paper contains 44 sections, 1 theorem, 77 equations, 22 figures, 4 tables, 1 algorithm.

Key Result

Lemma 1

For the prior class $\mathcal{P}_{\mathrm{IV}}$, the solutions $({\bm{A}}_{jt},{\bm{b}}_{jt})$ to the minimization problem eq:piecewise_rigid are given by, where $\bm{\Gamma}_{jt}=\left[\gamma^{1}_t,\cdots,\gamma^{n}_t\right]^T \in \mathbb R^{n\times d}$, $\dot{\bm{\Gamma}}_t = \left[\frac{d}{dt}\gamma^{1}_t,\cdots,\frac{d}{dt}\gamma^{n}_t\right]^T \in \mathbb R^{n\times d}$, ${\bm{W}}_{jt}\ldots$
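
The closed form of the lemma is cut off in this summary, so the following is only a minimal sketch of the general idea: projecting sampled velocities onto a rigid-motion prior by weighted least squares. It assumes a simplified 2D setting in which a rigid velocity field is $v(x) = \omega J x + b$, with $J$ the 90-degree rotation. The function names, the 2D restriction, and the scalar `omega` are illustrative assumptions and not the paper's formulation.

```python
from typing import List, Sequence, Tuple

Vec2 = Tuple[float, float]


def rigid_velocity(omega: float, b: Vec2, p: Vec2) -> Vec2:
    """Evaluate the 2D rigid velocity field v(p) = omega * J p + b, J = [[0,-1],[1,0]]."""
    x, y = p
    return (-omega * y + b[0], omega * x + b[1])


def fit_rigid_velocity_2d(
    points: Sequence[Vec2],
    velocities: Sequence[Vec2],
    weights: Sequence[float],
) -> Tuple[float, Vec2]:
    """Weighted least-squares fit of (omega, b) minimizing
    sum_i w_i * |omega * J x_i + b - v_i|^2 (hypothetical helper, 2D only)."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")

    # Weighted centroids of the positions and of the velocities.
    cx = sum(w * p[0] for w, p in zip(weights, points)) / total
    cy = sum(w * p[1] for w, p in zip(weights, points)) / total
    mvx = sum(w * v[0] for w, v in zip(weights, velocities)) / total
    mvy = sum(w * v[1] for w, v in zip(weights, velocities)) / total

    # Optimal omega from the centered data: <J x~, v~> / |x~|^2 (weighted).
    num = 0.0
    den = 0.0
    for w, (x, y), (vx, vy) in zip(weights, points, velocities):
        dx, dy = x - cx, y - cy
        dvx, dvy = vx - mvx, vy - mvy
        num += w * (-dy * dvx + dx * dvy)
        den += w * (dx * dx + dy * dy)
    omega = num / den if den > 0.0 else 0.0

    # Optimal translation: b = v_mean - omega * J x_mean, with J x_mean = (-cy, cx).
    b = (mvx + omega * cy, mvy - omega * cx)
    return omega, b


if __name__ == "__main__":
    pts: List[Vec2] = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
    vel = [rigid_velocity(0.5, (1.0, -2.0), p) for p in pts]
    print(fit_rigid_velocity_2d(pts, vel, [1.0, 2.0, 0.5, 1.5]))
```

In a piecewise setting, one such fit per part $j$ with per-point part weights would give a separate rigid motion for each part. How the paper defines those weights and handles the general $d$-dimensional case is not recoverable from this summary.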

Figures (22)

  • Figure 1: A vector field in $\mathcal{P}_{\mathrm{V}}$.
  • Figure 2: Qualitative comparison of baselines and our model on the D-NeRF dataset (Pumarola et al., 2021). Our framework consistently produces high-fidelity reconstructions that accurately capture fine-grained details, as highlighted in the blue boxes.
  • Figure 3: Qualitative comparison of our method to D3G (Yang et al., 2023) on the HyperNeRF dataset (Park et al., 2021). Our framework yields more accurate reconstructions, in particular around moving parts.
  • Figure 4: Part assignments for the adaptive-combination prior class.
  • Figure 5: Illustration of the architecture for $\Psi_t$ used in the experiments, based on Yang et al. (2023). Reference Gaussian parameters are propagated to time $t$ through a shared function, $\psi_t^1$, implemented as an MLP with positional encoding features, to compute time-varying point features of dimension $d_{\textrm{f}}$. These features are then processed by a second shared function, $\psi_t^2$, to generate time-varying Gaussian parameters. Finally, given a chosen viewing direction, the Gaussian Splatting rendering model is used to produce a rendered image.
  • ...and 17 more figures

Theorems & Definitions (2)

  • Lemma 1
  • proof