DynaPix SLAM: A Pixel-Based Dynamic Visual SLAM Approach

Chenghao Xu; Elia Bonetto; Aamir Ahmad

TL;DR

DynaPix addresses dynamic scenes in V-SLAM with a semantic-free, per-pixel motion probability framework. It combines movable region estimation (from background differencing) with moving region estimation (from flow differencing) into a per-pixel probability P = D $\odot$ M. This probability weights feature points and map points within a modified ORB-SLAM2 backend, enabling weighted bundle adjustment (BA) and extending tracking by preserving informative points from moving regions. Tests on GRADE and TUM RGB-D show that DynaPix and its semi-semantic variant DynaPix+ achieve significantly lower trajectory errors and longer tracking times than baselines such as ORB-SLAM2, DynaSLAM, DynamicVINS, and WF-SLAM. The results highlight the benefit of per-pixel, probability-based weighting over binary masks, with practical implications for robust indoor dynamic SLAM and potential online extensions in the future.
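The combination P = D $\odot$ M above is an element-wise (Hadamard) product of two per-pixel maps. The following is a minimal NumPy sketch of that one step, assuming both maps have the same shape and hold values in [0, 1]. The function name `motion_probability`, the argument names, and the toy values are hypothetical and are not taken from the DynaPix code.

```python
import numpy as np

def motion_probability(d_map, m_map):
    """Element-wise product P = D * M of two per-pixel maps (assumed in [0, 1])."""
    d_map = np.asarray(d_map, dtype=np.float64)
    m_map = np.asarray(m_map, dtype=np.float64)
    if d_map.shape != m_map.shape:
        raise ValueError("D and M must have the same shape")
    # Clip to guard against small numerical excursions outside [0, 1].
    return np.clip(d_map * m_map, 0.0, 1.0)

# Toy 2x2 maps (illustrative values only).
D = np.array([[0.0, 0.5], [1.0, 1.0]])
M = np.array([[1.0, 0.5], [0.0, 1.0]])
P = motion_probability(D, M)
```

Because the product is zero wherever either map is zero, a pixel only receives a high motion probability when both estimates agree, which is one way to read the benefit of a soft per-pixel weighting over a binary mask.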

Abstract

Visual Simultaneous Localization and Mapping (V-SLAM) methods achieve remarkable performance in static environments, but face challenges in dynamic scenes where moving objects severely affect their core modules. To cope with this, dynamic V-SLAM approaches often leverage semantic information, geometric constraints, or optical flow. However, these methods are limited by imprecise estimations and their reliance on the accuracy of deep-learning models. Moreover, predefined thresholds for static/dynamic classification, the a-priori selection of dynamic object classes, and the inability to recognize unknown or unexpected moving objects often degrade their performance. To address these limitations, we introduce DynaPix, a novel semantic-free V-SLAM system based on per-pixel motion probability estimation and an improved pose optimization process. The per-pixel motion probability is estimated using a static background differencing method on image data and optical flows computed on splatted frames. With DynaPix, we fully integrate these probabilities into map point selection and apply them through weighted bundle adjustment within the tracking and optimization modules of ORB-SLAM2. We thoroughly evaluate our method using the GRADE and TUM RGB-D datasets, showing significantly lower trajectory errors and longer tracking times in both static and dynamic sequences. The source code, datasets, and results are available at https://dynapix.is.tue.mpg.de/.
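To make the idea of weighting bundle adjustment by motion probability concrete, here is a minimal sketch of a weighted reprojection cost. The abstract does not state the exact weighting or robust kernel used inside ORB-SLAM2, so the weight w = 1 - p, the distortion-free pinhole model, and all function and variable names below are assumptions for illustration only.

```python
import numpy as np

def project(points_cam, fx, fy, cx, cy):
    """Pinhole projection of Nx3 camera-frame points to Nx2 pixel coordinates."""
    pts = np.asarray(points_cam, dtype=np.float64)
    u = fx * pts[:, 0] / pts[:, 2] + cx
    v = fy * pts[:, 1] / pts[:, 2] + cy
    return np.stack([u, v], axis=1)

def weighted_reprojection_cost(points_cam, observations, motion_prob, fx, fy, cx, cy):
    """Sum of squared reprojection errors, each scaled by (1 - p).

    The (1 - p) weighting is an assumed choice: points with a high motion
    probability contribute less to the cost, but are not discarded outright.
    """
    residuals = project(points_cam, fx, fy, cx, cy) - np.asarray(observations, dtype=np.float64)
    weights = 1.0 - np.asarray(motion_prob, dtype=np.float64)
    return float(np.sum(weights * np.sum(residuals ** 2, axis=1)))

# Toy example: the second point is observed 3 px and 4 px off, with p = 0.8.
points = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 2.0]])
obs = np.array([[50.0, 50.0], [103.0, 54.0]])
p = np.array([0.0, 0.8])
cost = weighted_reprojection_cost(points, obs, p, fx=100.0, fy=100.0, cx=50.0, cy=50.0)
```

In a full system this cost would be minimized over camera poses and map points by a nonlinear least-squares solver; the sketch only evaluates it for fixed geometry.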

Paper Structure (19 sections, 9 equations, 6 figures, 4 tables)

Introduction
Related Work
Moving Object Segmentation
Dynamic SLAM
Approach
Pixel-wise Motion Probability
Movable Region Estimation
Moving Region Estimation
Splatting-based View Synthesis
Flow Differencing
Motion Probability
Tracking and Pose Optimization
Local Tracking
Weighted Bundle Adjustment
Experiments & Analysis
...and 4 more sections

Figures (6)

Figure 1: DynaPix's motion probabilities on GRADE (left) and TUM RGB-D (right) frames. On the left side of each image we colored the estimated moving regions. On the right side, ORB features are colored based on the motion probabilities, from static (green) to dynamic (red).
Figure 1: Example of movable regions estimation.
Figure 2: The DynaPix architecture consists of two main blocks, the motion probability estimation (blue box), and the modified ORB-SLAM2 backend (green box). We compute movable (Sec. "Movable Region Estimation") and moving regions (Sec. "Moving Region Estimation") on the current frame. The per-pixel moving probabilities (Sec. "Motion Probability") are then integrated into the SLAM backend (Sec. "Tracking and Pose Optimization").
Figure 2: Example of moving estimation between $\mathcal{I}^t$ and $\mathcal{I}^{t-1}$.
Figure 3: Frame differences between a reprojected frame $\widetilde{\mathcal{I}}^{t+i}$ and frame $\mathcal{I}^{t}$ using homography (b) and softmax splatting (c) transformations. The noise evident in (b) is clearly reduced in (c).
...and 1 more figures
