PARSAC: Accelerating Robust Multi-Model Fitting with Parallel Sample Consensus

Florian Kluger, Bodo Rosenhahn

TL;DR

A real-time method for robust estimation of multiple instances of geometric models from noisy data. A neural network predicts sample and inlier weights, which are then used to determine the model parameters for each potential instance separately in a RANSAC-like fashion.

Abstract

We present a real-time method for robust estimation of multiple instances of geometric models from noisy data. Geometric models such as vanishing points, planar homographies or fundamental matrices are essential for 3D scene analysis. Previous approaches discover distinct model instances in an iterative manner, thus limiting their potential for speedup via parallel computation. In contrast, our method detects all model instances independently and in parallel. A neural network segments the input data into clusters representing potential model instances by predicting multiple sets of sample and inlier weights. Using the predicted weights, we determine the model parameters for each potential instance separately in a RANSAC-like fashion. We train the neural network via task-specific loss functions, i.e. we do not require a ground-truth segmentation of the input data. As suitable training data for homography and fundamental matrix fitting is scarce, we additionally present two new synthetic datasets. We demonstrate state-of-the-art performance on these as well as multiple established datasets, with inference times as small as five milliseconds per image.
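The "RANSAC-like" step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes 2D line fitting as a toy stand-in for vanishing points, homographies or fundamental matrices, and it uses hand-written weights where PARSAC would use network predictions. All function names (`line_through`, `fit_instance`, `fit_all`, `toy_data`) and parameters (`n_hyps`, `threshold`) are hypothetical.

```python
import math
import random


def line_through(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1, or None if p == q."""
    a, b = q[1] - p[1], p[0] - q[0]
    norm = math.hypot(a, b)
    if norm == 0.0:
        return None
    a, b = a / norm, b / norm
    return a, b, -(a * p[0] + b * p[1])


def weighted_inlier_count(line, points, inlier_weights, threshold):
    """Sum the inlier weights of all points closer to the line than `threshold`."""
    a, b, c = line
    return sum(w for (x, y), w in zip(points, inlier_weights)
               if abs(a * x + b * y + c) < threshold)


def fit_instance(points, sample_weights, inlier_weights, rng, n_hyps=50, threshold=0.05):
    """Fit ONE putative instance: weighted minimal-set sampling, weighted scoring."""
    best_line, best_score = None, -1.0
    for _ in range(n_hyps):
        # Draw a minimal set (two points for a line) according to the sample weights.
        i, j = rng.choices(range(len(points)), weights=sample_weights, k=2)
        line = line_through(points[i], points[j])
        if line is None:  # same point drawn twice, degenerate hypothesis
            continue
        score = weighted_inlier_count(line, points, inlier_weights, threshold)
        if score > best_score:
            best_line, best_score = line, score
    return best_line, best_score


def fit_all(points, sample_weights, inlier_weights, seed=0):
    """Fit every putative instance. The iterations are independent of each other,
    so they could run in parallel; a plain loop is used here for clarity."""
    rng = random.Random(seed)
    return [fit_instance(points, p_j, q_j, rng)
            for p_j, q_j in zip(sample_weights, inlier_weights)]


def toy_data():
    """Two lines (y = 0 and x = 0) plus outliers, with hand-made weights.
    Assumption: the same weights serve as sample and inlier weights for brevity;
    in the method described above these are separate predicted sets."""
    line_a = [(x, 0) for x in range(1, 11)]
    line_b = [(0, y) for y in range(1, 11)]
    outliers = [(3.3, 4.1), (7.2, 2.5), (5.5, 8.8), (2.1, 6.4), (8.6, 5.7)]
    points = line_a + line_b + outliers
    weights_a = [0.9] * 10 + [0.05] * 15
    weights_b = [0.05] * 10 + [0.9] * 10 + [0.05] * 5
    return points, [weights_a, weights_b], [weights_a, weights_b]
```

Calling `fit_all(*toy_data())` returns one `(line, score)` pair per putative instance. With these toy weights the first instance should recover the line y = 0 and the second the line x = 0, because the sample weights concentrate the minimal sets on one cluster and the inlier weights suppress the contribution of points from the other cluster.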

Paper Structure (60 sections, 33 equations, 21 figures, 9 tables, 3 algorithms)

Figures (21)

  • Figure 1: Applications: PARSAC estimates multiple vanishing points (V, top), fundamental matrices (F, middle) or homographies (H, bottom). We visualise distinct model instances using different colour hues. Brightness in columns three and four is proportional to the corresponding weight.
  • Figure 2: PARSAC Overview: Given observations $\mathcal{X}$, e.g. line segments or point correspondences, we predict sample weights $p$ and inlier weights $q$ for each observation and putative geometric model using a neural network. For each putative geometric model $j$, we independently sample model hypotheses in a RANSAC-like fashion, using the predicted sample weights. We then select the best models which have the largest weighted inlier counts, using the predicted inlier weights. An additional set of weights captures potential outliers.
  • Figure 3: We propose two new datasets: HOPE-F for fundamental matrix fitting and SMH for homography fitting. For each dataset we show one example image pair: the left pair of each example shows the RGB images, the right pair visualises the pre-computed SIFT key-points, colour coded by ground truth label. Please refer to the appendix for additional examples.
  • Figure 4: Neural network architecture: We feed observations $\mathbf{x} \in \mathcal{X}$ of dimension $D$ into our network as a tensor of size $N \times 1 \times D$. The network consists of linear $1\times 1$ convolutional layers interleaved with instance normalisation [ulyanov2016instance], batch normalisation [ioffe2015batch] and ReLU [he2015delving] layers, which are arranged as residual blocks [he2016deep]. The architecture is based on CONSAC [kluger2020consac].
  • Figure 5: Robustness to Noise: We add Gaussian noise with varying standard deviation $\sigma$ to our input features and evaluate the AUC @ $5^\circ$ (higher is better) for vanishing point estimation, as well as the misclassification error (ME, lower is better) for fundamental matrix and homography estimation. We compare PARSAC against Progressive-X, Progressive-X+, T-Linkage and CONSAC.
  • ...and 16 more figures
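
The layer types named in the Figure 4 caption can be sketched in plain Python to show how they act on a set of $N$ observations. This is a simplified illustration under stated assumptions, not the network from the paper: it uses a single linear layer per block, omits batch normalisation, and all names (`pointwise_linear`, `instance_norm`, `residual_block`) are hypothetical. A $1\times 1$ convolution over an $N \times 1 \times D$ tensor applies the same linear map to every observation, and instance normalisation standardises each channel across the $N$ observations of one input.

```python
import math


def pointwise_linear(features, weight, bias):
    """Apply the same linear map to every observation (the effect of a 1x1 convolution).
    features: N lists of length D_in; weight: D_out rows of length D_in; bias: length D_out."""
    return [[sum(w * x for w, x in zip(row, f)) + b for row, b in zip(weight, bias)]
            for f in features]


def instance_norm(features, eps=1e-5):
    """Normalise each channel to zero mean and (approximately) unit variance
    across the N observations of this single input."""
    n = len(features)
    channels = list(zip(*features))
    means = [sum(c) / n for c in channels]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n + eps)
            for c, m in zip(channels, means)]
    return [[(v - m) / s for v, m, s in zip(f, means, stds)] for f in features]


def relu(features):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in f] for f in features]


def residual_block(features, weight, bias):
    """Simplified residual block: x + ReLU(IN(W x + b)).
    Assumption: weight is square so input and output dimensions match."""
    branch = relu(instance_norm(pointwise_linear(features, weight, bias)))
    return [[x + y for x, y in zip(f, g)] for f, g in zip(features, branch)]
```

Because every operation in this sketch either acts on each observation separately or uses statistics pooled over all observations, reordering the input observations reorders the output in the same way.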