VICAN: Very Efficient Calibration Algorithm for Large Camera Networks

Gabriel Moreira, Manuel Marques, João Paulo Costeira, Alexander Hauptmann

TL;DR

VICAN introduces a bipartite camera–object pose graph to leverage dynamic object observations for scalable, accurate calibration of large camera networks. The approach casts camera localization as a maximum-likelihood problem and solves it with a tailored primal–dual algorithm that decouples rotations and translations, using rotation synchronization and a least-squares translation step. A new indoor dataset demonstrates robust performance across varying network sizes, with mean rotation errors around 0.04–0.07 deg and mean translation errors from millimeters to centimeters, improving as more object poses are incorporated. The work delivers a practical, fast, and scalable calibration framework suited for applications such as smart retail and surveillance, where occlusions and texture-poor regions hinder traditional camera-only pose estimation.
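To make the least-squares translation step concrete, here is a minimal sketch under stated assumptions: rotations are taken as already known (for instance from a rotation synchronization step), poses are world-from-node transforms, and each camera-object edge measures the object position in the camera frame. It is a plain linear least-squares solve, not the paper's primal-dual algorithm. The names `solve_translations` and `rot_z`, the gauge choice (camera 0 pinned at the origin), and all numeric values are hypothetical.

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z-axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def solve_translations(n_cams, n_objs, edges, R_cams):
    """Least-squares node translations given known camera rotations.

    edges: list of (i, j, t_ij), where t_ij is the position of object
    node j measured in the frame of camera i.
    Assumed model: t_obj[j] - t_cam[i] = R_cams[i] @ t_ij.
    Camera 0 is pinned at the origin to remove the global translation gauge.
    """
    n = n_cams + n_objs
    A = np.zeros((3 * len(edges) + 3, 3 * n))
    b = np.zeros(3 * len(edges) + 3)
    for k, (i, j, t_ij) in enumerate(edges):
        rows = slice(3 * k, 3 * k + 3)
        A[rows, 3 * i:3 * i + 3] = -np.eye(3)      # -t_cam[i]
        col = 3 * (n_cams + j)
        A[rows, col:col + 3] = np.eye(3)           # +t_obj[j]
        b[rows] = R_cams[i] @ t_ij                 # measurement rotated to world
    A[-3:, 0:3] = np.eye(3)                        # gauge rows: t_cam[0] = 0
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    x = x.reshape(n, 3)
    return x[:n_cams], x[n_cams:]

# Hypothetical noise-free scene: 3 cameras, 4 object poses, all pairs observed.
rng = np.random.default_rng(0)
R_cams = [rot_z(a) for a in (0.0, 0.8, -1.3)]
t_cams = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 4.0, 0.0]])
t_objs = rng.uniform(-1.0, 5.0, size=(4, 3))
edges = [(i, j, R_cams[i].T @ (t_objs[j] - t_cams[i]))
         for i in range(3) for j in range(4)]

est_cams, est_objs = solve_translations(3, 4, edges, R_cams)
```

With noise-free measurements and a connected graph, the recovered translations match the ground truth up to numerical precision. With noisy edges the same solve averages the error over all object observations, which is the intuition behind adding more object poses.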

Abstract

The precise estimation of camera poses within large camera networks is a foundational problem in computer vision and robotics, with broad applications spanning autonomous navigation, surveillance, and augmented reality. In this paper, we introduce a novel methodology that extends state-of-the-art Pose Graph Optimization (PGO) techniques. Departing from the conventional PGO paradigm, which primarily relies on camera-camera edges, our approach centers on the introduction of a dynamic element - any rigid object free to move in the scene - whose pose can be reliably inferred from a single image. Specifically, we consider the bipartite graph encompassing cameras, object poses evolving dynamically, and camera-object relative transformations at each time step. This shift not only offers a solution to the challenges encountered in directly estimating relative poses between cameras, particularly in adverse environments, but also leverages the inclusion of numerous object poses to ameliorate and integrate errors, resulting in accurate camera pose estimates. Though our framework retains compatibility with traditional PGO solvers, its efficacy benefits from a custom-tailored optimization scheme. To this end, we introduce an iterative primal-dual algorithm, capable of handling large graphs. Empirical benchmarks, conducted on a new dataset of simulated indoor environments, substantiate the efficacy and efficiency of our approach.
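The role of the object node in the bipartite graph can be illustrated with a small sketch: two cameras that each observe the same object pose at one time step are linked through it, even with no direct camera-camera edge. The convention here (4x4 world-from-node homogeneous transforms, edges defined as the object pose in the camera frame) and the helper names `make_pose`, `invert_pose`, and `rot_z` are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z-axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def make_pose(R, t):
    """Pack a rotation and a translation into a 4x4 homogeneous transform."""
    P = np.eye(4)
    P[:3, :3] = R
    P[:3, 3] = t
    return P

def invert_pose(P):
    """Invert a rigid transform using R^T instead of a general inverse."""
    R, t = P[:3, :3], P[:3, 3]
    return make_pose(R.T, -R.T @ t)

# Hypothetical ground truth: two cameras and one object pose at one time step.
P_c1 = make_pose(rot_z(0.3), [0.0, 0.0, 2.5])
P_c2 = make_pose(rot_z(-1.1), [4.0, 1.0, 2.5])
P_m = make_pose(rot_z(0.7), [2.0, 0.5, 0.3])

# Camera-object edges: the object pose expressed in each camera frame.
E_c1_m = invert_pose(P_c1) @ P_m
E_c2_m = invert_pose(P_c2) @ P_m

# Chaining the two edges through the shared object node yields the pose of
# camera 2 in the frame of camera 1, without any camera-camera measurement.
P_c1_c2 = E_c1_m @ invert_pose(E_c2_m)
```

In this noise-free setting the chained transform equals `invert_pose(P_c1) @ P_c2` exactly. With noisy edges each shared object pose gives a different estimate, which is why a joint optimization over the whole graph is preferable to chaining individual pairs.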

Paper Structure

This paper contains 17 sections, 1 theorem, 46 equations, 3 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

Assume strong duality holds and denote by $(\mathbf{\Lambda}^\ast, \mathbf{R}^\ast)$ a primal-dual optimal pair of the rotation synchronization problem, where the dual variable $\mathbf{\Lambda}$ is decomposed into blocks $\mathbf{\Lambda}_{\mathcal{C}}\in\mathbb{R}^{3C\times 3C}$ and $\mathbf{\Lambda}_{\mathcal{T}}\in\mathbb{R}^{3T\times 3T}$. The optimal camera poses $\mathbf{R}_\mathcal{C}$ …

Figures (3)

  • Figure 2: Standard PGO (left) vs our augmentation with object nodes (right). Pairwise relative transformations are shown as $\tilde{\mathbf{P}}_{\cdot,\cdot}$. The $i$-th camera node is $c_i$, and $m_i^{(t)}$ is the $i$-th object node at time $t$.
  • Figure 3: Small room scene: array composed of 25 cameras mounted on the ceiling of a 72 m$^2$ room. Left: image examples; Middle: 3D model of the room; Right: top-view of the room with camera locations.
  • Figure 4: Cube with side 0.575 m covered in 24 ArUco markers with side 0.276 m, used as the dynamic object in the bipartite pose graph.

Theorems & Definitions (2)

  • Theorem 1
  • proof