Mobile Robotic Multi-View Photometric Stereo

Suryansh Kumar

TL;DR

Traditional MVPS requires a fixed laboratory setup; this paper removes that limitation by introducing a portable mobile robotic MVPS system. It proposes an online incremental pipeline that jointly predicts light directions and intensities, per-pixel surface normals with uncertainty, and per-view depth priors, then refines depth via an uncertainty-guided optimization and fuses frames with online TSDF fusion. On the DiLiGenT-MV benchmark, using only 36 viewpoints with 8 PS images per view (288 images in total), the method recovers geometry competitive with state-of-the-art methods that use far more data, while running nearly $100\times$ faster. This work enables automated, fine-grained 3D photogrammetry for mobile robotics, reducing calibration burdens and allowing flexible data acquisition in unconstrained settings.

Abstract

Multi-View Photometric Stereo (MVPS) is a popular method for fine-detailed 3D acquisition of an object from images. Despite its outstanding results on objects of diverse materials, a typical MVPS experimental setup requires a well-calibrated light source and a monocular camera installed on an immovable base. This restricts the use of MVPS on a movable platform and keeps mobile robotics applications from benefiting from MVPS in 3D acquisition. To this end, we introduce a new mobile robotic system for MVPS. While the proposed system brings advantages, it introduces additional algorithmic challenges. To address them, we further propose an incremental approach for mobile robotic MVPS. Our approach leverages a supervised learning setup to predict per-view surface normals, object depth, and per-pixel uncertainty in the model-predicted results. A refined depth map per view is obtained by solving an MVPS-driven optimization problem proposed in this paper. We then fuse the refined depth maps while tracking the camera pose w.r.t. the reference frame to recover globally consistent 3D object geometry. Experimental results show the advantages of our robotic system and algorithm, which recover local high-frequency surface detail together with a globally consistent object shape. Our work goes beyond any MVPS system yet presented, providing encouraging results on objects with unknown reflectance properties using fewer frames and without a tiring calibration and installation process, enabling a computationally efficient robotic automation approach to photogrammetry. The proposed approach is nearly 100 times faster than state-of-the-art MVPS methods such as [1, 2] while maintaining similar results when tested on subjects taken from the benchmark DiLiGenT-MV dataset [3].
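
The abstract mentions a per-view depth refinement that combines predicted normals, a depth prior, and per-pixel uncertainty, but the optimization itself is not given in this excerpt. The following is only a minimal 1D sketch of the general idea, under stated assumptions: the energy $E(z)=\sum_i w_i\,(z_{i+1}-z_i-g_i)^2+\lambda\sum_i (z_i-d_i)^2$, the names `refine_depth_1d`, `normal_grad`, `grad_weight`, `prior_weight`, and the Gauss-Seidel solver are all illustrative choices, not the paper's formulation.

```python
def refine_depth_1d(depth_prior, normal_grad, grad_weight,
                    prior_weight=1.0, iters=500):
    """Hypothetical uncertainty-weighted depth refinement along one scanline.

    depth_prior[i]  : coarse depth d_i (e.g. from a depth network)
    normal_grad[i]  : depth slope g_i implied by the normal between pixels i and i+1
    grad_weight[i]  : confidence w_i of that slope (e.g. inverse uncertainty)
    prior_weight    : lambda, how strongly to stay near the depth prior
    """
    n = len(depth_prior)
    z = list(depth_prior)  # start from the prior
    for _ in range(iters):
        for i in range(n):
            # Gauss-Seidel update: solve dE/dz_i = 0 with neighbours held fixed.
            num = prior_weight * depth_prior[i]
            den = prior_weight
            if i > 0:  # slope constraint with the left neighbour
                w = grad_weight[i - 1]
                num += w * (z[i - 1] + normal_grad[i - 1])
                den += w
            if i < n - 1:  # slope constraint with the right neighbour
                w = grad_weight[i]
                num += w * (z[i + 1] - normal_grad[i])
                den += w
            z[i] = num / den
    return z


# A flat (wrong) prior is pulled toward the slope implied by confident normals.
refined = refine_depth_1d([1.0, 1.0, 1.0], [0.5, 0.5], [10.0, 10.0])
```

With a positive `prior_weight` the linear system is strictly diagonally dominant, so the sweeps converge; lowering `grad_weight` at a pixel (higher uncertainty) lets the depth prior dominate there.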

Paper Structure

This paper contains 17 sections, 5 equations, 14 figures, 5 tables, 1 algorithm.

Figures (14)

  • Figure 1: (a) Popular multi-view photometric stereo setup as shown in [kaya2021neural], where the hardware is installed on an immovable base with the subject placed on a turntable. (b) Our mobile robotic MVPS setup, where the robot is free to move around the object for 3D acquisition. (c) 3D acquisition results from images of a tooth model captured with our system.
  • Figure 2: Our mobile robotic test-time setup. (a) Our mobile robot moves around the object at test time, performing 3D data acquisition. (b) The robot's ground-truth and recovered camera pose trajectories are shown in red and green, respectively. (c) Side view of the recovered 3D data compared to ground truth, shown in millimeters (mm) along a chosen geodesic (marked with a red line on the BUDDHA image).
  • Figure 3: System Overview. Our robotic MVPS setup captures $\mathcal{I}_\textrm{ps}^{t}$ and infers the direction $(\{l_{k}^{t}\}_{k=1}^8)$ and intensity $(\{e_{k}^{t}\}_{k=1}^8)$ of every light source, followed by ${I}^{t}_\textrm{si}$ computation for that viewpoint. The recovered light source parameters are then used to predict the surface normal map $N_\textrm{ps}^t$ and the uncertainty $\Lambda_\textrm{ps}^t$ associated with the predicted values. In parallel, the depth map corresponding to ${I}^{t}_\textrm{si}$ is predicted using the pre-trained SIDP model. Later, a refined depth map is computed by solving Eq. \ref{eq:overall_optimization_per_view}, which is fused into the global TSDF volume representation over frames for 3D acquisition of the object.
  • Figure 4: Comparison with state-of-the-art MVPS methods. We compare our method to R-MVPS [park2016robust], NR-MVPS [kaya2021neural], B-MVPS [li2020multi], UA-MVPS [kaya2022uncertainty], and MVPS-Rev [kaya2023multi]. Although our setting differs from existing MVPS setups, we recover object 3D geometry comparable to these methods, showing its suitability for the next step in MVPS, i.e., mobile robotic automation in photogrammetry for fine-detailed 3D acquisition of objects.
  • Figure 5: (a)-(b) Camera rotation and translation error over frames for each object category after registration, respectively.
  • ...and 9 more figures
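
The Figure 3 caption says each refined depth map is fused into a global TSDF volume. As a rough illustration of what such fusion does, here is a minimal sketch of the standard weighted running-average TSDF update, reduced to voxels along a single camera ray. The function names, the 1D setting, and the uniform per-observation weight are assumptions made for illustration; the paper's actual volume layout, weighting, and pose handling are not described in this excerpt.

```python
def integrate_depth(tsdf, weights, voxel_z, depth, trunc, obs_weight=1.0):
    """Fuse one depth observation into TSDF voxels lying along a single ray."""
    for i, z in enumerate(voxel_z):
        sdf = depth - z  # positive in front of the surface, negative behind it
        if sdf < -trunc:
            continue  # far behind the observed surface: occluded, leave untouched
        d = min(1.0, sdf / trunc)  # truncate and normalise to [-1, 1]
        w_old = weights[i]
        # Weighted running average of the truncated signed distance.
        tsdf[i] = (w_old * tsdf[i] + obs_weight * d) / (w_old + obs_weight)
        weights[i] = w_old + obs_weight


def zero_crossing(tsdf, weights, voxel_z):
    """Return the fused surface depth by linear interpolation, or None."""
    for i in range(len(voxel_z) - 1):
        observed = weights[i] > 0 and weights[i + 1] > 0
        if observed and tsdf[i] > 0 >= tsdf[i + 1]:
            t = tsdf[i] / (tsdf[i] - tsdf[i + 1])
            return voxel_z[i] + t * (voxel_z[i + 1] - voxel_z[i])
    return None


# Two slightly different depth readings of the same surface point.
voxel_z = [i / 10 for i in range(11)]  # voxel centres from 0.0 to 1.0
tsdf = [0.0] * len(voxel_z)
weights = [0.0] * len(voxel_z)
for observed_depth in (0.48, 0.52):
    integrate_depth(tsdf, weights, voxel_z, observed_depth, trunc=0.2)
surface = zero_crossing(tsdf, weights, voxel_z)
```

Because each voxel stores only a running average and a weight, new views can be added one at a time without revisiting earlier frames, which is what makes this kind of fusion a natural fit for an incremental pipeline.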