Image Gradient-Aided Photometric Stereo Network

Kaixuan Wang, Lin Qi, Shiyu Qin, Kai Luo, Yakun Ju, Xia Li, Junyu Dong

TL;DR

The paper tackles the challenge of recovering accurate surface normals in high-frequency regions for photometric stereo under non-Lambertian conditions. It introduces IGA-PSN, a dual-branch network that fuses learned features from normalized images and their image gradients through an attention-based cross-information fusion, and employs an hourglass regressor with multi-level supervision. A gradient loss complements the cosine similarity objective to preserve sharp geometric details, yielding state-of-the-art performance on the DiLiGenT benchmark with a mean angular error of around $6.46^\circ$. The approach demonstrates strong performance in complex regions while maintaining texture fidelity, and the authors validate the design via ablations, showing the critical roles of gradient guidance, fusion, and multi-scale regression. These findings highlight gradient-guided PS as a promising direction for high-frequency detail preservation in surface normal estimation.
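
The pairing of a cosine similarity objective with a gradient loss can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the forward-difference gradient, the L1 penalty, and the weight `lam` are all assumptions, since the summary does not state the exact formulation.

```python
import numpy as np

def normalize(n, eps=1e-8):
    """Scale each normal vector (last axis) to unit length."""
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.maximum(norm, eps)

def cosine_loss(pred, gt):
    """Mean of 1 - <pred, gt> over all pixels; zero when the normals agree."""
    dot = np.sum(normalize(pred) * normalize(gt), axis=-1)
    return float(np.mean(1.0 - dot))

def spatial_gradient(n):
    """Forward differences of an (H, W, 3) normal map along y and x (assumed operator)."""
    dy = n[1:, :, :] - n[:-1, :, :]
    dx = n[:, 1:, :] - n[:, :-1, :]
    return dy, dx

def gradient_loss(pred, gt):
    """Mean absolute difference between the spatial gradients of two normal maps."""
    pdy, pdx = spatial_gradient(normalize(pred))
    gdy, gdx = spatial_gradient(normalize(gt))
    return float(np.mean(np.abs(pdy - gdy)) + np.mean(np.abs(pdx - gdx)))

def total_loss(pred, gt, lam=1.0):
    """Cosine term plus weighted gradient term; lam is a hypothetical weight."""
    return cosine_loss(pred, gt) + lam * gradient_loss(pred, gt)
```

The gradient term penalizes differences in how the normals change between neighboring pixels, which is where a purely per-pixel cosine objective tends to tolerate blurring.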

Abstract

Photometric stereo (PS) endeavors to ascertain surface normals using shading cues from photometric images under various illuminations. Recent deep learning-based PS methods often overlook the complexity of object surfaces. These neural network models, which exclusively rely on photometric images for training, often produce blurred results in high-frequency regions characterized by local discontinuities, such as wrinkles and edges with significant gradient changes. To address this, we propose the Image Gradient-Aided Photometric Stereo Network (IGA-PSN), a dual-branch framework extracting features from both photometric images and their gradients. Furthermore, we incorporate an hourglass regression network along with supervision to regularize normal regression. Experiments on the DiLiGenT benchmark show that IGA-PSN outperforms previous methods in surface normal estimation, achieving a mean angular error of $6.46^\circ$ while preserving textures and geometric shapes in complex regions.
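
As a rough illustration of the second branch's input, the sketch below computes image gradients for a stack of photometric images. It assumes grayscale images shaped `(N, H, W)` and uses NumPy's central differences. The operator the paper actually uses is not stated here, and the function names are hypothetical.

```python
import numpy as np

def image_gradients(images):
    """Return (d/dy, d/dx) for a stack of images shaped (N, H, W)."""
    # np.gradient uses central differences in the interior and
    # one-sided differences at the image borders.
    gy, gx = np.gradient(images.astype(np.float64), axis=(1, 2))
    return gy, gx

def gradient_magnitude(images):
    """Per-pixel gradient magnitude, large at edges and wrinkles."""
    gy, gx = image_gradients(images)
    return np.sqrt(gx ** 2 + gy ** 2)
```

On a horizontal intensity ramp the x-gradient is constant and the y-gradient vanishes, so high responses appear only where intensity changes sharply.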

Paper Structure

This paper contains 9 sections, 9 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: An example of the errors in complex-structured regions. We visualize the gradient map of the high-frequency region. The results compare our method with NA-PSN (Ju et al., 2022), PX-Net (Logothetis et al., 2021), and MF-PSN (Liu et al., 2022).
  • Figure 2: The overview of IGA-PSN, which is composed of a shared-weight feature extraction subnetwork, a cross-information fusion layer, and an hourglass normal regression subnetwork.
  • Figure 3: Network details of the feature/gradient extractor and the normal regression subnetwork. The numbers below the features indicate the number of feature channels.
  • Figure 4: Comparing the convergence of the models. The blue line represents our IGA-PSN, while the red line corresponds to a model using only a cosine similarity loss. Both models share the same architecture and are trained for 50 epochs. The model optimized with both loss functions converges to a lower error than the model using only the cosine similarity loss, which shows the effectiveness of adding the gradient error loss.
  • Figure 5: Quantitative results on the objects "Harvest", "Reading", and "Pot2" from the DiLiGenT benchmark (Shi et al., 2016) with 96 input images. Numbers below each normal map are the MAE in degrees. Compared with NA-PSN (Ju et al., 2022), PX-Net (Logothetis et al., 2021), MF-PSN (Liu et al., 2022), CNN-PS (Ikehata, 2018), and PS-FCN(N.) (Chen et al., 2020), our model achieves the best or second-best results.
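
The MAE in degrees reported under the normal maps can be computed along the following lines. This is a minimal sketch, assuming predicted and ground-truth normal maps shaped `(H, W, 3)` and an optional boolean object mask. The function name and the `mask` argument are illustrative, not taken from the paper.

```python
import numpy as np

def mean_angular_error_deg(pred, gt, mask=None, eps=1e-8):
    """Mean angle in degrees between predicted and ground-truth normals."""
    # Normalize both maps so the dot product equals the cosine of the angle.
    pred = pred / np.maximum(np.linalg.norm(pred, axis=-1, keepdims=True), eps)
    gt = gt / np.maximum(np.linalg.norm(gt, axis=-1, keepdims=True), eps)
    # Clip to guard arccos against rounding slightly outside [-1, 1].
    cos = np.clip(np.sum(pred * gt, axis=-1), -1.0, 1.0)
    angles = np.degrees(np.arccos(cos))
    if mask is not None:
        angles = angles[mask]  # keep only pixels that belong to the object
    return float(angles.mean())
```

Restricting the average to the object mask keeps background pixels, which have no meaningful normal, from diluting the error.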