Predicting Perceived Gloss: Do Weak Labels Suffice?

Julia Guerrero-Viu, J. Daniel Subias, Ana Serrano, Katherine R. Storrs, Roland W. Fleming, Belen Masia, Diego Gutierrez

TL;DR

This work shows how a much smaller set of human annotations (“strong labels”) can be effectively augmented with automatically derived “weak labels” in the context of learning a low-dimensional image-computable gloss metric.

Abstract

Estimating perceptual attributes of materials directly from images is a challenging task due to their complex, not fully understood interactions with external factors, such as geometry and lighting. Supervised deep learning models have recently been shown to outperform traditional approaches, but rely on large datasets of human-annotated images for accurate perception predictions. Obtaining reliable annotations is a costly endeavor, aggravated by the limited ability of these models to generalise to different aspects of appearance. In this work, we show how a much smaller set of human annotations ("strong labels") can be effectively augmented with automatically derived "weak labels" in the context of learning a low-dimensional image-computable gloss metric. We evaluate three alternative weak labels for predicting human gloss perception from limited annotated data. Incorporating weak labels enhances our gloss prediction beyond the current state of the art. Moreover, it enables a substantial reduction in human annotation costs without sacrificing accuracy, whether working with rendered images or real photographs.
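The idea of augmenting a few strong labels with many weak ones can be illustrated with a small sketch. This is not the authors' training objective, which the abstract does not specify; it only assumes a weighted squared-error loss in which each image contributes its human rating when one exists and otherwise a down-weighted automatic rating. The names `Sample`, `mixed_label_loss`, and `weak_weight` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Sample:
    """One training image (hypothetical structure, for illustration only)."""
    prediction: float        # model output, a gloss rating in [1, 7]
    strong: Optional[float]  # human rating, or None if the image is unannotated
    weak: Optional[float]    # automatically derived rating, or None


def mixed_label_loss(samples: Sequence[Sample], weak_weight: float = 0.3) -> float:
    """Weighted mean squared error over strong and weak labels.

    Assumption: strong labels get weight 1.0 and weak labels get
    `weak_weight`; the paper may combine the two differently.
    """
    total = 0.0
    weight_sum = 0.0
    for s in samples:
        if s.strong is not None:
            target, weight = s.strong, 1.0      # prefer the human rating
        elif s.weak is not None:
            target, weight = s.weak, weak_weight  # fall back to the weak label
        else:
            continue                             # no supervision for this image
        total += weight * (s.prediction - target) ** 2
        weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("no labelled samples in batch")
    return total / weight_sum


batch = [
    Sample(prediction=4.0, strong=5.0, weak=None),  # human-annotated image
    Sample(prediction=3.0, strong=None, weak=1.0),  # weakly labelled image
]
loss = mixed_label_loss(batch, weak_weight=0.5)
print(loss)  # 2.0
```

Lowering `weak_weight` makes the noisier automatic labels count for less against the human ratings; setting it to zero recovers purely supervised training on the strong labels alone.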

Paper Structure

This paper contains 20 sections, 2 equations, 9 figures, and 6 tables.

Figures (9)

  • Figure 1: Example images from our training dataset. We extend the Serrano dataset [serrano2021effect] with a set of new images of varied geometries, illuminations and analytical materials using Disney's Principled BSDF.
  • Figure 2: Representative images of our controlled test dataset. Top row shows our five baseline measured materials under the baseline geometry frog and baseline illumination uffizi. Rows A-D show individual variations for one example material: A) different rotations, B) different geometry complexity by increasing bumpiness of the surface, C) different illuminations, and D) different levels of the specular parameter for the analytical fitting of the material.
  • Figure 3: Weak labels automatically computed for example images in our training dataset: based on the BSDF model (blue), image statistics (green) and industry metrics (orange). We show samples from the blob geometry under the cambridge illumination and grey albedo with different combinations of roughness and specular parameters from Disney's Principled BSDF. Although none of these labels are precise indicators of perceived gloss, they sufficiently correlate with gloss perception to be used as weak labels. All weak labels are in the range $[1, 7]$.
  • Figure 4: Qualitative results of our gloss predictors on example images from our controlled test dataset. The numbers in the insets indicate, for every input image, the ground-truth judgement (GT) and the predictions from our models (from top to bottom): supervised model trained only with strong labels (S.100% only, second line), weakly supervised model with strong labels combined with our BSDF weak labels (S.100% + BSDF, third line) and weakly supervised model with a subset of the strong labels combined with our BSDF weak labels (S.20% + BSDF, fourth line). All gloss ratings are in the range $[1, 7]$.
  • Figure 5: Qualitative results of our weakly supervised gloss predictors when varying one confounding factor at a time on our test dataset. We show: variation across different rotations for the frog geometry with uffizi illumination and aluminium material (top), and variation across increasing specularity for the bumpy_sphere geometry with uffizi illumination and the Ward-Duer BRDF fitting of specular_yellow_phenolic material (bottom). The numbers in the insets indicate the ground-truth judgements (GT) and the gloss predictions from our weakly supervised models trained with BSDF weak labels jointly with 100% of the Serrano dataset (S.100%+BSDF, second line) and with 20% of the Serrano dataset (S.20%+BSDF, third line). All gloss ratings are in the range $[1, 7]$.
  • ...and 4 more figures