Bayesian Inverse Graphics for Few-Shot Concept Learning

Octavio Arriaga, Jichen Guo, Rebecca Adam, Sebastian Houben, Frank Kirchner

TL;DR

This work proposes a generative inverse graphics model of primitive shapes that infers posterior distributions over physically consistent parameters from one or several images, and shows how this representation can be used for downstream tasks such as few-shot classification and pose estimation.

Abstract

Humans excel at generalizing new concepts from a single example. In contrast, current computer vision models typically require large numbers of training samples to achieve comparable accuracy. In this work we present a Bayesian model of perception that learns, from minimal data, a prototypical probabilistic program of an object. Specifically, we propose a generative inverse graphics model of primitive shapes that infers posterior distributions over physically consistent parameters from one or several images. We show how this representation can be used for downstream tasks such as few-shot classification and pose estimation. Our model outperforms existing few-shot neural-only classification algorithms and generalizes across varying lighting conditions, backgrounds, and out-of-distribution shapes. By design, our model is uncertainty-aware: it uses our new differentiable renderer to optimize global scene parameters through gradient descent, samples posterior distributions over object parameters with Markov Chain Monte Carlo (MCMC), and uses a neural-based likelihood function.
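
To make the idea of sampling a posterior over object parameters from an image more concrete, here is a minimal sketch under strong simplifying assumptions. It is not the paper's model: the "renderer" is a hypothetical 1-D blob generator, the likelihood is a plain Gaussian pixel likelihood standing in for the neural likelihood, and the sampler is a basic random-walk Metropolis-Hastings. All function names, priors, and constants are illustrative assumptions.

```python
import math
import random

# Pixel centers of a 32-pixel, 1-D toy "image" (an assumption for illustration).
GRID = [i / 31 for i in range(32)]


def render(center, scale):
    """Toy stand-in for a renderer: a 1-D blob with a given center and scale."""
    return [math.exp(-((x - center) ** 2) / (2 * scale ** 2)) for x in GRID]


def log_prior(center, scale):
    """Uniform prior over a bounded, physically plausible parameter range."""
    if 0.0 <= center <= 1.0 and 0.05 <= scale <= 0.5:
        return 0.0
    return -math.inf


def log_likelihood(image, center, scale, noise=0.1):
    """Gaussian pixel likelihood (assumed here; the paper uses a neural one)."""
    rendered = render(center, scale)
    squared_error = sum((a - b) ** 2 for a, b in zip(image, rendered))
    return -squared_error / (2 * noise ** 2)


def metropolis_hastings(image, steps=4000, step_size=0.01, seed=0):
    """Random-walk Metropolis-Hastings over (center, scale)."""
    rng = random.Random(seed)
    state = (0.5, 0.3)  # arbitrary starting point inside the prior support
    log_post = log_prior(*state) + log_likelihood(image, *state)
    samples = []
    for _ in range(steps):
        proposal = (
            state[0] + rng.gauss(0, step_size),
            state[1] + rng.gauss(0, step_size),
        )
        proposal_log_post = log_prior(*proposal)
        if proposal_log_post > -math.inf:
            proposal_log_post += log_likelihood(image, *proposal)
        # Accept with probability min(1, posterior ratio); 1 - random() is in (0, 1].
        if math.log(1.0 - rng.random()) < proposal_log_post - log_post:
            state, log_post = proposal, proposal_log_post
        samples.append(state)
    return samples[steps // 2:]  # discard the first half as burn-in


# Usage: simulate one noisy observation, then summarize the posterior samples.
data_rng = random.Random(1)
observed = [p + data_rng.gauss(0, 0.1) for p in render(0.6, 0.15)]
samples = metropolis_hastings(observed)
mean_center = sum(s[0] for s in samples) / len(samples)
mean_scale = sum(s[1] for s in samples) / len(samples)
print(f"posterior mean center={mean_center:.3f}, scale={mean_scale:.3f}")
```

The retained samples approximate a posterior rather than a single point estimate, which is what makes this style of inference uncertainty-aware: the spread of the samples indicates how well one image constrains each parameter.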

Paper Structure (20 sections, 14 equations, 15 figures, 4 tables)

Figures (15)

  • Figure 1: Neuro-symbolic inverse graphics model for few-shot learning
  • Figure 2: Samples of our few-shot training datasets
  • Figure 3: Model-based scene optimization
  • Figure 4: Model-based scene optimization. The optimized-materials subfigure displays, in each sphere, one material extracted by our optimization process; each material behaves differently under the same lighting conditions.
  • Figure 5: Inverse graphics model and prior predictive samples for the FS-CLVR, FS-CLVR-dark, and FS-CLVR-room datasets
  • ...and 10 more figures