Hybrid Models with Deep and Invertible Features

Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Dilan Gorur, Balaji Lakshminarayanan

TL;DR

This work introduces the DIGLM, a neural hybrid that couples a deep invertible transform with a generalized linear model to jointly model p(x) and p(y|x) in a single forward pass. By leveraging invertible flows (RNVP/Glow) as feature extractors and sharing parameters with the predictive GLM, the model provides exact densities, enabling out-of-distribution detection and semi-supervised learning. Empirical results on regression and classification tasks show competitive predictive performance and strong uncertainty estimates, with notable improvements in negative log-likelihood when leveraging the generative component. The approach also bridges to Gaussian processes through a kernel defined by the latent representations, offering a practical probabilistic deep learning framework for downstream tasks that require density-based reasoning.

Abstract

We propose a neural hybrid model consisting of a linear model defined on a set of features computed by a deep, invertible transformation (i.e. a normalizing flow). An attractive property of our model is that both p(features), the density of the features, and p(targets | features), the predictive distribution, can be computed exactly in a single feed-forward pass. We show that our hybrid model, despite the invertibility constraints, achieves similar accuracy to purely predictive models. Moreover the generative component remains a good model of the input features despite the hybrid optimization objective. This offers additional capabilities such as detection of out-of-distribution inputs and enabling semi-supervised learning. The availability of the exact joint density p(targets, features) also allows us to compute many quantities readily, making our hybrid model a useful building block for downstream applications of probabilistic deep learning.
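The single-pass computation described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes one RNVP-style affine coupling layer with linear scale and shift maps, a standard normal base density, and a Gaussian GLM with identity link. The names `affine_coupling_forward`, `diglm_forward`, `w_s`, `w_t`, `beta`, `noise_std`, and the weight `lam` are hypothetical, chosen only for this example.

```python
import numpy as np


def affine_coupling_forward(x, w_s, w_t):
    """One affine coupling layer: returns features z and log|det dz/dx|."""
    d = x.shape[1] // 2
    x_a, x_b = x[:, :d], x[:, d:]
    log_scale = np.tanh(x_a @ w_s)  # bounded log-scale (an assumption for stability)
    shift = x_a @ w_t
    z_b = x_b * np.exp(log_scale) + shift
    z = np.concatenate([x_a, z_b], axis=1)
    # The Jacobian is triangular, so its log-determinant is the sum of log-scales.
    return z, log_scale.sum(axis=1)


def affine_coupling_inverse(z, w_s, w_t):
    """Exact inverse of affine_coupling_forward."""
    d = z.shape[1] // 2
    z_a, z_b = z[:, :d], z[:, d:]
    log_scale = np.tanh(z_a @ w_s)
    shift = z_a @ w_t
    x_b = (z_b - shift) * np.exp(-log_scale)
    return np.concatenate([z_a, x_b], axis=1)


def diglm_forward(x, y, w_s, w_t, beta, noise_std=1.0):
    """Return per-example log p(x) and log p(y | x) from one forward pass."""
    z, log_det = affine_coupling_forward(x, w_s, w_t)
    # Change of variables: log p(x) = log N(z; 0, I) + log|det dz/dx|.
    log_pz = -0.5 * (z ** 2).sum(axis=1) - 0.5 * z.shape[1] * np.log(2 * np.pi)
    log_px = log_pz + log_det
    # Gaussian GLM on the invertible features: y ~ N(z @ beta, noise_std^2).
    mean = z @ beta
    log_py_given_x = (
        -0.5 * ((y - mean) / noise_std) ** 2
        - np.log(noise_std)
        - 0.5 * np.log(2 * np.pi)
    )
    return log_px, log_py_given_x


# Toy usage with random parameters (illustrative values only).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
y = rng.normal(size=8)
w_s = 0.1 * rng.normal(size=(2, 2))
w_t = 0.1 * rng.normal(size=(2, 2))
beta = rng.normal(size=4)

log_px, log_py_given_x = diglm_forward(x, y, w_s, w_t, beta)
lam = 0.5  # hypothetical weight on the generative term
hybrid_objective = (log_py_given_x + lam * log_px).mean()
```

Because `log_px` comes out of the same pass as the prediction, thresholding it is one simple way to flag out-of-distribution inputs, and unlabeled examples can still contribute to training through the `log_px` term alone.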

Paper Structure

This paper contains 25 sections, 17 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Model Architecture. The diagram shows the DIGLM's computational pipeline, which consists of a GLM stacked on top of an invertible generative model. The model parameters are ${\bm{\theta}}=\{{\bm{\phi}}, {\bm{\beta}}\}$, of which ${\bm{\phi}}$ is shared between the generative and predictive models, and ${\bm{\beta}}$ parametrizes the GLM in the predictive model.
  • Figure 2: 1-dimensional Regression Task. We construct a toy regression task by sampling $x$-observations from a Gaussian mixture model and then assigning responses $y = x^{3} + \epsilon$ with $\epsilon$ being heteroscedastic noise. Subfigure (a) shows the function learned by a Gaussian process and (b) shows the function learned by the Bayesian DIGLM. Subfigure (c) shows the $p(x)$ density learned by the same DIGLM (black line) and compares it to a KDE (gray shading).
  • Figure 3: Histogram of $\log p({\bm{x}})$ on the flight delay data set. The leftward shift in the test set (red) shows that our DIGLM model is able to detect covariate shift.
  • Figure 4: Histogram of $\log p({\bm{x}})$ on classification experiments on MNIST. The hybrid model is able to successfully distinguish between in-distribution (MNIST) and OOD (NotMNIST) test inputs. Subfigure (c) shows latent space interpolations.
  • Figure 5: Subfigure (a) shows the histogram of $\log p({\bm{x}})$ on SVHN experiments. The hybrid model is able to successfully distinguish between in-distribution (SVHN) and OOD (CIFAR-10) test inputs. Subfigure (b) shows latent space interpolations. Subfigure (c) shows confidence versus accuracy plots and shows that the hybrid model is able to successfully reject OOD inputs.
  • ...and 1 more figure