Kernelised Normalising Flows

Eshant English; Matthias Kirchler; Christoph Lippert

TL;DR

This work presents Ferumal flow, a novel kernelised normalising flow paradigm that integrates kernels into the framework, demonstrating that a kernelised flow can yield competitive or superior results compared to neural network-based flows whilst maintaining parameter efficiency.

Abstract

Normalising Flows are non-parametric statistical models characterised by their dual capabilities of density estimation and generation. This duality requires an inherently invertible architecture. However, the requirement of invertibility imposes constraints on their expressiveness, necessitating a large number of parameters and innovative architectural designs to achieve good results. Whilst flow-based models predominantly rely on neural-network-based transformations for expressive designs, alternative transformation methods have received limited attention. In this work, we present Ferumal flow, a novel kernelised normalising flow paradigm that integrates kernels into the framework. Our results demonstrate that a kernelised flow can yield competitive or superior results compared to neural network-based flows whilst maintaining parameter efficiency. Kernelised flows excel especially in the low-data regime, enabling flexible non-parametric density estimation in applications with sparse data availability.
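The link between invertibility and the two capabilities named above can be illustrated with a deliberately tiny example. The sketch below is not the Ferumal flow; it assumes a one-dimensional affine map with a standard normal base distribution, and the names `forward`, `inverse`, `log_density` and `sample` are hypothetical. It only shows that a single invertible map gives both a density (via the change-of-variables formula) and a sampler (via the inverse).

```python
import math
import random


def forward(x, scale, shift):
    """Map data x to latent z = (x - shift) / scale and return log|dz/dx|."""
    z = (x - shift) / scale
    log_det = -math.log(abs(scale))
    return z, log_det


def inverse(z, scale, shift):
    """Map a latent value z back to data space."""
    return z * scale + shift


def log_density(x, scale, shift):
    """Density estimation: log p_X(x) = log N(z; 0, 1) + log|dz/dx|."""
    z, log_det = forward(x, scale, shift)
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
    return log_base + log_det


def sample(scale, shift, rng):
    """Generation: draw z from the base distribution and invert the flow."""
    return inverse(rng.gauss(0.0, 1.0), scale, shift)


if __name__ == "__main__":
    rng = random.Random(0)
    x = sample(scale=2.0, shift=1.0, rng=rng)
    print(x, log_density(x, scale=2.0, shift=1.0))
```

Richer flows replace the affine map with a stack of learnable invertible layers, but the density and sampling routines keep this shape.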

Paper Structure (35 sections, 1 theorem, 17 equations, 7 figures, 12 tables)

Introduction
Background
Maximum likelihood optimisation with normalising flows
Coupling layers
Kernel machines
Ferumal flows: kernelisation of flow-based architectures
Kernelised coupling layers
Efficient learning with auxiliary points
Related works
Experiments
Implementation details
Training details
2D toy datasets
Real-world datasets
Initial performance
...and 20 more sections

Key Result

Proposition 3.1

Given the objective $L$ in equation kernel-obj, for any $V'=\left[V_{1}', \ldots, V_L'\right] \in \mathcal{H}^{L(D-d)}$ there also exists a $V$ with $L(V) = L\left(V'\right)$ such that each $V_\ell$ is a finite kernel expansion over the points $u_{\ell,i}^1$, for some coefficients $A_{\ell,i} = \left[A_{\ell, i}^s, A_{\ell, i}^t\right] \in \mathbb{R}^{2(D-d)}$. Here, $u_{\ell,i}^1 = \pi_\ell(C_{\ell-1} \circ \dots \circ C_{1}(x_i))_{1:d}$, i.e., the first part of the permuted input ...
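To make the objects in Proposition 3.1 more concrete, here is a minimal sketch of one kernelised coupling layer. It rests on several assumptions that the text above does not confirm: a RealNVP-style affine coupling, a Gaussian kernel, and scale and translation functions written as weighted sums of kernel evaluations at a set of points (called `centres` here, standing in for the $u_{\ell,i}^1$) with coefficient lists `A_s` and `A_t` (standing in for $A_{\ell,i}^s$ and $A_{\ell,i}^t$). All function names are hypothetical.

```python
import math


def gaussian_kernel(a, b, lengthscale=1.0):
    """Gaussian (RBF) kernel between two equal-length lists of floats."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))


def kernel_expansion(u1, centres, coeffs):
    """Evaluate sum_i k(u1, centres[i]) * coeffs[i] as a list of floats."""
    out = [0.0] * len(coeffs[0])
    for centre, coeff in zip(centres, coeffs):
        k = gaussian_kernel(u1, centre)
        for j, c in enumerate(coeff):
            out[j] += k * c
    return out


def coupling_forward(u, d, centres, A_s, A_t):
    """Affine coupling: keep u[:d], transform u[d:] conditioned on u[:d]."""
    u1, u2 = u[:d], u[d:]
    s = kernel_expansion(u1, centres, A_s)  # log-scales, assumed form
    t = kernel_expansion(u1, centres, A_t)  # translations, assumed form
    v2 = [x * math.exp(si) + ti for x, si, ti in zip(u2, s, t)]
    return u1 + v2, sum(s)  # output and log|det Jacobian|


def coupling_inverse(v, d, centres, A_s, A_t):
    """Exact inverse: the conditioning part v[:d] is unchanged by the layer."""
    v1, v2 = v[:d], v[d:]
    s = kernel_expansion(v1, centres, A_s)
    t = kernel_expansion(v1, centres, A_t)
    u2 = [(y - ti) * math.exp(-si) for y, si, ti in zip(v2, s, t)]
    return v1 + u2


if __name__ == "__main__":
    centres = [[0.0], [1.0]]
    A_s = [[0.1, -0.2], [0.3, 0.05]]
    A_t = [[0.5, 0.0], [-0.4, 0.2]]
    v, log_det = coupling_forward([0.3, -1.2, 0.7], 1, centres, A_s, A_t)
    print(v, log_det, coupling_inverse(v, 1, centres, A_s, A_t))
```

In this sketch the learnable parameters are the coefficient lists only, so the parameter count grows with the number of centres rather than with the width of a neural network.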

Figures (7)

Figure 1: Histogram of 2D toy datasets. Left: True distribution. Middle: NN-based. Right: FF-kernelisation
Figure 2: Discontinuous distributions. Shown are training data (left column), flow density (centre column), and histogram of flow samples (right column)
Figure 3: Negative log-likelihood (loss in nats) on training and test sets over the first 3,000 training iterations. Methods prepended with FF are our kernelised versions. All models were further trained until convergence.
Figure 4: Sampled images from MixerFlow
Figure 5: Original samples after binarisation
...and 2 more figures

Theorems & Definitions (2)

Proposition 3.1
proof
