Learning Likelihoods with Conditional Normalizing Flows
Christina Winkler, Daniel Worrall, Emiel Hoogeboom, Max Welling
TL;DR
Conditional normalizing flows (CNFs) are proposed to model p(y|x) for high-dimensional outputs, capturing correlations between output dimensions and multimodality without hand-designed per-pixel losses. The method conditions both the base density and the invertible mapping on the input x, which gives efficient sampling and exact likelihoods, and it is trained with a likelihood-based objective. Experiments on super-resolution and retinal vessel segmentation show competitive performance in terms of likelihood and conventional metrics, with qualitative evidence of crisper details and calibrated probability estimates. Overall, CNFs offer a flexible, probabilistic framework for structured prediction that avoids the mode collapse and training instability typical of some alternative approaches.
Abstract
Normalizing Flows (NFs) can model complicated distributions p(y) with strong inter-dimensional correlations and high multimodality by transforming a simple base density p(z) through an invertible neural network under the change of variables formula. Such behavior is desirable in multivariate structured prediction tasks, where handcrafted per-pixel loss-based methods inadequately capture strong correlations between output dimensions. We present a study of conditional normalizing flows (CNFs), a class of NFs where the mapping from base density to output space is conditioned on an input x, to model conditional densities p(y|x). CNFs are efficient in sampling and inference, can be trained with a likelihood-based objective, and, being generative flows, do not suffer from mode collapse or training instabilities. We provide an effective method to train continuous CNFs for binary problems and, in particular, apply these CNFs to super-resolution and vessel segmentation tasks, demonstrating competitive performance on standard benchmark datasets in terms of likelihood and conventional metrics.
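To make the change of variables idea concrete, here is a minimal sketch of a one-dimensional conditional flow in plain Python. It is a toy illustration, not the architecture used in the paper: the `conditioner` function, its linear parameters, and the names `flow` and `prior` are hypothetical stand-ins for the conditioning networks. It assumes a single affine map y = shift(x) + exp(log_scale(x)) * z and a Gaussian base density whose mean and standard deviation also depend on x, so that log p(y|x) = log p(z|x) + log |dz/dy|.

```python
import math
import random


def conditioner(x, w, b):
    """Hypothetical stand-in for a conditioning network: x -> (shift, log_scale)."""
    shift = w[0] * x + b[0]
    log_scale = math.tanh(w[1] * x + b[1])  # bounded for numerical stability
    return shift, log_scale


def log_prob(y, x, flow, prior):
    """Exact log p(y|x) via the change of variables formula."""
    # Invert the conditional affine map: z = (y - shift(x)) / exp(log_scale(x)).
    shift, log_scale = conditioner(x, *flow)
    z = (y - shift) * math.exp(-log_scale)
    log_det = -log_scale  # log |dz/dy| for a scalar affine map

    # Conditional base density p(z|x): Gaussian with x-dependent mean and std.
    mu, log_sigma = conditioner(x, *prior)
    log_base = (
        -0.5 * ((z - mu) / math.exp(log_sigma)) ** 2
        - log_sigma
        - 0.5 * math.log(2.0 * math.pi)
    )
    return log_base + log_det


def sample(x, flow, prior, rng):
    """Draw y ~ p(y|x): sample z from the conditional base, then map it forward."""
    mu, log_sigma = conditioner(x, *prior)
    z = rng.gauss(mu, math.exp(log_sigma))
    shift, log_scale = conditioner(x, *flow)
    return shift + math.exp(log_scale) * z


# Illustrative parameters only; each entry is (weights, biases) for one conditioner.
flow = ((0.5, -0.3), (0.1, 0.2))
prior = ((1.0, 0.4), (0.0, -0.1))

rng = random.Random(0)
y = sample(0.7, flow, prior, rng)
nll = -log_prob(y, 0.7, flow, prior)  # quantity minimized in likelihood training
```

In practice the scalar affine map would be replaced by a stack of invertible layers acting on image-shaped tensors, and the negative log-likelihood would be averaged over a dataset of (x, y) pairs. The structure of the computation (invert, add the log-determinant, evaluate the conditional base density) stays the same.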
