Piecewise Normalizing Flows
Harry Bevins, Will Handley, Thomas Gessey-Jones
TL;DR
The paper tackles the difficulty of modeling multi-modal distributions with normalizing flows by introducing piecewise normalizing flows (PNFs), which cluster the target distribution and train a separate normalizing flow with a Gaussian base on each cluster. This segmentation aligns each piece's topology with that of the base distribution, allows the pieces to be trained in parallel, and often improves emulation accuracy relative to single-flow and resampled-base baselines. Benchmark results on toy multi-modal distributions show PNFs achieving lower KL divergences than competing methods, though gains on some real-world datasets are data-dependent. The approach offers a practical, scalable way to capture complex densities, and the paper also discusses how the choice of clustering affects the result and where the method is limited.
Abstract
Normalizing flows are an established approach for modelling complex probability densities through invertible transformations from a base distribution. However, the accuracy with which the target distribution can be captured by the normalizing flow is strongly influenced by the topology of the base distribution. A mismatch between the topology of the target and the base can result in poor performance, as is typically the case for multi-modal problems. A number of different works have attempted to modify the topology of the base distribution to better match the target, either through the use of Gaussian Mixture Models (Izmailov et al., 2020; Ardizzone et al., 2020; Hagemann & Neumayer, 2021) or learned accept/reject sampling (Stimper et al., 2022). We introduce piecewise normalizing flows, which divide the target distribution into clusters with topologies that better match the standard normal base distribution, and train a series of flows to model complex multi-modal targets. We demonstrate the performance of the piecewise flows using some standard benchmarks and compare the accuracy of the flows to the approach taken in Stimper et al. (2022) for modelling multi-modal distributions. We find that our approach consistently outperforms the approach in Stimper et al. (2022) with a higher emulation accuracy on the standard benchmarks.
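The idea of dividing the target into clusters and fitting one flow per cluster can be illustrated with a deliberately small sketch. It is not the authors' implementation, and it makes several assumptions that the text above does not state: the data are one-dimensional, the clustering is a plain k-means written by hand, each "flow" is a single affine map x = mu + sigma * z from a standard normal base (standing in for a trained neural flow), and the pieces are combined with weights equal to the fraction of samples in each cluster. All class and function names are hypothetical.

```python
import math
import random


def kmeans_1d(xs, k, iters=50):
    """Plain Lloyd's k-means on scalars; returns one cluster label per point."""
    lo, hi = min(xs), max(xs)
    # Deterministic initial centres spread evenly across the data range.
    centers = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    labels = [0] * len(xs)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: abs(x - centers[j])) for x in xs]
        for j in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centers[j] = sum(members) / len(members)
    return labels


class AffineFlow:
    """Toy invertible map x = mu + sigma * z with z ~ N(0, 1).

    Assumption: a stand-in for a trained normalizing flow.
    """

    def __init__(self, xs):
        self.mu = sum(xs) / len(xs)
        var = sum((x - self.mu) ** 2 for x in xs) / len(xs)
        self.sigma = max(math.sqrt(var), 1e-6)

    def log_prob(self, x):
        # Change of variables: log p(x) = log N(z; 0, 1) - log|dx/dz|.
        z = (x - self.mu) / self.sigma
        return -0.5 * z * z - 0.5 * math.log(2 * math.pi) - math.log(self.sigma)

    def sample(self, rng):
        return self.mu + self.sigma * rng.gauss(0.0, 1.0)


class PiecewiseFlow:
    """One flow per cluster, combined with cluster-fraction weights (assumed)."""

    def __init__(self, xs, k):
        labels = kmeans_1d(xs, k)
        groups = [[x for x, lab in zip(xs, labels) if lab == j] for j in range(k)]
        groups = [g for g in groups if g]  # drop empty clusters
        self.flows = [AffineFlow(g) for g in groups]
        self.weights = [len(g) / len(xs) for g in groups]

    def log_prob(self, x):
        # Log of the weighted sum of piece densities, via log-sum-exp.
        terms = [math.log(w) + f.log_prob(x)
                 for w, f in zip(self.weights, self.flows)]
        m = max(terms)
        return m + math.log(sum(math.exp(t - m) for t in terms))

    def sample(self, rng):
        # Pick a piece in proportion to its weight, then sample from its flow.
        flow = rng.choices(self.flows, weights=self.weights)[0]
        return flow.sample(rng)


# Bimodal toy target: two well-separated Gaussian modes.
rng = random.Random(0)
data = ([rng.gauss(-5.0, 0.5) for _ in range(500)]
        + [rng.gauss(5.0, 0.5) for _ in range(500)])

single = AffineFlow(data)            # one flow for the whole target
piecewise = PiecewiseFlow(data, k=2)  # one flow per cluster

avg_single = sum(single.log_prob(x) for x in data) / len(data)
avg_piecewise = sum(piecewise.log_prob(x) for x in data) / len(data)
print("mean log-likelihood, single flow:   ", avg_single)
print("mean log-likelihood, piecewise flow:", avg_piecewise)
```

On this toy target the single affine flow has to stretch one Gaussian across both modes, while each piece of the piecewise model only has to match a unimodal cluster, so the piecewise model should assign the data a higher mean log-likelihood. Because the pieces are fitted independently, the per-cluster fits in `PiecewiseFlow.__init__` could in principle be run in parallel.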
