Latent Normalizing Flows for Discrete Sequences

Zachary M. Ziegler; Alexander M. Rush

TL;DR

This work addresses the difficulty of applying normalizing flows to discrete sequences by placing a highly multimodal flow-based prior inside a VAE and emitting discrete observations from a simple, inputless decoder. It introduces three flow architectures (AF/AF, AF/SCF, IAF/SCF) and extends them with Non-Linear Squared (NLSq) flows to capture the multimodal dynamics that discrete data requires. Experiments on character-level language modeling and polyphonic music modeling show that the latent-flow models come close to autoregressive baselines, and that the non-autoregressive variant generates faster at some cost in accuracy. The results suggest that continuous latent representations are a viable way to model discrete sequences and point to future work on conditional generation and GAN-integrated frameworks.

Abstract

Normalizing flows are a powerful class of generative models for continuous random variables, showing both strong model flexibility and the potential for non-autoregressive generation. These benefits are also desired when modeling discrete random variables such as text, but directly applying normalizing flows to discrete sequences poses significant additional challenges. We propose a VAE-based generative model which jointly learns a normalizing flow-based distribution in the latent space and a stochastic mapping to an observed discrete space. In this setting, we find that it is crucial for the flow-based distribution to be highly multimodal. To capture this property, we propose several normalizing flow architectures to maximize model flexibility. Experiments consider common discrete sequence tasks of character-level language modeling and polyphonic music generation. Our results indicate that an autoregressive flow-based model can match the performance of a comparable autoregressive baseline, and a non-autoregressive flow-based model can improve generation speed with a penalty to performance.
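To make the idea of a flow-based latent distribution concrete, the following is a minimal sketch of a one-dimensional affine autoregressive flow over a latent sequence, evaluated with the change-of-variables formula. It is an illustration under stated assumptions, not the paper's architecture: the conditioner `shift_and_log_scale`, its parameters `w` and `b`, and the function names are hypothetical, and the multimodal NLSq layers are not reproduced here.

```python
import numpy as np

def shift_and_log_scale(prev, w=0.5, b=0.1):
    """Toy conditioner (hypothetical): maps the previous latent to (mu, log_sigma)."""
    mu = w * prev
    log_sigma = b * np.tanh(prev)
    return mu, log_sigma

def af_sample(eps):
    """Map noise eps to latents z. Each step needs the previous z, so this is sequential."""
    z = np.zeros_like(eps)
    prev = 0.0
    for t in range(len(eps)):
        mu, log_sigma = shift_and_log_scale(prev)
        z[t] = mu + np.exp(log_sigma) * eps[t]
        prev = z[t]
    return z

def af_log_prob(z):
    """Invert the flow and return (log p(z), recovered eps).

    Given z, every step's conditioner input is known, so all steps are
    computed at once. log p(z) = log N(eps; 0, 1) - log_sigma, summed over time.
    """
    prev = np.concatenate([[0.0], z[:-1]])
    mu, log_sigma = shift_and_log_scale(prev)
    eps = (z - mu) * np.exp(-log_sigma)
    log_base = -0.5 * (eps ** 2 + np.log(2 * np.pi))
    return np.sum(log_base - log_sigma), eps

rng = np.random.default_rng(0)
eps = rng.standard_normal(8)
z = af_sample(eps)
log_p, eps_rec = af_log_prob(z)
```

In this toy setup, density evaluation is parallel across time steps while sampling is sequential, which is the asymmetry that motivates looking at non-autoregressive alternatives for faster generation.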
