GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution
Matt J. Kusner, José Miguel Hernández-Lobato
TL;DR
The paper addresses the difficulty of applying GANs to sequences of discrete elements by using the Gumbel-softmax distribution as a differentiable approximation to sampling from a multinomial distribution. It uses an LSTM-based generator and discriminator and trains them adversarially, relying on the differentiable sampling step to backpropagate through discrete outputs. In experiments on a context-free grammar task, the authors show that GANs with Gumbel-softmax outputs can produce realistic discrete sequences, and they demonstrate the effect of temperature annealing on training. They suggest future directions such as variational divergence minimization and density ratio estimation to further improve performance on discrete data.
Abstract
Generative Adversarial Networks (GANs) have limitations when the goal is to generate sequences of discrete elements. The reason is that samples from a distribution on discrete objects, such as the multinomial, are not differentiable with respect to the distribution parameters. This problem can be avoided by using the Gumbel-softmax distribution, which is a continuous approximation to a multinomial distribution parameterized in terms of the softmax function. In this work, we evaluate the performance of GANs based on recurrent neural networks with Gumbel-softmax output distributions in the task of generating sequences of discrete elements.
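
To make the relaxation mentioned in the abstract concrete, the following is a minimal NumPy sketch of Gumbel-softmax sampling. It is an illustration under stated assumptions, not the authors' implementation: the function name `gumbel_softmax_sample`, the example probabilities, and the two temperature values are all hypothetical choices made for this example. The sample is a softmax over logits perturbed with Gumbel(0, 1) noise and divided by a temperature, so it is a smooth function of the logits.

```python
import numpy as np


def gumbel_softmax_sample(logits, temperature, rng):
    """Draw relaxed one-hot samples from unnormalized log-probabilities.

    Hypothetical helper for illustration only. `logits` may have any leading
    batch dimensions; the last axis indexes the discrete categories.
    """
    # Gumbel(0, 1) noise by inverse transform: g = -log(-log(u)), u ~ Uniform(0, 1).
    # The lower bound keeps u away from 0 so the inner log stays finite.
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))

    # Softmax of the perturbed logits, scaled by the temperature.
    y = (logits + gumbel_noise) / temperature
    y = y - y.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    exp_y = np.exp(y)
    return exp_y / exp_y.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
probs = np.array([0.1, 0.6, 0.3])  # assumed example distribution
logits = np.log(probs)

# Higher temperatures give smoother vectors; lower temperatures push samples
# toward one-hot vectors, approaching a draw from the discrete distribution.
soft = gumbel_softmax_sample(logits, temperature=5.0, rng=rng)
sharp = gumbel_softmax_sample(logits, temperature=0.1, rng=rng)
```

Because every operation in the sketch is differentiable with respect to `logits`, the same computation written in an automatic differentiation framework would let gradients from a discriminator reach the generator parameters. Taking the argmax of each sample recovers a draw from the original discrete distribution, which is why lowering the temperature during training moves the relaxed samples toward discrete ones.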
