C-RNN-GAN: Continuous recurrent neural networks with adversarial training
Olof Mogren
TL;DR
The paper introduces C-RNN-GAN, a GAN for continuous sequential data in which a recurrent generator outputs real-valued tone events and a bidirectional LSTM discriminator judges whole sequences, so that the model learns the joint distribution of musical sequences. It demonstrates that adversarial training increases variability and tonal spread in generated classical music, and that allowing multiple tones per step enhances polyphony (notably in the 3-tone variant). Generated samples move closer to real music on several statistics than those of a maximum-likelihood baseline, but they still fall short of real music in perceived realism. The work provides a foundation for applying adversarial training to continuous sequential data and highlights the importance of stabilization techniques and multi-tone outputs in improving musicality.
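As a rough illustration of the adversarial training referred to above, the sketch below computes the standard minimax GAN losses from discriminator outputs. It assumes the discriminator returns one probability per sequence that the sequence is real; the function names, the `eps` stabilizer, and the batch layout are hypothetical choices for illustration, not taken from the paper.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Standard GAN discriminator loss, averaged over a batch.

    d_real: discriminator probabilities for real sequences.
    d_fake: discriminator probabilities for generated sequences.
    eps:    small constant (an assumption here) to avoid log(0).
    """
    if len(d_real) != len(d_fake):
        raise ValueError("batches must have the same size")
    total = 0.0
    for p_real, p_fake in zip(d_real, d_fake):
        # Penalize low confidence on real data and high confidence on fakes.
        total += -math.log(p_real + eps) - math.log(1.0 - p_fake + eps)
    return total / len(d_real)


def generator_loss(d_fake, eps=1e-12):
    """Minimax generator loss: lower when the discriminator is fooled."""
    return sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
```

When the discriminator cannot tell real from generated sequences and outputs 0.5 everywhere, the discriminator loss equals 2 ln 2, and the generator loss decreases as the discriminator assigns higher probabilities to generated sequences.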
Abstract
Generative adversarial networks have been proposed as a way of efficiently training deep generative neural networks. We propose a generative adversarial model that works on continuous sequential data, and apply it by training it on a collection of classical music. We conclude that it generates music that sounds better and better as the model is trained, report statistics on generated music, and let the reader judge the quality by downloading the generated songs.
