Learning and composing of classical music using restricted Boltzmann machines
Mutsumi Kobayashi, Hiroshi Watanabe
TL;DR
The paper investigates how a simple, transparent restricted Boltzmann machine (RBM) can learn to compose music from piano-roll representations, with the aim of shedding light on the model's internal representations rather than optimizing performance. Trained on piano-rolls of Bach pieces and evaluated on reconstruction, energy, and generation, the model produces musically structured pieces but encodes information in a way that does not align directly with conventional music theory. Targeted analyses, including t-SNE of hidden units and transposition tests, show how absolute pitch information can dominate the learned representations and illustrate the interpretability limits of generative models. The results contribute to understanding the trade-off between simplicity, interpretability, and creative capability in AI-driven music generation, and the paper proposes avenues for future work: greater translational invariance and training on broader corpora.
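To make the piano-roll representation and the idea of a transposition test concrete, the following is a minimal sketch and not the authors' code. The note format (MIDI pitch, start step, duration in steps), the pitch range 21 to 108, and the function names `to_piano_roll` and `transpose` are all assumptions introduced for illustration.

```python
# Illustrative sketch only. The note format and the pitch range are assumptions,
# not details taken from the paper.
LOW, HIGH = 21, 108  # assumed 88-key piano range, as MIDI note numbers


def to_piano_roll(notes, n_steps):
    """Return a binary matrix where roll[t][p] == 1 if pitch LOW + p sounds at step t."""
    n_pitches = HIGH - LOW + 1
    roll = [[0] * n_pitches for _ in range(n_steps)]
    for pitch, start, duration in notes:
        if not LOW <= pitch <= HIGH:
            continue  # ignore notes outside the assumed range
        for t in range(start, min(start + duration, n_steps)):
            roll[t][pitch - LOW] = 1
    return roll


def transpose(roll, semitones):
    """Shift every time step along the pitch axis; notes pushed out of range are dropped."""
    n_pitches = len(roll[0])
    shifted = [[0] * n_pitches for _ in roll]
    for t, row in enumerate(roll):
        for p, on in enumerate(row):
            q = p + semitones
            if on and 0 <= q < n_pitches:
                shifted[t][q] = 1
    return shifted


# A C major arpeggio (C4, E4, G4), each note lasting two time steps.
notes = [(60, 0, 2), (64, 2, 2), (67, 4, 2)]
roll = to_piano_roll(notes, n_steps=6)
up_a_tone = transpose(roll, 2)  # the same figure, two semitones higher
```

In a transposition test of this kind, one would feed both `roll` and `up_a_tone` to the trained model and compare its responses. A model that relies on absolute pitch would be expected to treat the two inputs differently even though they are musically the same figure.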
Abstract
We investigate how machine learning models acquire the ability to compose music and how musical information is internally represented within such models. We develop a composition algorithm based on a restricted Boltzmann machine (RBM), a simple generative model capable of producing musical pieces of arbitrary length. We convert musical scores into piano-roll image representations and train the RBM in an unsupervised manner. We confirm that the trained RBM can generate new musical pieces; however, by analyzing the model's responses and internal structure, we find that the learned information is not stored in a form directly interpretable by humans. This study contributes to a better understanding of how machine learning models capable of music composition may internally represent musical structure and highlights issues related to the interpretability of generative models in creative tasks.
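As an illustration of what unsupervised RBM training on binary data involves, the sketch below implements a small Bernoulli RBM with the standard energy E(v, h) = -(a·v + b·h + vᵀWh) and a one-step contrastive divergence (CD-1) update. Whether the paper uses CD-1, as well as the layer sizes, learning rate, toy data, and all class and method names here, are assumptions and not details taken from the source.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class RBM:
    """Minimal Bernoulli RBM (hypothetical helper class, for illustration only)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.W = [[self.rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.a = [0.0] * n_visible  # visible biases
        self.b = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.b[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b))]

    def visible_probs(self, h):
        return [sigmoid(self.a[i] + sum(self.W[i][j] * h[j] for j in range(len(h))))
                for i in range(len(self.a))]

    def sample(self, probs):
        return [1 if self.rng.random() < p else 0 for p in probs]

    def energy(self, v, h):
        bias_term = (sum(ai * vi for ai, vi in zip(self.a, v))
                     + sum(bj * hj for bj, hj in zip(self.b, h)))
        pair_term = sum(v[i] * self.W[i][j] * h[j]
                        for i in range(len(v)) for j in range(len(h)))
        return -(bias_term + pair_term)

    def cd1_update(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data vector.
        ph0 = self.hidden_probs(v0)
        h0 = self.sample(ph0)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self.sample(self.visible_probs(h0))
        ph1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(ph0)):
                self.W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
            self.a[i] += lr * (v0[i] - v1[i])
        for j in range(len(ph0)):
            self.b[j] += lr * (ph0[j] - ph1[j])


# Toy stand-in for flattened piano-roll frames (assumed data, not from the paper).
data = [[1, 0, 1, 0], [0, 1, 0, 1]]
rbm = RBM(n_visible=4, n_hidden=3)
for _ in range(50):
    for v in data:
        rbm.cd1_update(v)

# Reconstruction probabilities for the first training vector.
reconstruction = rbm.visible_probs(rbm.hidden_probs(data[0]))
```

In a setting like the one described, each visible vector would be a flattened window of the piano-roll, and new material would be obtained by sampling visible units from the trained model. The reconstruction and `energy` values give simple ways to probe how the model responds to a given input.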
