Controlling Surprisal in Music Generation via Information Content Curve Matching

Mathias Rose Bjare, Stefan Lattner, Gerhard Widmer

TL;DR

This paper tackles the problem of controlling musical surprisal in generative systems by introducing Instantaneous Information Content ($\text{IIC}$), a time-local proxy for listener surprisal that can be computed at any point in a piece. It defines an $\text{IIC}$-based target curve $\text{IC}^*(t)$, an IC deviation objective, and an IC-conditioned sampling procedure that uses beam search to generate sequences whose $\text{IIC}$ closely matches the target curve, even when events occur at irregular time intervals. The authors demonstrate that $\text{IIC}$ correlates with harmonic tension and rhythmic density, and they validate its perceptual relevance with a human study showing that listeners can identify target $\text{IIC}$ curves in generated music. The approach combines a Transformer-based symbolic music model with a time-localization framework that uses a Hann window to map token-level surprisal onto a continuous timeline, enabling controllable generation and, in future work, potential personalization.

Abstract

In recent years, the quality of, and public interest in, music generation systems have grown, encouraging research into various ways to control these systems. We propose a novel method for controlling surprisal in music generation using sequence models. To achieve this goal, we define a metric called Instantaneous Information Content (IIC). The IIC serves as a proxy function for the perceived musical surprisal (as estimated from a probabilistic model) and can be calculated at any point within a music piece. This enables the comparison of surprisal across different musical content even if the musical events occur at irregular time intervals. We use beam search to generate musical material whose IIC curve closely approximates a given target IIC. We experimentally show that the IIC correlates with harmonic and rhythmic complexity and note density. The correlation decreases with the length of the musical context used for estimating the IIC. Finally, we conduct a qualitative user study to test whether human listeners can identify the IIC curves that have been used as targets when generating the respective musical material. We provide code for creating IIC interpolations and IIC visualizations at https://github.com/muthissar/iic.
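
To make the idea of a time-local surprisal curve concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each event's information content is $-\log_2 p$ under the sequence model, that events are localized at their onset times, and that the IIC at time $t$ is a Hann-weighted sum of the information content of nearby events. The function names, the symmetric window, the default `width`, and the mean-absolute-deviation score are all illustrative assumptions; the paper's exact definitions may differ.

```python
import numpy as np


def token_ic(probs):
    """Information content (surprisal) in bits: -log2 of each event's model probability."""
    return -np.log2(np.asarray(probs, dtype=float))


def hann_weight(dt, width):
    """Hann window of total width `width`, centered at dt = 0 and zero outside it.

    Assumption: a symmetric window; the paper's weight function may be shaped differently.
    """
    dt = np.asarray(dt, dtype=float)
    w = 0.5 * (1.0 + np.cos(2.0 * np.pi * dt / width))
    return np.where(np.abs(dt) <= width / 2.0, w, 0.0)


def iic(t, onsets, ics, width=2.0):
    """Hypothetical IIC: Hann-weighted sum of event ICs around each query time in `t`."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    onsets = np.asarray(onsets, dtype=float)
    # weights[i, j] = weight of event j when evaluating the curve at time t[i]
    weights = hann_weight(t[:, None] - onsets[None, :], width)
    return weights @ np.asarray(ics, dtype=float)


def ic_deviation(curve, target):
    """Mean absolute deviation between a generated IIC curve and a target curve (assumed score)."""
    return float(np.mean(np.abs(np.asarray(curve, dtype=float) - np.asarray(target, dtype=float))))


# Three events with irregular onsets (in seconds) and their model probabilities.
onsets = [0.0, 0.5, 2.0]
ics = token_ic([0.5, 0.25, 0.125])        # 1, 2 and 3 bits
curve = iic([0.0, 2.0], onsets, ics)      # evaluate the curve at two time points
score = ic_deviation(curve, [2.0, 1.0])   # compare against a made-up target curve
```

Because the curve is evaluated at arbitrary times rather than per token, two candidate continuations with different rhythms can be scored against the same target, which is what a beam search over candidates would need.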

Paper Structure

This paper contains 19 sections, 7 equations, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: The temporal localization function $f$ and the weight function $\lambda$ involved in computing the $\text{IIC}$ of $x_1, x_2, \dots$, a sequence of three notes, at time $t_1$.
  • Figure 2: Example page of the user study with a generated musical section and five target curves to choose from.
  • Figure 3: The confusion matrix for users identifying the $\text{IC}^{*}$ curves used to generate the music examples.
  • Figure 4: Correlation between IIC and tonal tension $\mathit{tt}$, note density $d$, and inter-onset interval (IOI) histogram entropy $\mathit{he}$.