Evaluating Co-Creativity using Total Information Flow

Vignesh Gokul, Chris Francis, Shlomo Dubnov

TL;DR

This work presents a quantitative score for co-creativity in music by measuring the total information flow between two musical signals, defined as $\mathrm{Flow} = T_{X \to Y} + T_{Y \to X}$ with $T_{X \to Y} = I(Y; \bar{X} \mid \bar{Y})$ and equivalent entropy-based expressions. It estimates the necessary entropies using a pre-trained Multitrack Music Transformer (MTMT) trained on the Symbolic Orchestral Database, enabling scalable evaluation on long sequences via a six-field MIDI representation and a sliding-window workflow. Empirical results on MuseScore-derived Score data and the URMP dataset show that higher Flow aligns with human judgments of quality and interaction, while analyses address positional bias and self-enhancement bias inherent in transformer-based estimators. The approach provides objective insights into co-creative music and suggests optimization directions and broad applicability to other domains involving interactive, multi-agent sequences.
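As a rough illustration of the flow definition above, the conditional mutual information can be expanded with the standard identity $I(Y; \bar{X} \mid \bar{Y}) = H(Y \mid \bar{Y}) - H(Y \mid \bar{X}, \bar{Y})$. The sketch below is not the paper's estimator: it swaps the pre-trained MTMT for a simple count-based (plug-in) entropy estimate on short discrete sequences, and it assumes the past $\bar{X}$, $\bar{Y}$ is just the previous symbol. The function names `conditional_entropy`, `transfer_entropy`, and `total_flow` are hypothetical.

```python
from collections import Counter
from math import log2


def conditional_entropy(pairs):
    """Plug-in estimate of H(A | B) in bits from a list of (a, b) samples."""
    joint = Counter(pairs)                  # counts of (a, b)
    cond = Counter(b for _, b in pairs)     # counts of b alone
    n = len(pairs)
    return -sum(c / n * log2(c / cond[b]) for (_, b), c in joint.items())


def transfer_entropy(x, y):
    """T_{X->Y} = H(Y_t | Y_{t-1}) - H(Y_t | X_{t-1}, Y_{t-1}).

    Assumption: the past is truncated to a single previous symbol.
    """
    steps = range(1, len(y))
    h_past_y = conditional_entropy([(y[t], y[t - 1]) for t in steps])
    h_past_xy = conditional_entropy([(y[t], (x[t - 1], y[t - 1])) for t in steps])
    return h_past_y - h_past_xy


def total_flow(x, y):
    """Flow = T_{X->Y} + T_{Y->X}."""
    return transfer_entropy(x, y) + transfer_entropy(y, x)


# Toy voices: y repeats x with a one-step delay, so x's past predicts y.
x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0]
y = [0] + x[:-1]
print(transfer_entropy(x, y), transfer_entropy(y, x), total_flow(x, y))
```

In this toy case the delayed copy makes $T_{X \to Y}$ large, because knowing the previous symbol of `x` removes all uncertainty about the current symbol of `y`. A count-based estimate like this only works for tiny alphabets and short histories, which is one reason a learned sequence model is attractive as the entropy estimator for real multitrack music.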

Abstract

Co-creativity in music refers to two or more musicians or musical agents interacting with one another by composing or improvising music. However, this is a very subjective process and each musician has their own preference as to which improvisation is better for some context. In this paper, we aim to create a measure based on total information flow to quantitatively evaluate the co-creativity process in music. In other words, our measure is an indication of how "good" a creative musical process is. Our main hypothesis is that a good musical creation would maximize information flow between the participants captured by music voices recorded in separate tracks. We propose a method to compute the information flow using pre-trained generative models as entropy estimators. We demonstrate how our method matches with human perception using a qualitative study.

Paper Structure

This paper contains 21 sections, 13 equations, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: Excerpt from the electronic piece "Poème électronique" by Edgard Varèse
  • Figure 2: For a given input musical signal from a human musician, what represents the best corresponding improvisation? The response must be creative and semantically make sense.
  • Figure 3: (a,b) Comparison of information flow with respect to duration and pitch in the music for the Score dataset. (c,d) Comparison of information flow with respect to duration and pitch in the music for the URMP dataset. For both datasets, and for both pitch and duration, the information flow is higher in the positive samples than in the negative samples.
  • Figure 4: We compare the information flow (for the pitch variable) with the scores obtained from the qualitative study.
  • Figure 5: Self-Enhancement Bias: (a) Comparison of information flow with respect to duration on MTMT vs AMT generated sequences, (b) Comparison of information flow with respect to pitch. We observe that the self-enhancement bias observed in popular LLMs is also observed when using MTMT as the entropy estimator.