Trade-offs between structural richness and communication efficiency in music network representations
Lluc Bono Rosselló, Robert Jankowski, Hugues Bersini, Marián Boguñá, M. Ángeles Serrano
TL;DR
This work investigates how feature selection in music sequence representations alters the resulting network topology and the efficiency of communicating musical structure under perceptual constraints. The study constructs eight encodings, from single-feature to multi-feature representations, and evaluates their topology, entropy rate, and perceptual divergence against a cognitively informed model. It reveals a trade-off: compressed encodings communicate efficiently, with high entropy rates and low modeled perceptual error, while richer encodings preserve detailed musical structure but incur higher modeled perceptual error. A key finding is that uncertainty concentrates in dynamically central nodes, and that longer pieces strengthen the alignment between visitation and inference, suggesting that certain structural arrangements support efficient learning and prediction. The results provide a principled framework for relating representational choice to cognitive costs in sequential processing, and the same analysis could be applied to other sequential domains.
Abstract
Music is a structured and perceptually rich sequence of sounds in time with well-defined symbolic features, whose perception is shaped by the interplay of expectation and uncertainty. Network science offers a powerful framework for studying its structural organization and communication efficiency. However, it remains unclear how feature selection affects the properties of the reconstructed networks and their perceptual alignment. Here, we systematically compare eight encodings of musical sequences, ranging from single-feature descriptions to richer multi-feature combinations. We show that representational choices fundamentally shape network topology, the distribution of uncertainty, and the estimated communication efficiency under perceptual constraints. Single-feature representations compress sequences into dense transition structures that support efficient communication, yielding high entropy rates with low modeled perceptual error, but they discard structural richness. By contrast, multi-feature representations preserve descriptive detail and structural specificity: they expand the state space and produce sharper transition profiles and lower entropy rates, at the cost of higher modeled perceptual error. Across representations, we find that uncertainty increasingly concentrates in nodes with higher diffusion-based centrality while their perceptual error remains low, revealing an interplay between predictable structure and localized surprise. Together, these results show that feature choice directly shapes music network representations, exposing a trade-off between descriptive richness and communication efficiency and suggesting structural conditions that may support efficient learning and prediction.
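To make the link between encoding choice, state space, and entropy rate concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a first-order transition network estimated from a single sequence, and it weights each state by its empirical visit frequency as a stand-in for the stationary distribution of the walk. The toy melody, the two encodings (pitch only versus pitch and duration), and the function names `transition_probs` and `entropy_rate` are all hypothetical illustrations.

```python
from collections import Counter, defaultdict
from math import log2


def transition_probs(seq):
    """Estimate first-order transition probabilities from consecutive pairs."""
    counts = defaultdict(Counter)
    for src, dst in zip(seq, seq[1:]):
        counts[src][dst] += 1
    return {
        src: {dst: c / sum(nxt.values()) for dst, c in nxt.items()}
        for src, nxt in counts.items()
    }


def entropy_rate(seq):
    """Entropy rate in bits per step: sum_i pi_i * H(P_i).

    Assumption: pi_i is approximated by how often state i occurs as the
    source of a transition, not by solving for the stationary distribution.
    """
    probs = transition_probs(seq)
    sources = Counter(seq[:-1])
    n_transitions = len(seq) - 1
    rate = 0.0
    for src, row in probs.items():
        row_entropy = -sum(p * log2(p) for p in row.values())
        rate += (sources[src] / n_transitions) * row_entropy
    return rate


# Hypothetical toy melody as (pitch, duration) pairs.
notes = [("C", 1), ("D", 1), ("C", 2), ("E", 1)] * 2 + [("C", 1)]

pitch_only = [pitch for pitch, _ in notes]  # single-feature encoding
pitch_duration = list(notes)                # multi-feature encoding

for name, seq in [("pitch", pitch_only), ("pitch+duration", pitch_duration)]:
    print(name, "states:", len(set(seq)), "entropy rate:", entropy_rate(seq))
```

On this toy melody the pitch-only encoding has 3 states and an entropy rate of 0.5 bits per step, while the joint encoding has 4 states and an entropy rate of 0 bits per step, because duration disambiguates what follows each C. This mirrors the qualitative pattern described above (a larger state space with sharper transitions and a lower entropy rate); it says nothing about perceptual error, which requires the cognitively informed model that this section does not specify.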
