PianoBART: Symbolic Piano Music Generation and Understanding with Large-Scale Pre-Training

Xiao Liang, Zijian Zhao, Weichao Zeng, Yutong He, Fupeng He, Yiyi Wang, Chengying Gao

TL;DR

PianoBART addresses the challenge of jointly learning symbolic piano music generation and understanding in the absence of abundant labeled data. It introduces a BART-based, encoder-decoder framework that encodes symbolic music as octuple tokens and trains with a multi-level object selection strategy to prevent information leakage and capture long-range musical structure. The key contributions include the octuple representation, six pre-training object-selection methods across token/element and time-span levels, and strong empirical results showing coherent long-form generation and robust music understanding across multiple datasets and tasks. This approach enables scalable, unified modeling of symbolic music with potential impact on automated composition, music analysis, and downstream MIR tasks.

Abstract

Learning musical structures and composition patterns is necessary for both music generation and understanding, but current methods do not make uniform use of learned features to generate and comprehend music simultaneously. In this paper, we propose PianoBART, a pre-trained model that uses BART for both symbolic piano music generation and understanding. We devise a multi-level object selection strategy for different pre-training tasks of PianoBART, which can prevent information leakage or loss and enhance learning ability. The musical semantics captured in pre-training are fine-tuned for music generation and understanding tasks. Experiments demonstrate that PianoBART efficiently learns musical patterns and achieves outstanding performance in generating high-quality coherent pieces and comprehending music. Our code and supplementary material are available at https://github.com/RS2002/PianoBart.
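
As a rough illustration of what "octuple tokens" and "multi-level" selection could look like in code, the sketch below represents each note as an eight-field tuple and masks it at three granularities: single fields (element level), whole notes (token level), and whole bars (time-span level). The field names are assumed from the OctupleMIDI convention and are not stated in this summary; `MASK` and the three functions are hypothetical names, and this is not a reproduction of the paper's six selection methods.

```python
import random
from typing import List, NamedTuple, Set


class Octuple(NamedTuple):
    # Field names are an assumption borrowed from the OctupleMIDI convention.
    bar: int
    position: int
    instrument: int
    pitch: int
    duration: int
    velocity: int
    time_signature: int
    tempo: int


MASK = -1  # hypothetical sentinel standing in for a [MASK] id
FULL_MASK = Octuple(*([MASK] * 8))


def mask_elements(tokens: List[Octuple], prob: float, rng: random.Random) -> List[Octuple]:
    """Element level: each field of each note is masked independently."""
    return [
        Octuple(*[MASK if rng.random() < prob else value for value in tok])
        for tok in tokens
    ]


def mask_tokens(tokens: List[Octuple], prob: float, rng: random.Random) -> List[Octuple]:
    """Token level: a selected note has all eight fields masked together."""
    return [FULL_MASK if rng.random() < prob else tok for tok in tokens]


def mask_bars(tokens: List[Octuple], bars: Set[int]) -> List[Octuple]:
    """Time-span level: every note inside the selected bars is masked."""
    return [FULL_MASK if tok.bar in bars else tok for tok in tokens]


# Three illustrative notes: two in bar 0, one in bar 1.
notes = [
    Octuple(0, 0, 0, 60, 4, 80, 0, 120),
    Octuple(0, 8, 0, 64, 4, 80, 0, 120),
    Octuple(1, 0, 0, 67, 8, 90, 0, 120),
]
rng = random.Random(0)
element_masked = mask_elements(notes, 0.15, rng)
token_masked = mask_tokens(notes, 0.15, rng)
bar_masked = mask_bars(notes, {1})
```

Masking a whole note or a whole bar hides fields that neighbouring tokens would otherwise reveal (for example, a shared bar index), which is one plausible reading of the "information leakage" the abstract refers to.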

Paper Structure

This paper contains 14 sections, 1 equation, 2 figures, and 5 tables.

Figures (2)

  • Figure 1: Overview of the proposed PianoBART framework (right) and the designed multi-level object selection strategy (left).
  • Figure 2: Visualization results of generated examples on ablation variants. PianoBART (w/o pretraining), PianoBART-simple, and PianoBART are all continued from Prompt.