MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection

Tongyu Lu, Charlotta-Marlena Geist, Jan Melechovsky, Abhinaba Roy, Dorien Herremans

TL;DR

MelodySim tackles the challenge of melody-aware plagiarism detection in the era of generative music by creating an open, melody-preserving audio dataset and a triplet-based embedding model. The dataset augments Slakh2100 MIDI originals through MIDI- and audio-level transformations that preserve melody while altering texture, instrument, and tempo, enabling robust training of melody-sensitive embeddings via a MERT-backed Triplet Neural Network. The model produces segment-level and piece-level similarity measures, outperforming DTW baselines on MelodySim and showing moderate generalization to real-world MCIC cases, with subjective listening studies validating the melodic preservation in augmented variations. This work provides practical tools for attribution and copyright considerations in generative-music workflows and lays groundwork for broader melody-centric content protection in multimedia AI.

Abstract

We propose MelodySim, a melody-aware music similarity model and dataset for plagiarism detection. First, we introduce a novel method to construct a dataset focused on melodic similarity. By augmenting Slakh2100, an existing MIDI dataset, we generate variations of each piece while preserving the melody through modifications such as note splitting, arpeggiation, minor track dropout, and re-instrumentation. A user study confirms that positive pairs indeed contain similar melodies, while other musical tracks are significantly changed. Second, we develop a segment-wise melodic-similarity detection model that uses a MERT encoder and applies a triplet neural network to capture melodic similarity. The resulting decision matrix highlights where plagiarism might occur. The experiments show that our model is able to outperform baseline models in detecting similar melodic fragments on the MelodySim test set.
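To make the triplet objective concrete, the following is a minimal sketch of a standard triplet margin loss, not the authors' implementation. It assumes Euclidean distance and a margin of 1.0; the abstract specifies neither, and the function names and toy vectors are hypothetical. The anchor and positive stand in for embeddings of a segment and its melody-preserving variation, and the negative for a segment from a different piece.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (margin value is an assumption).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin`; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (hypothetical values, for illustration only).
anchor = [0.0, 1.0]
positive = [0.0, 1.0]   # same melody, different arrangement
negative = [3.0, 5.0]   # different piece

loss_satisfied = triplet_loss(anchor, positive, negative)
loss_violated = triplet_loss(anchor, negative, positive)  # roles swapped
```

With these toy values the first triplet already satisfies the margin and contributes no loss, while the swapped triplet is penalized, which is the behavior that pushes melodically similar segments together in embedding space.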

Paper Structure

This paper contains 27 sections, 6 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: The proposed melody-aware augmentation pipeline used for constructing the MelodySim dataset by augmenting Slakh MIDI.
  • Figure 2: The proposed architecture for training and inference. $\mathrm{sg}[\cdot]$ denotes "stop gradient" and $\mathrm{abs}(\cdot)$ denotes the element-wise absolute value.
  • Figure 3: Similarity matrices for positive (left) and negative (right) pairs from the test set.
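A segment-wise similarity matrix of the kind shown in Figure 3 can be illustrated with a short sketch. This is a simplified stand-in, not the paper's model: it assumes cosine similarity between precomputed segment embeddings and a fixed threshold of 0.9 to obtain a binary decision matrix, whereas the actual architecture in Figure 2 may score pairs differently. All function names, the threshold, and the toy embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(segments_a, segments_b):
    """Entry [i][j] compares segment i of piece A with segment j of piece B."""
    return [[cosine(a, b) for b in segments_b] for a in segments_a]


def decision_matrix(sim, threshold=0.9):
    """Binary matrix marking segment pairs at or above the (assumed) threshold."""
    return [[score >= threshold for score in row] for row in sim]


def matched_fraction(decisions):
    """Fraction of piece-A segments with at least one match in piece B."""
    if not decisions:
        return 0.0
    return sum(1 for row in decisions if any(row)) / len(decisions)


# Toy segment embeddings (hypothetical values, for illustration only).
piece_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
piece_b = [[2.0, 0.0], [0.0, -1.0]]

sim = similarity_matrix(piece_a, piece_b)
decisions = decision_matrix(sim)
fraction = matched_fraction(decisions)
```

Here only the first segment of piece A matches a segment of piece B. Aggregating the decision matrix into a single fraction is one simple way to move from segment-level to piece-level similarity; the aggregation rule used here is an assumption.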