WeMusic-Agent: Efficient Conversational Music Recommendation via Knowledge Internalization and Agentic Boundary Learning

Wendong Bi, Yirong Mao, Xianglong Liu, Kai Tian, Jian Zhang, Hanjie Wang, Wenhui Que

TL;DR

WeMusic-Agent addresses conversational music recommendation by combining extensive music knowledge internalization with an agentic boundary learning framework that decides when to invoke external tools. The authors introduce MusicCPT and WeMusic-Base to internalize music knowledge via large-scale continual pretraining and multi-turn SFT, augmented by a multi-objective reinforcement learning regime. They then extend this to WeMusic-Agent-M1 through curriculum learning and controllable RL to balance internal knowledge against tool use, and report improved personalization, relevance, and efficiency on the new WeMusic-Bench benchmark derived from WeChat Listen data. The results indicate that agentic boundary learning extends the model's capabilities beyond those of purely internalized models, enabling robust playlist-level recommendations and efficient tool usage. The work provides a practical framework and dataset for evaluating conversational recommender systems (CRS) in real-world, language-specific settings and highlights the value of combining knowledge internalization with targeted tool use in music recommendation.

Abstract

Personalized music recommendation in conversational scenarios usually requires a deep understanding of user preferences and nuanced musical context, yet existing methods often struggle to balance specialized domain knowledge with flexible tool integration. This paper proposes WeMusic-Agent, a training framework for efficient LLM-based conversational music recommendation. By integrating knowledge internalization and agentic boundary learning, the framework aims to teach the model to decide when to rely on internalized knowledge and when to call specialized tools (e.g., music retrieval APIs, music recommendation systems). Under this framework, we present WeMusic-Agent-M1, an agentic model that internalizes extensive musical knowledge via continued pretraining on a 50B music-related corpus while acquiring the ability to invoke external tools when necessary. Additionally, given the lack of open-source benchmarks for conversational music recommendation, we construct a benchmark for personalized music recommendation derived from real-world data in WeChat Listen. This benchmark enables comprehensive evaluation across multiple dimensions, including the relevance, personalization, and diversity of the recommendations. Experiments on real-world data demonstrate that WeMusic-Agent achieves significant improvements over existing models.
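To make the "internal knowledge versus tool call" decision concrete, here is a minimal Python sketch. It is not the authors' method: the paper learns this boundary through training, whereas the sketch substitutes a fixed confidence threshold purely for illustration. The names `recommend`, `Decision`, `search_tool`, `confidence`, and `threshold` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Decision:
    source: str        # "internal" or "tool"
    songs: List[str]   # recommended song titles


def recommend(
    query: str,
    internal_candidates: List[str],
    confidence: float,
    search_tool: Callable[[str], List[str]],
    threshold: float = 0.7,  # hypothetical stand-in for a learned boundary
) -> Decision:
    """Use internal candidates when confident enough, otherwise call the tool."""
    if internal_candidates and confidence >= threshold:
        # No tool call is made here, which is where the efficiency gain comes from.
        return Decision("internal", internal_candidates)
    # Fall back to an external tool (e.g., a music retrieval API).
    return Decision("tool", search_tool(query))
```

In this toy version the tool is invoked only when the model has no candidates or its confidence falls below the threshold. A trained agent would make the same choice implicitly rather than through an explicit rule.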

Paper Structure

This paper contains 39 sections, 13 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: An overview of the framework of WeMusic-Agent.
  • Figure 2: Song description snippet mining and song-centric bidirectional augmentation pipeline. First, articles and comments about the same song are filtered and summarized into high-quality snippets by DeepSeek-V3. Then song-related information, including these description snippets, singer-related information, tags, etc., is linked into a song-centric graph, from which subgraphs with multiple nodes are sampled to synthesize song-to-description and description-to-song training data.
  • Figure 3: Overview of the Music domain Continual Pre-Training (MuCPT) method.
  • Figure 5: The Agent Boundary Learning framework of WeMusic-Agent.
  • Figure 6: Comparison of WeMusic and other state-of-the-art LLMs on WeMusic-Bench. WeMusic-Base-Dist and WeMusic-Agent are evaluated at the 32B-parameter size.
  • ...and 4 more figures