Interactive AI NPCs Powered by LLMs: Technical Report for the CPDC Challenge 2025

Yitian Huang, Yuxuan Lei, Jianxun Lian, Hao Liao

TL;DR

The paper tackles NPC dialogue grounded in persona and world knowledge for CPDC 2025, covering Task 1 (tool invocation), Task 2 (context-aware dialogue), and Task 3 (integration). It introduces Context Engineering, which combines adaptive tool pruning, persona distillation, post-processing, and prompt optimization, and applies GRPO training in the GPU Track to optimize reward signals directly, mitigating overfitting on small datasets. Empirically, the approach performs strongly in both the API and GPU tracks: Context Engineering yields substantial gains, and GRPO adds further improvements, though with noted reward-hacking risks. The report also highlights limitations of SFT and of LLM-as-a-judge evaluation. Overall, the work demonstrates practical improvements in tool-call stability and in empathetic, persona-consistent dialogue, and it provides a public codebase for reproducing and extending NPC dialogue systems.

Abstract

This report presents the solution and results of our team MSRA_SC in the Commonsense Persona-Grounded Dialogue Challenge (CPDC 2025). We propose a simple yet effective framework that unifies improvements across both the GPU Track and the API Track. Our method centers on two key components. First, Context Engineering applies dynamic tool pruning and persona clipping for input compression, combined with post-processing techniques such as parameter normalization and function merging. Together with manually refined prompts, this design improves tool-call stability, execution reliability, and role-playing guidance. Second, in the GPU Track, we further adopt GRPO training, replacing supervised fine-tuning with reinforcement learning directly optimized by reward signals. This mitigates small-sample overfitting and significantly enhances task-oriented dialogue performance. In the final evaluation, our team ranks 1st in Task 2 API, 2nd in Task 1 API, and 3rd in both Task 3 API and GPU Track, demonstrating the effectiveness of our approach. Our code is publicly available at https://gitlab.aicrowd.com/nikoo_yu/cpdc-2025-winning-solution
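
To make "reinforcement learning directly optimized by reward signals" concrete, the following is a minimal sketch of the group-relative advantage at the core of GRPO: several responses are sampled for the same prompt, and each response's reward is normalized against the mean and standard deviation of its own group. It is an illustration, not the team's training code. The function name `group_relative_advantages`, the `eps` constant, the use of the population standard deviation, and the example reward values are all assumptions made here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each sampled response's reward against its own group.

    rewards: scalar rewards for responses sampled from the same prompt.
    eps: small constant (assumed) that avoids division by zero when all
         rewards in the group are identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses sampled for one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])

# A group with identical rewards carries no learning signal.
flat = group_relative_advantages([0.5, 0.5, 0.5])
```

Responses that score above their group's mean receive a positive advantage and are reinforced, while those below the mean are discouraged. Because the baseline comes from the group itself, no separate value model is needed.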

Paper Structure

This paper contains 20 sections, 6 equations, 2 figures, 4 tables, 2 algorithms.

Figures (2)

  • Figure 1: Tool-call training and evaluation curves (Qwen3-8B/14B, GRPO).
  • Figure 2: A case study for GRPO training with LLM-as-a-judge reward signals.