Reasoning Does Not Necessarily Improve Role-Playing Ability

Xiachong Feng; Longxu Dou; Lingpeng Kong

TL;DR

The paper examines whether reasoning techniques improve the role-playing abilities of large language models through a large-scale, standardized evaluation across six benchmarks and 24 LLMs with three role-playing strategies. It finds that Chain-of-Thought (CoT) can reduce performance, that reasoning-optimized models are ill-suited for role-playing, and that reasoning disrupts the role-playing scaling law. It also finds that the Qwen series excels and that Chinese role-playing often surpasses English. The authors propose two future directions, role-aware CoT and reinforcement learning for role-playing, to improve persona consistency and adaptive behavior, and they deliver a standardized OpenCompass framework to enable reproducible research. The findings offer practical guidance for deploying role-playing LLMs and inform how reasoning might be integrated with character-driven AI systems.

Abstract

The application of role-playing large language models (LLMs) is rapidly expanding in both academic and commercial domains, driving an increasing demand for high-precision role-playing models. Simultaneously, the rapid advancement of reasoning techniques has continuously pushed the performance boundaries of LLMs. This intersection of practical role-playing demands and evolving reasoning capabilities raises an important research question: "Can reasoning techniques enhance the role-playing capabilities of LLMs?" To address this, we conduct a comprehensive study using 6 role-playing benchmarks, 24 LLMs, and 3 distinct role-playing strategies, comparing the effectiveness of direct zero-shot role-playing, role-playing with Chain-of-Thought (CoT), and role-playing using reasoning-optimized LLMs. Our findings reveal that CoT may reduce role-playing performance, reasoning-optimized LLMs are unsuitable for role-playing, reasoning ability disrupts the role-playing scaling law, large models still lack proficiency in advanced role-playing, and Chinese role-playing performance surpasses English role-playing performance. Furthermore, based on extensive experimental results, we propose two promising future research directions: Role-aware CoT for improving role-playing LLMs and Reinforcement Learning for role-playing LLMs, aiming to enhance the adaptability, consistency, and effectiveness of role-playing LLMs for both research and real-world applications.
