Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment

Keming Lu; Bowen Yu; Chang Zhou; Jingren Zhou

TL;DR

Ditto is a self-alignment approach that unlocks role-play in LLMs by leveraging their inherent character knowledge through a self-generated WikiRole dataset (~4k roles). By reframing dialogue as a reading-comprehension task, Ditto simulates dialogues in two steps and fine-tunes on the result, achieving strong open-source role-play performance: it rivals proprietary systems on identity metrics and is competitive on knowledge metrics. The study also analyzes cross-supervision, showing that identity is robust to weaker supervision while knowledge is capped by the capabilities of the seed model, which offers guidance for scalable role-play alignment. Overall, the work demonstrates that rich role-play behavior can be induced without distilling from closed models, and it contributes practical, reproducible evaluation and data-generation strategies for this domain.

Abstract

Considerable effort has been invested in improving the role-playing proficiency of open-source large language models (LLMs) by emulating proprietary counterparts. Nevertheless, we posit that LLMs inherently harbor role-play capabilities, owing to the extensive knowledge of characters and potential dialogues ingrained in their vast training corpora. Thus, in this study, we introduce Ditto, a self-alignment method for role-play. Ditto capitalizes on character knowledge, encouraging an instruction-following LLM to simulate role-play dialogues as a variant of reading comprehension. This method creates a role-play training set comprising 4,000 characters, ten times as many roles as currently available datasets. Subsequently, we fine-tune the LLM on this self-generated dataset to augment its role-playing capabilities. Evaluated on our carefully constructed, reproducible role-play benchmark and the role-play subset of MT-Bench, Ditto, at various parameter scales, consistently maintains role identity and provides accurate role-specific knowledge in multi-turn role-play conversations. Notably, it outperforms all open-source role-play baselines and reaches performance comparable to advanced proprietary chatbots. Furthermore, we present the first comprehensive cross-supervision alignment experiment in the role-play domain, revealing that the knowledge exhibited in role-play is bounded by the intrinsic capabilities of the LLM, whereas role-play styles can be easily acquired under the guidance of smaller models. We open-source related resources at https://github.com/OFA-Sys/Ditto.
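
The idea of simulating role-play as reading comprehension can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt wording, the `CharacterProfile` type, the function names, and the `chat` callable (any function mapping a prompt string to a model reply) are all assumptions made for illustration. The sketch places the character profile in the prompt as a passage to read, asks the model to answer in character, and stores the resulting query-response pair as a training sample.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CharacterProfile:
    """Hypothetical container for one character's knowledge."""
    name: str
    description: str  # profile text, e.g. a knowledge-base summary


def build_response_prompt(profile: CharacterProfile, query: str) -> str:
    # Assumed prompt wording: the profile is supplied as a passage to read,
    # so answering in character becomes a reading-comprehension task.
    return (
        f"Read the following profile of {profile.name}.\n\n"
        f"{profile.description}\n\n"
        f"Answer the user's message as {profile.name}, staying consistent "
        f"with the profile.\n\n"
        f"User: {query}\n{profile.name}:"
    )


def simulate_dialogue(
    profile: CharacterProfile,
    queries: List[str],
    chat: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Collect one (query, response) training sample per query.

    `chat` stands in for an instruction-following LLM; here it is any
    callable that takes a prompt and returns a reply.
    """
    samples = []
    for query in queries:
        response = chat(build_response_prompt(profile, query))
        samples.append(
            {"role": profile.name, "query": query, "response": response}
        )
    return samples


if __name__ == "__main__":
    # Stub model so the sketch runs without any LLM dependency.
    holmes = CharacterProfile(
        name="Sherlock Holmes",
        description="A consulting detective living at 221B Baker Street.",
    )
    for sample in simulate_dialogue(
        holmes, ["Where do you live?"], chat=lambda prompt: "(model reply)"
    ):
        print(sample)
```

In a real pipeline the queries would themselves be generated by the model, and the collected samples would feed the fine-tuning step; both are omitted here to keep the sketch small.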

Paper Structure (23 sections, 5 figures, 3 tables, 1 algorithm)

Introduction
Related Works
Methods
Problem Definition
Character Knowledge Collection
Dialogue Simulation
Supervised Finetuning
Evaluation
Metric Design
Experiments
Experimental Setup
Main Results
Analysis
Dissecting Role-play by Cross Supervision
Cross-supervision Setting
...and 8 more sections

Figures (5)

Figure 1: Ditto unlocks LLMs' role-play capabilities through self-alignment, since they have been pre-trained on various character profiles and dialogues.
Figure 2: Illustration of Ditto. Ditto consists of three phases for self-alignment of role-play. First, Ditto collects character profiles from knowledge bases, as shown in the upper part. Then, it applies an off-the-shelf chatbot to generate role-specific and contrastive queries, followed by a knowledge-augmented self-response, to construct role-play supervision datasets (Dialogue Simulation). Finally, Ditto fine-tunes the supervision model on the dataset to strengthen its role-play capabilities.
Figure 3: Objective evaluation of LLM role-play. We present three metrics as described in the Evaluation section.
Figure 4: Human annotation for the quality of query simulation.
Figure 5: Generalization analyses between various supervision and seed LLMs. Supervision performance denotes role-play performance under the Ditto simulation recipe with knowledge augmentation. Imitation performance denotes the performance of seed LLMs fine-tuned on dialogues simulated by a given supervision LLM.
