ERABAL: Enhancing Role-Playing Agents through Boundary-Aware Learning

Yihong Tang, Jiao Ou, Che Liu, Fuzheng Zhang, Di Zhang, Kun Gai

TL;DR

ERABAL is a framework that uses boundary-aware learning to enhance the role-playing capabilities of role-playing agents (RPLAs), achieving notable improvements over generalist baseline models on WikiRoleEval, CharacterEval, and the role-playing subset of MT-Bench.

Abstract

Role-playing is an emerging application in the field of Human-Computer Interaction (HCI), primarily implemented through the alignment training of a large language model (LLM) with assigned characters. Despite significant progress, role-playing agents (RPLAs) still struggle with maintaining role-consistency across conversations, particularly when confronted with boundary queries subtly related to character attributes. In this paper, we present ERABAL, a framework aimed at enhancing RPLAs' role-playing capabilities through boundary-aware learning. ERABAL encompasses a generation pipeline for role-specific dialogues and a concomitant methodology for alignment training. Through comprehensive evaluations, we demonstrate that ERABAL is both efficient and effective. By training with significantly fewer dialogues than those used in leading approaches, ERABAL achieves notable improvements across WikiRoleEval, CharacterEval, and the role-playing subset of MT-Bench compared to the generalist baseline models. Our code and datasets will be made publicly available to support further research.

Paper Structure (46 sections, 5 figures, 27 tables)

This paper contains 46 sections, 5 figures, 27 tables.

Figures (5)

  • Figure 1: Illustration of boundary-aware learning. The circular areas represent the cognitive scope of the characters. High-value queries (marked in blue) are located within and near the boundaries, while ordinary queries (marked in yellow) are closer to the center. Unknown queries (marked in red), which should be dismissed with rejection responses, fall outside the circles. Both positive and negative responses are generated in our work.
  • Figure 2: Overview of the ERABAL architecture. DP, TM, Que Gen, and Res Gen refer to the dialogue planner, the topic manager, the question generator, and the response generator, respectively. Topic-related words are presented in italics, and the counterfactual information in boundary-aware questions is underscored. The factual (positive) and counterfactual (negative) responses are colored light gold and red, respectively.
  • Figure 3: Experimental results of Baichuan2-13B-Chat, Qwen-7B-Chat, Mistral-7B-Instruct-v0.3, and LLaMA2-13B-Chat in terms of role consistency in boundary scenarios. Scores range from 0.0 to 1.0.
  • Figure 4: Impact of data scale: experiments are conducted with varying dataset sizes using LLaMA2-7B on boundary-aware evaluation and WikiRoleEval.
  • Figure 5: Impact of model scale: experiments are conducted using 7B, 13B, and 33B versions of LLaMA2 on boundary-aware evaluation and WikiRoleEval.