Enhancing Personality Recognition in Dialogue by Data Augmentation and Heterogeneous Conversational Graph Networks

Yahui Fu, Haiyue Song, Tianyu Zhao, Tatsuya Kawahara

TL;DR

This work tackles speaker sparsity in dialogue-based personality recognition by proposing interpolation-based data augmentation that creates synthetic dialogues with continuous trait labels. It also introduces a heterogeneous conversational graph neural network (HC-GNN) that models inter-speaker and intra-speaker dependencies independently to better capture how personality is expressed in dialogue. Evaluations on RealPersonaChat show that increased speaker diversity improves performance in both monologue and dialogue settings, with HC-GNN achieving the best results in the dialogue setting and data augmentation contributing notable gains in both. The approach advances robust, speaker-independent personality recognition, with practical implications for adaptive human-robot interaction.

Abstract

Personality recognition is useful for enhancing robots' ability to tailor user-adaptive responses, thus fostering rich human-robot interactions. One challenge in this task is the limited number of speakers in existing dialogue corpora, which hampers the development of robust, speaker-independent personality recognition models. Additionally, accurately modeling both the interdependencies among interlocutors and the intra-dependencies within each speaker in dialogues remains a significant issue. To address the first challenge, we introduce personality trait interpolation for speaker data augmentation. For the second, we propose heterogeneous conversational graph networks to independently capture both contextual influences and inherent personality traits. Evaluations on the RealPersonaChat corpus demonstrate our method's significant improvements over existing baselines.
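
The abstract names personality trait interpolation but does not spell out the procedure. As one way to picture the idea, the minimal sketch below assumes that a synthetic speaker is built by mixing utterances from two real speakers at a ratio `lam`, and that the synthetic trait label is the matching linear interpolation of the two speakers' trait vectors. The function name `interpolate_speakers`, its parameters, and the mixing rule are all hypothetical illustrations, not the authors' implementation.

```python
import random
from typing import List, Sequence, Tuple


def interpolate_speakers(
    utts_a: Sequence[str],
    traits_a: Sequence[float],
    utts_b: Sequence[str],
    traits_b: Sequence[float],
    lam: float,
    n_utts: int,
    seed: int = 0,
) -> Tuple[List[str], List[float]]:
    """Hypothetical sketch: mix two speakers' utterances and trait labels.

    ASSUMPTION: the paper's exact procedure is not given in this section;
    this only illustrates linear interpolation of continuous labels.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if n_utts <= 0:
        raise ValueError("n_utts must be positive")
    if len(traits_a) != len(traits_b):
        raise ValueError("trait vectors must have the same length")

    # Number of utterances drawn from each speaker.
    n_a = round(lam * n_utts)
    n_b = n_utts - n_a
    if n_a > len(utts_a) or n_b > len(utts_b):
        raise ValueError("not enough utterances to sample without replacement")

    rng = random.Random(seed)
    mixed = rng.sample(list(utts_a), n_a) + rng.sample(list(utts_b), n_b)
    rng.shuffle(mixed)

    # Use the effective ratio so the label matches the actual utterance mix.
    w = n_a / n_utts
    label = [w * x + (1.0 - w) * y for x, y in zip(traits_a, traits_b)]
    return mixed, label


if __name__ == "__main__":
    utts, label = interpolate_speakers(
        ["a1", "a2", "a3"], [1.0, 5.0],
        ["b1", "b2", "b3"], [3.0, 1.0],
        lam=0.5, n_utts=4,
    )
    print(utts, label)
```

Varying `lam` between 0 and 1 yields synthetic speakers whose labels lie between those of the two source speakers, which is one way to obtain the continuous trait labels mentioned in the TL;DR.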

Paper Structure

This paper contains 22 sections, 9 equations, 3 figures, 8 tables.

Figures (3)

  • Figure 1: Homogeneous and different heterogeneous models. $u_{a1}, u_{a2}, u_{b1}$ denote alternating utterances of speakers $a$ and $b$. $\sigma(\cdot)$ denotes the activation function.
  • Figure 2: Heterogeneous conversational graph neural network (HC-GNN), which captures the interdependencies among interlocutors (acquired) and the intra-dependencies within speaker $a$ or $b$ (innate).
  • Figure 3: Data distribution of augmented data and original data.