SIG: Speaker Identification in Literature via Prompt-Based Generation

Zhenlin Su; Liyan Xu; Jin Xu; Jiangnan Li; Mingdu Huangfu

TL;DR

This work proposes SIG, a simple and effective generation-based method that verbalizes the task and the quotation input using designed prompt templates. The same design makes it easy to integrate auxiliary tasks that further improve speaker identification performance.

Abstract

Identifying the speakers of quotations in narratives is an important task in literary analysis. Challenging scenarios include out-of-domain inference for unseen speakers, and non-explicit cases where the surrounding context contains no speaker mention. In this work, we propose SIG, a simple and effective generation-based method that verbalizes the task and the quotation input using designed prompt templates; this design also makes it easy to integrate auxiliary tasks that further improve speaker identification performance. The prediction can either come from direct generation by the model, or be determined by selecting the speaker candidate with the highest generation probability. By design, SIG supports out-of-domain evaluation and achieves an open-world classification paradigm that accepts any form of candidate input. We perform both cross-domain and in-domain evaluation on PDNC, the largest dataset for this task. Empirical results suggest that SIG outperforms previous baselines with complicated designs, as well as zero-shot ChatGPT, and excels in particular on the hard non-explicit cases, with up to 17% improvement. Additional experiments on another dataset, WP, further corroborate the efficacy of SIG.
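The second inference mode in the abstract, choosing the candidate with the highest generation probability, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `token_log_prob` is a hypothetical stand-in for a decoder's per-token log-probability, candidates are split on whitespace rather than with a real tokenizer, and scores are plain sums of log-probabilities. Whether SIG applies length normalization is not stated in this text.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical interface: log P(token | prompt, previously generated tokens).
TokenLogProb = Callable[[str, Sequence[str], str], float]


def candidate_log_prob(prompt: str, candidate: str,
                       token_log_prob: TokenLogProb) -> float:
    """Sum the log-probabilities of the candidate's tokens, left to right."""
    tokens = candidate.split()  # assumption: whitespace tokens, not subwords
    total = 0.0
    for i, token in enumerate(tokens):
        total += token_log_prob(prompt, tokens[:i], token)
    return total


def classify_by_generation(prompt: str, candidates: Sequence[str],
                           token_log_prob: TokenLogProb
                           ) -> Tuple[str, Dict[str, float]]:
    """Score every candidate and return the highest-scoring one."""
    scores = {c: candidate_log_prob(prompt, c, token_log_prob)
              for c in candidates}
    best = max(scores, key=lambda c: scores[c])
    return best, scores


def toy_token_log_prob(prompt: str, prefix: Sequence[str], token: str) -> float:
    """Toy stand-in for a decoder: tokens seen in the prompt are likelier."""
    return math.log(0.6) if token in prompt.split() else math.log(0.1)


if __name__ == "__main__":
    prompt = 'Elizabeth smiled . " Indeed , sir . "'
    best, scores = classify_by_generation(
        prompt, ["Elizabeth Bennet", "Mr. Collins"], toy_token_log_prob)
    print(best, scores)
```

Because the candidate list is passed in at inference time, this style of scoring accepts any set of names. Note that an unnormalized sum tends to favor shorter candidates, so a real system might divide by the token count.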

Paper Structure (34 sections, 3 equations, 3 figures, 7 tables)

Introduction
Background and Related Work
Speaker Identification
Previous Approaches
Template-based Approach
Methodology
Task Definition
Approach Introduction
Prompt Template Design
Prompt Template with Auxiliary Task
Inference
Direct Generation
Classification by Generation
Training
Cross-Domain Experiments
...and 19 more sections

Figures (3)

Figure 1: Illustration of our proposed approach SIG. Encoder input and decoder output are formatted according to the designed prompt templates described in Section \ref{ssec:template}. During inference, SIG either generates the speaker directly, or determines the speaker based on the highest generation probability of each candidate.
Figure 2: Classification by generation: each speaker candidate is fed to the decoder, and its generation probability is obtained; the highest option is selected as the final prediction.
Figure 3: t-SNE visualization of the embedding distribution on the test set for PDNC. Output from the same novel is marked by the same color.
