Large Language Models Fall Short: Understanding Complex Relationships in Detective Narratives
Runcong Zhao, Qinglin Zhu, Hainiu Xu, Jiazheng Li, Yuxiang Zhou, Yulan He, Lin Gui
TL;DR
This work introduces Conan, a benchmark and dataset for studying how large language models comprehend complex, multi-perspective character relationships in detective narratives. It defines a task framework with three sub-tasks (Character Extraction, Entity Linking, and Relation Deduction) and provides a hierarchical taxonomy of relation categories (5 top-level, 54 intermediate, 163 detailed) for evaluating nuanced social connections, including public, secret, and inferred relations. Through experiments with GPT-3.5, GPT-4, and Llama2, the authors show that current models struggle with long narratives and conflicting perspectives. They compare three relation-detection strategies (AllTogether, DirRelation, PairRelation) and run ablation studies on character extraction quality. The findings highlight the need for better inferential reasoning and information-management approaches (e.g., retrieval augmentation, chain-of-thought) to advance narrative understanding, with implications for creative writing analysis, interactive agents, and theory-of-mind research.
Abstract
Existing datasets for narrative understanding often fail to represent the complexity and uncertainty of relationships in real-life social scenarios. To address this gap, we introduce a new benchmark, Conan, designed for extracting and analysing intricate character relation graphs from detective narratives. Specifically, we design hierarchical relationship categories and manually extract and annotate role-oriented relationships from the perspectives of various characters, incorporating both public relationships known to most characters and secret ones known to only a few. Our experiments with advanced Large Language Models (LLMs) such as GPT-3.5, GPT-4, and Llama2 reveal their limitations in inferring complex relationships and handling longer narratives. Together, the Conan dataset and our pipeline strategy are intended to assess how well LLMs comprehend nuanced relational dynamics in narrative contexts.
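To make the idea of role-oriented relationships concrete, the following is a minimal Python sketch of a relation graph in which each relation records which characters know about it. It is an illustration under stated assumptions, not the Conan data format: the class names, the `known_to` field, the character names, and the relation labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A directed, labelled relation plus the set of characters aware of it."""
    source: str
    target: str
    label: str
    known_to: frozenset


class RelationGraph:
    """Hypothetical container for perspective-dependent character relations."""

    def __init__(self, characters):
        self.characters = frozenset(characters)
        self.relations = []

    def add(self, source, target, label, known_to=None):
        # Assumption: known_to=None marks a public relation, visible to everyone.
        holders = self.characters if known_to is None else frozenset(known_to)
        self.relations.append(Relation(source, target, label, holders))

    def view(self, observer):
        """Return the (source, target, label) triples the observer is aware of."""
        return {
            (r.source, r.target, r.label)
            for r in self.relations
            if observer in r.known_to
        }


# Invented example: three characters, one public and one secret relation.
graph = RelationGraph(["Alice", "Bob", "Carol"])
graph.add("Alice", "Bob", "colleague")                                   # public
graph.add("Alice", "Carol", "blackmailer", known_to=["Alice", "Carol"])  # secret

bob_view = graph.view("Bob")
carol_view = graph.view("Carol")
```

In this sketch, Bob sees only the public relation while Carol sees both, which mirrors the distinction the abstract draws between relationships known to most characters and those known to only a few.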
