ARN: Analogical Reasoning on Narratives

Zhivar Sourati, Filip Ilievski, Pia Sommerauer, Yifan Jiang

TL;DR

A computational framework that operationalizes dominant theories of analogy, using narrative elements to create surface and system mappings. The interplay between these mappings yields a binary task and benchmark, Analogical Reasoning on Narratives (ARN), covering four categories of far (cross-domain) and near (within-domain) analogies and disanalogies.

Abstract

As a core cognitive skill that enables the transfer of information across domains, analogical reasoning has been extensively studied in both humans and computational models. However, while cognitive theories of analogy often focus on narratives and study the distinction between surface, relational, and system similarities, existing work in natural language processing has focused more narrowly on relational analogies between word pairs. This gap raises a natural question: can state-of-the-art large language models (LLMs) detect system analogies between narratives? To gain insight into this question and extend word-based relational analogies to relational system analogies, we devise a comprehensive computational framework that operationalizes dominant theories of analogy, using narrative elements to create surface and system mappings. Leveraging the interplay between these mappings, we create a binary task and benchmark for Analogical Reasoning on Narratives (ARN), covering four categories of far (cross-domain) and near (within-domain) analogies and disanalogies. We show that while all LLMs can largely recognize near analogies, even the largest ones struggle with far analogies in a zero-shot setting, with GPT4.0 scoring below random. Guiding the models through solved examples and chain-of-thought reasoning enhances their analogical reasoning ability. Yet even in the few-shot setting, the best model performs only halfway between random and humans, so ARN opens exciting directions for computational analogical reasoners.

Paper Structure (50 sections, 1 equation, 8 figures, 6 tables)

Figures (8)

  • Figure 1: Analogical reasoning over narratives ($ARN$): a binary task of distinguishing between analogous narrative $A$ and distractor $N$ for the query narrative $Q$. Here, $A$ represents a far analogous narrative (forming a relational system mapping) to $Q$, while $N$ is a near disanalogy (having only surface similarities).
  • Figure 2: Our proposed framework for evaluating analogical reasoning on narratives, which culminates in $ARN$: 1. We start by extracting elements of narratives; 2. We then match narratives based on similarities of their extracted elements, creating corresponding mappings; 3. Based on the combinations of these mappings, and given the precedence of system mappings, pairs of narratives are organized into four categories of far/near analogies and far/near distractors to create the $ARN$ benchmark, which evaluates LLMs' analogical reasoning in distinct scenarios.
  • Figure 3: Interplay between surface and system similarities and analogical categories, following Gentner (1983) and Holyoak and Thagard (1996).
  • Figure 4: GPT4.0 and UnifiedQA-11B's performance grouped jointly by the type of analogies and distractors.
  • Figure 5: Performance of Llama-2, GPT3.5, and GPT4.0 on $ARN$ where $\{0, 2, 4, 6\}$ randomly chosen solved demonstrations from both far and near analogies were shown to the model as hints. Demonstrations were provided both as normally solved demonstrations and Chain-of-Thought reasoning, denoted as CoT.
  • ...and 3 more figures
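
The four categories and the binary task described in the abstract and in Figures 1 to 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `categorize`, `Instance`, `accuracy`, and `surface_baseline` are hypothetical, the example narratives are invented for illustration, and the word-overlap baseline is only a stand-in for a reasoner that relies on surface similarity.

```python
from dataclasses import dataclass
from typing import Callable, List


def categorize(system_mapping: bool, surface_mapping: bool) -> str:
    """Label a narrative pair, assuming system mappings take precedence:
    a system mapping makes the pair an analogy, and a surface mapping
    makes it 'near' (within-domain) rather than 'far' (cross-domain)."""
    kind = "analogy" if system_mapping else "disanalogy"
    distance = "near" if surface_mapping else "far"
    return f"{distance} {kind}"


@dataclass
class Instance:
    query: str       # query narrative Q
    analogy: str     # analogous narrative A
    distractor: str  # distractor narrative N


def accuracy(instances: List[Instance],
             choose: Callable[[str, str, str], str]) -> float:
    """Fraction of instances where `choose` returns the analogous narrative.
    With two options per instance, random guessing scores about 0.5."""
    correct = sum(
        choose(i.query, i.analogy, i.distractor) == i.analogy
        for i in instances
    )
    return correct / len(instances)


def _tokens(text: str) -> set:
    # Crude tokenization: lowercase, drop periods, split on whitespace.
    return set(text.lower().replace(".", "").split())


def surface_baseline(query: str, first: str, second: str) -> str:
    """Hypothetical baseline: pick the option sharing more words with the query."""
    q = _tokens(query)
    return first if len(q & _tokens(first)) >= len(q & _tokens(second)) else second


# Invented example: a far analogy paired with a near disanalogy (as in Figure 1).
example = Instance(
    query="The farmer planted seeds early and harvested a rich crop.",
    analogy="The student studied for months and passed the exam easily.",
    distractor="The farmer planted seeds late and the crop failed.",
)
```

On this invented instance the overlap baseline picks the distractor, because the distractor shares the query's domain vocabulary while the analogy shares only its relational structure. This mirrors the kind of case the abstract reports as hardest for LLMs in the zero-shot setting.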