Common Sense vs. Morality: The Curious Case of Narrative Focus Bias in LLMs

Saugata Purkayastha; Pranav Kushare; Pragya Paramita Pal; Sukannya Purkayastha

TL;DR

This work uncovers a critical limitation of current LLMs -- their tendency to prioritize moral reasoning over commonsense understanding -- and underscores the need for enhanced reasoning-aware training to improve the commonsense robustness of large language models.

Abstract

Large Language Models (LLMs) are increasingly deployed across diverse real-world applications and user communities. As such, it is crucial that these models remain both morally grounded and knowledge-aware. In this work, we uncover a critical limitation of current LLMs -- their tendency to prioritize moral reasoning over commonsense understanding. To investigate this phenomenon, we introduce CoMoral, a novel benchmark dataset containing commonsense contradictions embedded within moral dilemmas. Through extensive evaluation of ten LLMs across different model sizes, we find that existing models consistently struggle to identify such contradictions without prior signal. Furthermore, we observe a pervasive narrative focus bias, wherein LLMs more readily detect commonsense contradictions when they are attributed to a secondary character rather than the primary (narrator) character. Our comprehensive analysis underscores the need for enhanced reasoning-aware training to improve the commonsense robustness of large language models.

Paper Structure

This paper contains 19 sections, 1 equation, 7 figures, and 2 tables.

Introduction
Related Work
CoMoral: A Dataset of commonsense contradictions within moral dilemmas
Data Curation
Data Validation
Final Dataset Analysis
Experimental Setup
Task Definition
Evaluation Metric
Models
Prompts
Results and Discussions
RQ1. How effectively do LLMs identify common-sense contradictions?
RQ2. How does the narrator’s role in exhibiting the common-sense contradiction affect LLM responses?
RQ3. How do model family, scale, and reasoning type impact overall performance on common-sense contradictions?
...and 4 more sections

Figures (7)

Figure 1: Different responses of LLaMa 8B Instruct to the same common-sense contradiction, 'Moonlight on a New Moon,' with Query 1 corresponding to the narrator (primary) and Query 2 corresponding to a secondary character showing 'narrative focus' bias.
Figure 2: Overview of scenario and contradiction characteristics across our dataset, CoMoral.
Figure 3: Two-shot prompt used to generate instances for our dataset, CoMoral.
Figure 4: Explicit and Implicit prompt formats used for LLM evaluation.
Figure 5: Prompt for LLM-as-a-judge.
...and 2 more figures
