AILS-NTUA at SemEval-2026 Task 12: Graph-Based Retrieval and Reflective Prompting for Abductive Event Reasoning

Nikolas Karafyllis; Maria Lymperaiou; Giorgos Filandrianos; Athanasios Voulodimos; Giorgos Stamou

Abstract

We present a winning three-stage system for SemEval 2026 Task 12: Abductive Event Reasoning that combines graph-based retrieval, LLM-driven abductive reasoning with prompt design optimized through reflective prompt evolution, and post-hoc consistency enforcement; our system ranks first on the evaluation-phase leaderboard with an accuracy score of 0.95. Cross-model error analysis across 14 models (7 families) reveals three shared inductive biases: causal chain incompleteness, proximate cause preference, and salience bias, whose cross-family convergence (51% cause-count reduction) indicates systematic rather than model-specific failure modes in multi-label causal reasoning.

Paper Structure (64 sections, 1 equation, 21 figures, 18 tables, 1 algorithm)

Introduction
Background
Task description
Related work
System Overview
Retrieval: Distractor Filtering
Hybrid Document Graphs.
Topic-Wide Aggregation.
Inference
Structured Prompting.
Prompt Design.
Self-Consistency.
Post-Hoc Consistency Enforcement
Experimental Setup
Dataset
...and 49 more sections

Figures (21)

Figure 1: System pipeline. Stage 1 constructs a hybrid document graph (Figure 2), selects dense/sparse entry points, retrieves the connected component, and filters disconnected distractors. Stage 2 performs structured analysis-before-answer prompting with self-consistency. Stage 3 applies eight post-hoc consistency heuristics.
Figure 2: Hybrid document-graph retrieval in three steps. Step 1: Build a hybrid similarity graph ($\alpha = 0.7$ dense $+$ $0.3$ sparse); disconnected documents ($d_9$--$d_{12}$) are potential distractors. Step 2: At query time, pick entry points from dense and sparse signals (3 + 2, deduplicated). Step 3: Retrieve the full connected component from the seeds, filter disconnected documents, and pass the selected topic context to the LLM reasoner.
Figure 3: Dataset composition across splits.
Figure 4: Answer frequency by option.
Figure 5: Distribution of answer cardinality.
...and 16 more figures
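
The three retrieval steps in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the edge `threshold` of 0.5, and the toy similarity values are assumptions introduced here. Only the 0.7/0.3 dense/sparse weighting and the 3 + 2 deduplicated entry points come from the caption. Similarity scores are taken as precomputed inputs, so no embedding or BM25 library is assumed.

```python
from collections import deque

ALPHA = 0.7  # dense weight from the Figure 2 caption; sparse weight is 1 - ALPHA


def build_hybrid_graph(dense_sim, sparse_sim, threshold=0.5):
    """Step 1: link documents whose hybrid similarity reaches `threshold`.

    `dense_sim` and `sparse_sim` are square pairwise similarity matrices.
    The threshold value is an assumption, not taken from the paper.
    """
    n = len(dense_sim)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            score = ALPHA * dense_sim[i][j] + (1 - ALPHA) * sparse_sim[i][j]
            if score >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def pick_seeds(dense_q, sparse_q, k_dense=3, k_sparse=2):
    """Step 2: top-k entry points per signal, deduplicated, order preserved."""
    def top(scores, k):
        return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

    seeds = []
    for i in top(dense_q, k_dense) + top(sparse_q, k_sparse):
        if i not in seeds:
            seeds.append(i)
    return seeds


def retrieve_component(adj, seeds):
    """Step 3: breadth-first search from the seeds; unreachable docs are dropped."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return sorted(seen)


# Toy example (hypothetical scores): documents 0-2 are related, document 3 is a distractor.
dense = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.1],
    [0.2, 0.8, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
sparse = [
    [1.0, 0.6, 0.1, 0.0],
    [0.6, 1.0, 0.5, 0.0],
    [0.1, 0.5, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
graph = build_hybrid_graph(dense, sparse)
# One seed per signal here, because the toy corpus has only four documents.
seeds = pick_seeds([0.9, 0.3, 0.2, 0.4], [0.8, 0.1, 0.1, 0.2], k_dense=1, k_sparse=1)
context = retrieve_component(graph, seeds)
```

In this toy run the isolated document never enters `context`, which mirrors how the caption describes disconnected documents being filtered as potential distractors. Note that a distractor chosen as a seed would still be kept, so the seed counts matter on small corpora.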
