Controllable Logical Hypothesis Generation for Abductive Reasoning in Knowledge Graphs
Yisen Gao, Jiaxin Bai, Tianshi Zheng, Qingyun Sun, Ziwei Zhang, Jianxin Li, Yangqiu Song, Xingcheng Fu
TL;DR
The paper tackles controllable abductive reasoning over knowledge graphs, where long, complex hypotheses are hard to control and prone to oversensitivity. It introduces CtrlHGen, a two-stage framework that combines data augmentation based on sub-logical decomposition with supervised pretraining, followed by reinforcement learning that uses a smoothed semantic reward and a condition-adherence reward, optimized via Group Relative Policy Optimization. Key contributions include a formal problem definition, a sub-logic augmentation strategy, and a reward design that stabilizes learning while enforcing control constraints. Empirical results on three KG benchmarks show improved controllability and semantic similarity under various control signals, illustrating practical value for targeted, structured hypothesis generation in domains such as clinical diagnosis and scientific discovery.
Abstract
Abductive reasoning in knowledge graphs aims to generate plausible logical hypotheses from observed entities, with broad applications in areas such as clinical diagnosis and scientific discovery. However, due to a lack of controllability, a single observation may yield numerous plausible but redundant or irrelevant hypotheses on large-scale knowledge graphs. To address this limitation, we introduce the task of controllable hypothesis generation to improve the practical utility of abductive reasoning. This task faces two key challenges when controlling the generation of long and complex logical hypotheses: hypothesis space collapse and hypothesis oversensitivity. To address these challenges, we propose CtrlHGen, a Controllable logical Hypothesis Generation framework for abductive reasoning over knowledge graphs, trained in a two-stage paradigm of supervised learning followed by reinforcement learning. To mitigate hypothesis space collapse, we design a dataset augmentation strategy based on sub-logical decomposition, enabling the model to learn complex logical structures by leveraging semantic patterns in simpler components. To address hypothesis oversensitivity, we incorporate smoothed semantic rewards, including Dice and Overlap scores, and introduce a condition-adherence reward to guide generation toward user-specified control constraints. Extensive experiments on three benchmark datasets demonstrate that our model not only adheres better to control conditions but also achieves higher semantic similarity than the baselines.
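As a rough illustration of the Dice and Overlap scores named in the abstract, the sketch below computes the standard set-based definitions of both coefficients, alongside Jaccard for comparison. It assumes the scores compare the set of observed entities with the set of entities a hypothesis concludes; that pairing, the function names, the toy entity names, and the handling of empty sets are illustrative assumptions, not details taken from the paper.

```python
def jaccard(observed: set, concluded: set) -> float:
    """Jaccard index: |A ∩ B| / |A ∪ B| (0.0 if both sets are empty)."""
    union = observed | concluded
    if not union:
        return 0.0  # assumption: empty inputs score zero
    return len(observed & concluded) / len(union)


def dice(observed: set, concluded: set) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    total = len(observed) + len(concluded)
    if total == 0:
        return 0.0  # assumption: empty inputs score zero
    return 2 * len(observed & concluded) / total


def overlap(observed: set, concluded: set) -> float:
    """Overlap coefficient: |A ∩ B| / min(|A|, |B|)."""
    smaller = min(len(observed), len(concluded))
    if smaller == 0:
        return 0.0  # assumption: an empty set on either side scores zero
    return len(observed & concluded) / smaller


# Hypothetical toy example: four observed entities, three concluded by a hypothesis.
observed_entities = {"e1", "e2", "e3", "e4"}
concluded_entities = {"e3", "e4", "e5"}

scores = {
    "jaccard": jaccard(observed_entities, concluded_entities),
    "dice": dice(observed_entities, concluded_entities),
    "overlap": overlap(observed_entities, concluded_entities),
}
print(scores)
```

For any pair of non-empty sets, the Dice score is at least the Jaccard score and the Overlap score is at least the Dice score, so a partially correct hypothesis is penalized less sharply under these measures. This is one plausible reading of why they act as smoother reward signals.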
