Learning the Boundary of Solvability: Aligning LLMs to Detect Unsolvable Problems
Dengyun Peng, Qiguang Chen, Bofei Liu, Jiannan Guan, Libo Qin, Zheng Yan, Jinhao Liu, Jianshu Zhang, Wanxiang Che
TL;DR
The paper addresses the problem of distinguishing inherent unsolvability from model capability limits in reasoning tasks. It introduces UnsolvableQA, a dataset constructed via a novel Reverse Construction method and logic-puzzle generators, and UnsolvableRL, a reinforcement learning framework with a dynamic, three-component reward system to train models to solve solvable tasks, detect unsolvability, and calibrate refusals. Empirical results show near-perfect unsolvability detection and substantial gains in solvable-task accuracy, along with the identification of Capability Collapse when unsolvability data is not included. The work demonstrates the necessity of explicit unsolvability data to prevent overconfidence and provides a practical approach for building more reliable AI systems that know when not to answer.
Abstract
Ensuring LLM reliability requires not only solving complex problems but also recognizing when a problem is unsolvable. Current models often struggle to distinguish objective unsolvability (inherent contradictions in the problem) from subjective capability limitations (problems beyond the model's competence), which leads to hallucinations and overconfidence. To address this, we propose UnsolvableQA and UnsolvableRL to solve feasible problems, detect inherent contradictions, and prudently refuse tasks beyond capability. Specifically, we construct UnsolvableQA, a dataset of paired solvable and unsolvable instances derived via a dual-track methodology: programmatic generation for logic puzzles and a novel "Reverse Construction" method that injects contradictions into valid reasoning chains for mathematics. Building on this dataset, we introduce UnsolvableRL, a reinforcement learning framework with three reward components jointly accounting for accuracy, unsolvability, and difficulty. Empirical results show that our approach achieves near-perfect unsolvability detection while also improving accuracy on solvable tasks. Crucially, we identify Capability Collapse, demonstrating that explicit exposure to unsolvable data is indispensable for preventing models from becoming systematically overconfident. Our code and data are available at https://github.com/sfasfaffa/unsolvableQA.
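To make the idea of a three-component reward concrete, the following is a minimal illustrative sketch, not the paper's actual reward. The abstract states only that the reward jointly accounts for accuracy, unsolvability, and difficulty; everything else here is an assumption made for illustration: the `Response` type, the three response labels, the `toy_reward` function, the use of an estimated `pass_rate` as a difficulty proxy, and the `refuse_scale` constant.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical response labels; the paper's real output format is not given in the abstract.
ANSWER, UNSOLVABLE, REFUSE = "answer", "unsolvable", "refuse"


@dataclass
class Response:
    kind: str                     # one of ANSWER, UNSOLVABLE, REFUSE
    answer: Optional[str] = None  # only meaningful when kind == ANSWER


def toy_reward(resp: Response, solvable: bool, gold: Optional[str],
               pass_rate: float, refuse_scale: float = 0.5) -> float:
    """Toy three-part reward (assumed form, for illustration only).

    pass_rate is an assumed difficulty proxy: the fraction of sampled
    attempts that solve the problem (0.0 = never solved, 1.0 = always).
    """
    if not solvable:
        # Unsolvability component: reward flagging the contradiction.
        return 1.0 if resp.kind == UNSOLVABLE else 0.0
    if resp.kind == ANSWER:
        # Accuracy component: reward a correct final answer.
        return 1.0 if resp.answer == gold else 0.0
    if resp.kind == REFUSE:
        # Difficulty component: refusing earns more on rarely solved problems,
        # and always less than a correct answer would.
        return refuse_scale * (1.0 - pass_rate)
    # Declaring a solvable problem unsolvable earns nothing.
    return 0.0
```

Under these assumptions, a correct answer always outscores a refusal, while a refusal on a hard solvable problem outscores a wrong answer, which is one simple way to encode "prudently refuse tasks beyond capability" without rewarding blanket refusal.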
