Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling

Hang Zheng; Hongshen Xu; Yuncong Liu; Lu Chen; Pascale Fung; Kai Yu

TL;DR

This work tackles LLM hallucinations by explicitly modeling knowledge boundaries and splitting generation into a fast stage that emits confidence-labeled responses and a slower stage that refines them. The Explicit Knowledge Boundary Modeling (EKBM) framework pairs a fast-thinking module, whose high-confidence predictions can be used immediately, with a slow-thinking refinement module that revisits uncertain outputs. Training follows an alignment objective and a pipeline that combines Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO), guided by a tailored Weighted-F1 metric. Reliability is measured with Quality-F1 and Optimal-F1, both instances of Weighted-F1, on dialogue state tracking tasks. Experiments show improved self-awareness, reliability, and scalable efficiency, with refinement raising accuracy at manageable overhead. The approach also generalizes to mathematical reasoning (e.g., GSM8K) and offers a practical paradigm for deploying reliable LLMs in error-sensitive settings by balancing immediate utility against post-refinement performance.

Abstract

Large language models (LLMs) are prone to hallucination stemming from misaligned self-awareness, particularly when processing queries that exceed their knowledge boundaries. Existing mitigation strategies rely on uncertainty estimation or query rejection, but they suffer from computational inefficiency and sacrifice helpfulness. To address these issues, we propose the Explicit Knowledge Boundary Modeling (EKBM) framework, which integrates fast and slow reasoning systems to harmonize reliability and usability. The framework first employs a fast-thinking model to generate confidence-labeled responses, so that high-confidence outputs can be used immediately, whereas uncertain predictions trigger a slow refinement model that improves their accuracy. To align model behavior with this objective, we introduce a hybrid training pipeline that enhances self-awareness without degrading task performance. Evaluations on dialogue state tracking tasks demonstrate that EKBM achieves superior model reliability over uncertainty-based baselines. Further analysis reveals that refinement substantially boosts accuracy while maintaining low computational overhead. The framework establishes a scalable paradigm for deploying reliable LLMs in error-sensitive applications, effectively balancing accuracy and practical utility.
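The fast/slow routing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `LabeledPrediction` type, the boolean `confident` label, the `run_pipeline` function, and the toy refiner are all hypothetical names chosen for illustration, and the abstract does not specify how confidence labels are encoded. The sketch only shows the control flow, in which confident predictions are accepted as-is and uncertain ones are passed to a refinement step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class LabeledPrediction:
    """One fast-model output, e.g. a dialogue-state slot/value pair.

    The boolean `confident` field is an assumed encoding of the
    confidence label; the paper's actual label format may differ.
    """
    slot: str
    value: str
    confident: bool


def run_pipeline(
    predictions: List[LabeledPrediction],
    refine: Callable[[LabeledPrediction], str],
) -> Tuple[Dict[str, str], int]:
    """Accept confident predictions directly; refine the uncertain ones.

    Returns the final slot->value state and the number of refinement
    calls, which is a rough proxy for the extra compute spent.
    """
    final_state: Dict[str, str] = {}
    num_refined = 0
    for pred in predictions:
        if pred.confident:
            # High-confidence output: usable immediately, no extra cost.
            final_state[pred.slot] = pred.value
        else:
            # Uncertain output: hand off to the (hypothetical) slow model.
            final_state[pred.slot] = refine(pred)
            num_refined += 1
    return final_state, num_refined


def toy_refiner(pred: LabeledPrediction) -> str:
    """Stand-in for a slow refinement model; uses a fixed lookup table."""
    corrections = {"hotel-stars": "5"}
    return corrections.get(pred.slot, pred.value)


fast_outputs = [
    LabeledPrediction("hotel-area", "north", confident=True),
    LabeledPrediction("hotel-stars", "4", confident=False),
]
state, refined = run_pipeline(fast_outputs, toy_refiner)
print(state, refined)
```

Under these assumptions, the refinement cost grows with the number of uncertain predictions rather than with the total number of predictions, which is one way to read the abstract's claim of low computational overhead.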
