Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling

Hang Zheng, Hongshen Xu, Yuncong Liu, Lu Chen, Pascale Fung, Kai Yu

TL;DR

This work tackles LLM hallucinations by explicitly modeling knowledge boundaries and splitting output generation into fast, confidence-labeled responses and slower, refinement-driven processing. The Explicit Knowledge Boundary Modeling (EKBM) framework pairs a fast-thinking module, whose high-confidence predictions are used directly, with a slow-thinking refinement module that improves uncertain outputs. An alignment objective guides the framework, and a training pipeline combines Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) with a tailored Weighted-F1 metric. Reliability is evaluated on dialogue state tracking tasks via Quality-F1 and Optimal-F1, both instances of Weighted-F1. Experiments show improved self-awareness, reliability, and scalable efficiency, and refinement boosts accuracy with manageable overhead. The approach generalizes to mathematical reasoning (e.g., GSM8K) and provides a practical paradigm for deploying reliable LLMs in error-sensitive settings by balancing immediate utility and post-refinement performance.

Abstract

Large language models (LLMs) are prone to hallucination stemming from misaligned self-awareness, particularly when processing queries that exceed their knowledge boundaries. While existing mitigation strategies employ uncertainty estimation or query rejection mechanisms, they are computationally inefficient and sacrifice helpfulness. To address these issues, we propose the Explicit Knowledge Boundary Modeling (EKBM) framework, which integrates fast and slow reasoning systems to harmonize reliability and usability. The framework first employs a fast-thinking model to generate confidence-labeled responses, enabling immediate use of high-confidence outputs, whereas uncertain predictions trigger a slow refinement model that improves accuracy. To align model behavior with our proposed objective, we propose a hybrid training pipeline that enhances self-awareness without degrading task performance. Evaluations on dialogue state tracking tasks demonstrate that EKBM achieves superior model reliability over uncertainty-based baselines. Further analysis reveals that refinement substantially boosts accuracy while maintaining low computational overhead. The framework establishes a scalable paradigm for deploying reliable LLMs in error-sensitive applications, effectively balancing accuracy and practical utility.
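
The fast/slow routing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the label names ("sure" / "unsure"), the LabeledValue type, the route function, and the toy refiner are all hypothetical, introduced only to show how confidence-labeled outputs might be split between immediate use and refinement.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical label names; the source only says outputs are "confidence-labeled".
SURE, UNSURE = "sure", "unsure"


@dataclass
class LabeledValue:
    """One fast-model prediction together with its confidence label."""
    value: str
    confidence: str  # SURE or UNSURE


def route(
    fast_output: Dict[str, LabeledValue],
    refine: Callable[[str, str], str],
) -> Tuple[Dict[str, str], int]:
    """Use high-confidence predictions as-is; send uncertain ones to the slow refiner.

    Returns the final slot-value state and the number of refinement calls made,
    a rough proxy for the extra cost of the slow path.
    """
    final: Dict[str, str] = {}
    refine_calls = 0
    for slot, pred in fast_output.items():
        if pred.confidence == SURE:
            final[slot] = pred.value  # immediately usable
        else:
            final[slot] = refine(slot, pred.value)  # slow path
            refine_calls += 1
    return final, refine_calls


def toy_refiner(slot: str, draft: str) -> str:
    """Stand-in for a slow refinement model (assumption): a fixed lookup table."""
    corrections = {"hotel-area": "north"}
    return corrections.get(slot, draft)


# Toy dialogue-state example: one confident slot, one uncertain slot.
fast = {
    "hotel-name": LabeledValue("acorn guest house", SURE),
    "hotel-area": LabeledValue("south", UNSURE),
}
final_state, calls = route(fast, toy_refiner)
```

In this sketch only the uncertain slot incurs a refinement call, which mirrors the claim that refinement cost scales with the number of low-confidence predictions rather than with every output.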

Paper Structure

This paper contains 46 sections, 9 equations, 8 figures, 15 tables.

Figures (8)

  • Figure 1: A case study on Dialog State Tracking: comparison of different alignment objectives.
  • Figure 2: The EKBM framework and the data construction methods.
  • Figure 3: Comparison of different methods under foundation models of varying task ability. Panels (a), (b), and (c) illustrate the JGA after refinement (%) on three datasets. Chosen representative methods: Prompt-based: Direct; Uncertainty-based: Prob; Reliability: SFT+DPO Post Training.
  • Figure 4: Comparison of DPO preference strategies on model behavior and performance.
  • Figure 5: Percentage of prediction types among different methods. Chosen representative methods: Prompt-based: Direct; Uncertainty-based: Prob; Reliability: SFT+DPO Post Training.
  • ...and 3 more figures