KnowTuning: Knowledge-aware Fine-tuning for Large Language Models

Yougang Lyu, Lingyong Yan, Shuaiqiang Wang, Haibo Shi, Dawei Yin, Pengjie Ren, Zhumin Chen, Maarten de Rijke, Zhaochun Ren

TL;DR

KnowTuning addresses the insufficiency of knowledge awareness in vanilla fine-tuning by introducing a dual-stage approach: fine-grained knowledge augmentation to surface difficult atomic facts, and coarse-grained knowledge comparison to ensure completeness, factuality, and logicality. It relies on perplexity-based filtering and rewriting to create fine-grained QA pairs, and uses deletion, revision, and shuffling to compose robust comparison sets optimized with a DPO-based objective plus SFT. Across generic and medical QA tasks, KnowTuning improves completeness, factuality, and logicality, while also increasing the proportion of correct fine-grained facts and reducing factual errors. The method demonstrates that explicit, multi-level knowledge awareness during fine-tuning yields robust performance gains across model sizes and domains, with practical implications for knowledge-intensive NLP applications.
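The data-construction steps named above (perplexity-based filtering, deletion, shuffling) can be illustrated with a minimal sketch. It assumes that atomic facts and their per-token log-probabilities have already been obtained elsewhere. The function names, the `ratio` cutoff, and the `drop` count are hypothetical choices for illustration, not the authors' implementation. Revision is omitted because it needs a rewriting model.

```python
import math
import random

def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def select_difficult(facts, logprobs_per_fact, ratio=0.5):
    """Keep the highest-perplexity facts (`ratio` is an assumed cutoff)."""
    scored = sorted(
        zip(facts, logprobs_per_fact),
        key=lambda pair: perplexity(pair[1]),
        reverse=True,
    )
    keep = max(1, int(len(facts) * ratio))
    return [fact for fact, _ in scored[:keep]]

def incomplete_variant(facts, rng, drop=1):
    """Deletion: remove `drop` randomly chosen facts to build an incomplete answer."""
    dropped = set(rng.sample(range(len(facts)), drop))
    return [fact for i, fact in enumerate(facts) if i not in dropped]

def illogical_variant(facts, rng):
    """Shuffling: reorder facts so the order differs from the original."""
    if len(set(facts)) < 2:
        return list(facts)  # nothing to reorder
    shuffled = list(facts)
    while shuffled == list(facts):
        rng.shuffle(shuffled)
    return shuffled

if __name__ == "__main__":
    rng = random.Random(0)
    facts = ["fact A", "fact B", "fact C"]
    print(incomplete_variant(facts, rng))
    print(illogical_variant(facts, rng))
```

Each variant would serve as the less reliable side of a comparison pair, with the original answer as the more reliable side.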

Abstract

Despite their success at many natural language processing (NLP) tasks, large language models (LLMs) still struggle to effectively leverage knowledge for knowledge-intensive tasks, manifesting limitations such as generating incomplete, non-factual, or illogical answers. These limitations stem from inadequate knowledge awareness of LLMs during vanilla fine-tuning. To address these problems, we propose a knowledge-aware fine-tuning (KnowTuning) method to improve fine-grained and coarse-grained knowledge awareness of LLMs. We devise a fine-grained knowledge augmentation stage to train LLMs to identify difficult fine-grained knowledge in answers. We also propose a coarse-grained knowledge comparison stage to train LLMs to distinguish between reliable and unreliable knowledge, in three aspects: completeness, factuality, and logicality. Extensive experiments on both generic and medical question answering (QA) datasets confirm the effectiveness of KnowTuning, through automatic and human evaluations, across various sizes of LLMs. We further verify that KnowTuning generates more facts with a lower factual error rate under fine-grained facts evaluation.
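The comparison stage trains the model to prefer reliable over unreliable answers. A minimal sketch of a DPO-style preference loss combined with a supervised fine-tuning (SFT) term is given below, using the standard DPO formulation on sequence-level log-probabilities. The values of `beta` and `alpha`, and the way the two terms are weighted, are assumptions for illustration and are not taken from the paper.

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one pair, given sequence log-probabilities.

    loss = -log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l)))
    `beta` is an assumed temperature.
    """
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # Numerically stable -log(sigmoid(margin)) = softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

def combined_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected,
                  sft_nll, beta=0.1, alpha=1.0):
    """DPO term plus an SFT negative log-likelihood term (`alpha` is an assumed weight)."""
    preference = dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta)
    return preference + alpha * sft_nll

if __name__ == "__main__":
    # The policy favors the reliable answer more than the reference does,
    # so the preference loss drops below log(2).
    print(dpo_loss(-10.0, -14.0, -12.0, -12.0))
```

When the policy and reference agree exactly, the margin is zero and the preference loss equals log 2. It falls as the policy assigns relatively more probability to the reliable answer.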

Paper Structure

This paper contains 33 sections, 19 equations, 10 figures, and 5 tables.

Figures (10)

  • Figure 1: Illustrations of vanilla fine-tuned LLM lacking knowledge awareness. (a) Vanilla fine-tuned LLM struggles to identify the fine-grained knowledge to answer a specific question precisely. (b) Vanilla fine-tuned LLM cannot effectively distinguish between reliable knowledge and unreliable knowledge in answers.
  • Figure 2: Overview of KnowTuning. KnowTuning leverages fine-grained knowledge augmentation and coarse-grained knowledge comparison to improve the knowledge awareness of LLM.
  • Figure 3: Prompts for GPT-4 evaluation.
  • Figure 4: Instructions for human evaluation.
  • Figure 5: Prompts for extracting atomic knowledge in the answer (Min et al., 2023).
  • ...and 5 more figures