General LLMs as Instructors for Domain-Specific LLMs: A Sequential Fusion Method to Integrate Extraction and Editing

Xin Zhang, Tianjie Ju, Huijia Liang, Ying Fu, Qin Zhang

TL;DR

This paper introduces Sequential Fusion, a two-stage method for integrating knowledge from complex contexts into LLMs: general LLMs first perform relation extraction to acquire knowledge from complex texts, and the extracted knowledge is then used to update domain-specific LLMs through Knowledge Editing (KE).

Abstract

Interest in updating Large Language Models (LLMs) without retraining from scratch is substantial, but the task poses several challenges. This is particularly true when the update data require domain-expert reasoning across extensive texts and only limited samples are available. We term this scenario Few-Shot Domain-Expert Reasoning for Updating LLMs (FDoR-UL). Traditional methods such as Low-Rank Adaptation (LoRA) and Retrieval Augmented Generation (RAG) are inadequate in this setting, as is particularly evident in our exploration of a specific medical dataset that epitomizes the distinct needs of FDoR-UL. To tackle this challenge, we introduce a Sequential Fusion method to integrate knowledge from complex contexts into LLMs. This method employs a two-stage framework: general LLMs first perform relation extraction to acquire knowledge from complex texts, and domain-specific LLMs are then updated through Knowledge Editing (KE). With our method, domain-specific LLMs achieved 71.7% accuracy (an average gain of 39.1%) in question-answering tasks. We further extended our evaluation to a novel economics-management dataset that we developed, where our method achieved 75.0% accuracy (an average gain of 45.0%). These findings underscore the effectiveness and flexibility of our approach to FDoR-UL across domains.
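The two-stage framework described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Triple` type, the `sequential_fusion` function, and the toy extractor and editor are hypothetical stand-ins for a general LLM performing relation extraction and for a knowledge-editing step on a domain-specific LLM.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Triple:
    """One extracted relation (hypothetical schema, not taken from the paper)."""
    head: str
    relation: str
    tail: str


# Assumed interfaces: stage 1 maps text to triples, stage 2 consumes triples.
Extractor = Callable[[str], List[Triple]]
Editor = Callable[[List[Triple]], None]


def sequential_fusion(documents: Iterable[str],
                      extract: Extractor,
                      edit: Editor) -> List[Triple]:
    """Run extraction over all documents, then hand the result to the editor."""
    triples: List[Triple] = []
    for doc in documents:
        triples.extend(extract(doc))       # stage 1: knowledge acquisition
    unique = list(dict.fromkeys(triples))  # drop duplicates, keep first-seen order
    edit(unique)                           # stage 2: update the domain model
    return unique


# Toy stand-in for a general LLM: a regex instead of a model call.
_PATTERN = re.compile(r"(\w+) is combined with (\w+)")


def toy_extract(text: str) -> List[Triple]:
    return [Triple(h, "combined_with", t) for h, t in _PATTERN.findall(text)]


class ToyEditor:
    """Toy stand-in for knowledge editing: it only records the facts it receives."""

    def __init__(self) -> None:
        self.memory: List[Triple] = []

    def __call__(self, triples: List[Triple]) -> None:
        self.memory.extend(triples)


docs = [
    "DrugA is combined with DrugB in the first passage.",
    "DrugA is combined with DrugB in the second passage.",
]
editor = ToyEditor()
facts = sequential_fusion(docs, toy_extract, editor)
```

Keeping the extractor and the editor behind plain callables reflects the sequential nature of the method: the second stage only sees the structured output of the first, so either stage could be swapped out independently.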

Paper Structure (41 sections, 7 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: An illustration of our proposed Sequential Fusion method and how it compares with traditional approaches. Our method employs a two-stage framework: first, general LLMs perform relation extraction to obtain structured knowledge from complex texts; second, the extracted knowledge is used to update domain-specific LLMs through knowledge editing.
  • Figure 2: An overview of our prompt strategy, organized into the INSTRUCTION, REASON, FORMAT, and TIPS modules, for the relation extraction task.
  • Figure 3: A workflow diagram illustrating the process of updating domain-specific LLMs through Sequential Fusion. This includes extracting structured knowledge, converting it into natural language, and editing the LLMs using the IKE method.
  • Figure 4: The process of the SKT module. This diagram illustrates how structured knowledge is converted into natural language through the Mapping and Semantic Integration stages.
  • Figure 5: A comparison chart illustrating the question-answering accuracy of LoRA, RAG, and Sequential Fusion on the DCE and MEE for Llama2 7b and Qwen 7b.
  • ...and 2 more figures
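
The captions for Figures 2 to 4 outline three steps: a prompt assembled from the INSTRUCTION, REASON, FORMAT, and TIPS modules, a conversion of structured knowledge into natural language, and an edit using the IKE method. The sketch below illustrates one way these steps could fit together. It is an assumption-laden illustration only: the module texts, the `TEMPLATES` mapping, the `[TEXT]` section, and the "New fact:" prompt layout are all hypothetical, and the edit step is modelled simply as placing the converted facts in the context before a question, which is how in-context editing is commonly described.

```python
from typing import Dict, List

# Module names come from the Figure 2 caption; their contents are up to the user.
MODULE_ORDER = ("INSTRUCTION", "REASON", "FORMAT", "TIPS")


def build_extraction_prompt(modules: Dict[str, str], passage: str) -> str:
    """Concatenate the four prompt modules, then the passage to analyse."""
    missing = [name for name in MODULE_ORDER if name not in modules]
    if missing:
        raise ValueError(f"missing prompt modules: {missing}")
    sections = [f"[{name}]\n{modules[name].strip()}" for name in MODULE_ORDER]
    sections.append(f"[TEXT]\n{passage.strip()}")  # hypothetical section label
    return "\n\n".join(sections)


# Hypothetical relation-to-template mapping for turning triples into sentences.
TEMPLATES = {"combined_with": "{head} is used in combination with {tail}."}


def triple_to_sentence(head: str, relation: str, tail: str) -> str:
    """Map a (head, relation, tail) triple to a natural-language sentence."""
    template = TEMPLATES.get(relation, "{head} {relation} {tail}.")
    return template.format(head=head,
                           relation=relation.replace("_", " "),
                           tail=tail)


def build_edit_prompt(facts: List[str], question: str) -> str:
    """Place the new facts in context ahead of the question (assumed layout)."""
    lines = [f"New fact: {fact}" for fact in facts]
    lines.append(f"Question: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


modules = {
    "INSTRUCTION": "Extract every relation between the entities in the text.",
    "REASON": "Explain briefly why each relation holds.",
    "FORMAT": "Return one (head, relation, tail) triple per line.",
    "TIPS": "Ignore relations that are only speculated about.",
}
extraction_prompt = build_extraction_prompt(
    modules, "DrugA is combined with DrugB.")
sentence = triple_to_sentence("DrugA", "combined_with", "DrugB")
edit_prompt = build_edit_prompt([sentence], "What is DrugA combined with?")
```

Relations without a template fall back to a plain "head relation tail" sentence, so the conversion never fails on an unseen relation name, although the resulting wording may be less natural.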