Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction

Zepeng Ding; Wenhao Huang; Jiaqing Liang; Deqing Yang; Yanghua Xiao

TL;DR

The paper tackles the challenge of low recall in large language models when extracting relational triples from complex sentences. It introduces a two-stage evaluation-filtering framework in which a lightweight evaluation model generates high-precision candidate entity pairs that are embedded into prompts to guide LLM-based extraction, complemented by parameter-efficient fine-tuning (LoRA) for the LLMs. The evaluation model uses a Transformer-based encoder and a 2D decoder to produce a token-pair scoring matrix $A'$, from which a pair score $F(s,o)$ is derived to select positive entity pairs via $F(s,o) > 0$. Across multiple datasets (NYT series, Wiki-KBP, SKE21, HacRED) and several LLMs, the approach consistently improves recall for complex sentences while maintaining or improving precision, and ablations confirm the necessity of both stages and the filtering mechanism. The mechanism is also applicable to traditional small extraction models, highlighting its practical impact for robust knowledge graph construction and information extraction tasks.
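The filtering rule $F(s,o) > 0$ can be illustrated with a small sketch. The summary does not say how $F(s,o)$ is computed from $A'$, so the code below assumes, purely for illustration, that it is the mean of the $A'$ entries over the subject-by-object token block. The names `pair_score`, `select_positive_pairs`, and the toy matrix are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices of an entity


def pair_score(matrix: List[List[float]], subj: Span, obj: Span) -> float:
    """Assumed F(s, o): mean of A' over the subject-by-object token block."""
    rows = range(subj[0], subj[1] + 1)
    cols = range(obj[0], obj[1] + 1)
    values = [matrix[i][j] for i in rows for j in cols]
    return sum(values) / len(values)


def select_positive_pairs(
    matrix: List[List[float]], entities: Dict[str, Span]
) -> List[Tuple[str, str]]:
    """Keep ordered entity pairs (s, o), s != o, whose score exceeds 0."""
    return [
        (s, o)
        for s, s_span in entities.items()
        for o, o_span in entities.items()
        if s != o and pair_score(matrix, s_span, o_span) > 0
    ]


# Toy token-pair scoring matrix for the tokens: "Paris", "is", "in", "France".
A_prime = [
    [-1.0, -1.0, -1.0, 2.0],
    [-1.0, -1.0, -1.0, -1.0],
    [-1.0, -1.0, -1.0, -1.0],
    [-0.5, -1.0, -1.0, -1.0],
]
entities = {"Paris": (0, 0), "France": (3, 3)}
positive_pairs = select_positive_pairs(A_prime, entities)
```

In this toy case only the ordered pair with a positive score survives the filter; a different aggregation (for example a maximum over the block) would plug into `pair_score` without changing the selection step.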

Abstract

Relational triple extraction, which outputs a set of triples from long sentences, plays a vital role in knowledge acquisition. Large language models can accurately extract triples from simple sentences through few-shot learning or fine-tuning when given appropriate instructions. However, they often miss triples when extracting from complex sentences. In this paper, we design an evaluation-filtering framework that integrates large language models with small models for relational triple extraction tasks. The framework includes an evaluation model that can extract related entity pairs with high precision. We propose a simple labeling principle and a deep neural network to build the model, and embed its outputs as prompts into the extraction process of the large model. We conduct extensive experiments to demonstrate that the proposed method can assist large language models in obtaining more accurate extraction results, especially from complex sentences containing multiple relational triples. Our evaluation model can also be embedded into traditional extraction models to enhance their extraction precision from complex sentences.
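Embedding the evaluation model's output into the LLM prompt can be sketched as follows. The template wording, the function name `build_prompt`, and the example predicates are assumptions made for illustration; they are not the paper's actual instruction template.

```python
from typing import List, Tuple


def build_prompt(
    sentence: str, predicates: List[str], pairs: List[Tuple[str, str]]
) -> str:
    """Assemble an extraction prompt that lists candidate entity pairs as hints."""
    pair_lines = "\n".join(f"- ({s}, {o})" for s, o in pairs) or "- (none)"
    return (
        "Extract all relational triples (subject, predicate, object) "
        "from the sentence.\n"
        f"Allowed predicates: {', '.join(predicates)}\n"
        "Candidate entity pairs that are likely related:\n"
        f"{pair_lines}\n"
        f"Sentence: {sentence}\n"
        "Triples:"
    )


prompt = build_prompt(
    sentence="Paris is in France.",
    predicates=["located_in", "capital_of"],
    pairs=[("Paris", "France")],
)
```

The resulting string would then be sent to whichever LLM performs the extraction; the candidate pairs act as hints, so the model is nudged toward pairs it might otherwise skip in a long sentence.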

Paper Structure (24 sections, 5 equations, 5 figures, 5 tables)

Introduction
Related Works
Large Language Models for Relational Triple Extraction
Model Collaboration in the Era of Large Language Models
Methods
Solution Framework
Basic Idea of Evaluation Model
Self Labeling
Evaluation Model Structure
Encoder
2-dim Decoder
Loss Function
Candidate Pairs Evaluation
Parameter-Efficient Fine-Tuning for LLMs
Instruction Template
...and 9 more sections

Figures (5)

Figure 1: (a) Illustration of multiple relational triple extraction by LLMs, based on ChatGPT or Vicuna-13B. Both models are given appropriate instructions and a limited list of predicates, and are asked to extract as many triples as possible. (b) Compelling the LLM to generate more triples results in repetitive outputs.
Figure 2: Model framework. On the bottom left is an arbitrary entity-extraction model. On the bottom right is our evaluation model, which outputs a token pair scoring matrix.
Figure 3: An example of the workflow of our Evaluation-Filtering method.
Figure 4: This sentence contains 6 entity pairs, but only 2 pairs are positive.
Figure 5: Recall and F1-score curves of Qwen-7B (w/ peft) on NYT10, with and without our evaluation-filtering method. "Minimum # of triples" means we only consider sentences that contain more triples than this value. Note that the axes do not start from 0.
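The "Minimum # of triples" setting in Figure 5 can be read as a filter on the evaluation set. The sketch below computes recall only over sentences with more gold triples than a threshold. It assumes micro-averaging (pooling hits and gold counts across sentences), which the caption does not specify; `recall_above` and the toy data are hypothetical.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def recall_above(
    gold: List[Set[Triple]], pred: List[Set[Triple]], minimum: int
) -> float:
    """Micro-averaged recall over sentences with more than `minimum` gold triples."""
    hits = 0
    total = 0
    for gold_set, pred_set in zip(gold, pred):
        if len(gold_set) > minimum:
            hits += len(gold_set & pred_set)
            total += len(gold_set)
    return hits / total if total else 0.0


# Toy data: one simple sentence and one sentence with three gold triples.
t1 = ("Paris", "located_in", "France")
t2 = ("Berlin", "located_in", "Germany")
t3 = ("Berlin", "capital_of", "Germany")
t4 = ("Germany", "contains", "Berlin")
gold = [{t1}, {t2, t3, t4}]
pred = [{t1}, {t2}]

recall_all = recall_above(gold, pred, minimum=0)
recall_complex = recall_above(gold, pred, minimum=2)
```

Raising `minimum` restricts the evaluation to sentences with more triples, which is where missed extractions weigh most heavily on recall.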
