Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction

Guozheng Li, Peng Wang, Wenjun Ke, Yikai Guo, Ke Ji, Ziyu Shang, Jiajun Liu, Zijie Xu

TL;DR

The paper tackles the weak in-context learning performance of LLMs on relation extraction by addressing two key bottlenecks: selecting relevant demonstrations and enabling strong in-context reasoning. It introduces RE$^4$, a recall-retrieve-reason framework that distills ontological knowledge to generate valid entity-pair queries, retrieves demonstrations from a corpus, and performs in-context reasoning to predict relations, all while being trainable with LoRA. Through joint optimization of recalling and reasoning, and extensive experiments on four RE benchmarks with open-source LLMs, RE$^4$ achieves competitive or state-of-the-art results without large-scale pretraining. The approach demonstrates the practical potential of combining ontology-aware recall with retrieval-augmented ICL for robust, sentence-level RE, with broad implications for deploying open-source models in information extraction tasks.

Abstract

Relation extraction (RE) aims to identify relations between entities mentioned in texts. Although large language models (LLMs) have demonstrated impressive in-context learning (ICL) abilities in various tasks, they still perform poorly compared with most supervised fine-tuned RE methods. Utilizing ICL for RE with LLMs encounters two challenges: (1) retrieving good demonstrations from training examples, and (2) enabling LLMs to exhibit strong ICL abilities in RE. On the one hand, retrieving good demonstrations is a non-trivial process in RE, which easily results in low relevance regarding entities and relations. On the other hand, ICL with an LLM achieves poor performance in RE because RE differs in nature from language modeling or because the LLM is not large enough. In this work, we propose a novel recall-retrieve-reason RE framework that synergizes LLMs with retrieval corpora (training examples) to enable relevant retrieving and reliable in-context reasoning. Specifically, we distill the consistent ontological knowledge from training datasets to let LLMs generate relevant entity pairs grounded in retrieval corpora as valid queries. These entity pairs are then used to retrieve relevant training examples from the retrieval corpora as demonstrations for LLMs to conduct better ICL via instruction tuning. Extensive experiments on different LLMs and RE datasets demonstrate that our method generates relevant and valid entity pairs and boosts ICL abilities of LLMs, achieving competitive or new state-of-the-art performance on sentence-level RE compared to previous supervised fine-tuning methods and ICL-based methods.
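
The recall, retrieve, and reason steps described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the toy corpus, the prompt wording, and the names `build_index`, `retrieve`, `build_reasoning_prompt`, `recall_retrieve_reason`, `stub_recall`, and `stub_reason` are hypothetical. In particular, the two stub functions stand in for the tuned LLM calls, and retrieval is reduced to an exact lookup on entity pairs.

```python
from collections import Counter, defaultdict

# Toy retrieval corpus (hypothetical training examples, not taken from the paper).
CORPUS = [
    {"text": "Marie Curie was born in Warsaw.", "head": "Marie Curie",
     "tail": "Warsaw", "relation": "born_in"},
    {"text": "Steve Jobs founded Apple.", "head": "Steve Jobs",
     "tail": "Apple", "relation": "founder_of"},
    {"text": "Alan Turing was born in London.", "head": "Alan Turing",
     "tail": "London", "relation": "born_in"},
]


def build_index(corpus):
    """Map each (head, tail) entity pair to the training examples that contain it."""
    index = defaultdict(list)
    for example in corpus:
        index[(example["head"], example["tail"])].append(example)
    return index


def retrieve(pairs, index):
    """Retrieve step: recalled pairs that do not occur in the corpus yield nothing."""
    demos = []
    for pair in pairs:
        demos.extend(index.get(tuple(pair), []))
    return demos


def build_reasoning_prompt(demos, test):
    """Reason step input: retrieved demonstrations followed by the test example."""
    blocks = ["Determine the relation between the head and tail entities."]
    for d in demos:
        blocks.append(
            f"Sentence: {d['text']}\nHead: {d['head']}\nTail: {d['tail']}\n"
            f"Relation: {d['relation']}"
        )
    blocks.append(
        f"Sentence: {test['text']}\nHead: {test['head']}\nTail: {test['tail']}\n"
        "Relation:"
    )
    return "\n\n".join(blocks)


def recall_retrieve_reason(test, corpus, recall_fn, reason_fn):
    """Chain the three steps; recall_fn and reason_fn stand in for LLM calls."""
    index = build_index(corpus)
    pairs = recall_fn(test)                       # recall: propose entity-pair queries
    demos = retrieve(pairs, index)                # retrieve: look up demonstrations
    prompt = build_reasoning_prompt(demos, test)  # reason: in-context prediction
    return reason_fn(prompt), demos


def stub_recall(test):
    """Placeholder for the LLM recall step; returns fixed entity pairs."""
    return [("Marie Curie", "Warsaw"), ("Alan Turing", "London"), ("Nobody", "Nowhere")]


def stub_reason(prompt):
    """Placeholder for the LLM reasoning step: most frequent demonstration label."""
    labels = [line[len("Relation: "):] for line in prompt.splitlines()
              if line.startswith("Relation: ")]
    return Counter(labels).most_common(1)[0][0] if labels else None


test_example = {"text": "Ada Lovelace was born in London.",
                "head": "Ada Lovelace", "tail": "London"}
prediction, demos = recall_retrieve_reason(test_example, CORPUS, stub_recall, stub_reason)
print(prediction, len(demos))
```

In a real system, `stub_recall` and `stub_reason` would be replaced by calls to an instruction-tuned LLM. The exact-match lookup only marks where retrieval happens and does not reproduce the retrieval models that the paper compares.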

Paper Structure (22 sections, 11 equations, 5 figures, 5 tables)

Figures (5)

  • Figure 1: Comparison between naive demonstration selection (left) and our demonstration selection (right). Different colors represent different relations between entity pairs, while green represents the gold relation expressed by the entity pair in the test example.
  • Figure 2: Illustration of the RE$^4$ framework. Given a test example, we first prompt LLMs to generate several relevant entity pairs that are grounded in the retrieval corpora as queries. Then we retrieve demonstrations from training examples using the queries. Finally, we conduct in-context reasoning based on the retrieved entities and relations. The instructions in prompts are underlined, and the outputs of LLMs in prompts are highlighted in blue.
  • Figure 3: The sensitivity of $k$. ICL denotes that we perform in-context reasoning in the reasoning module with retrieved demonstrations, while Majority Vote denotes that we take the relation associated with the largest number of generated entity pairs as the predicted relation.
  • Figure 4: Comparison on different retrieval models.
  • Figure 5: Sensitivity of in-context reasoning.
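
The Majority Vote baseline mentioned in the Figure 3 caption can be sketched in a few lines of Python. This is an illustrative reading of the caption, not the authors' code: the function name `majority_vote`, the `pair_to_relation` lookup table, and the example pairs are hypothetical. The sketch also assumes each entity pair maps to a single relation in the corpus and does not treat ties specially.

```python
from collections import Counter


def majority_vote(recalled_pairs, pair_to_relation):
    """Predict the relation backed by the largest number of recalled entity pairs.

    pair_to_relation is a hypothetical lookup from (head, tail) to a relation
    label; pairs missing from it are ignored.
    """
    votes = Counter(
        pair_to_relation[pair] for pair in recalled_pairs if pair in pair_to_relation
    )
    if not votes:
        return None  # no recalled pair was found in the lookup table
    return votes.most_common(1)[0][0]


# Hypothetical lookup table and recalled pairs.
lookup = {
    ("Marie Curie", "Warsaw"): "born_in",
    ("Alan Turing", "London"): "born_in",
    ("Steve Jobs", "Apple"): "founder_of",
}
recalled = [("Marie Curie", "Warsaw"), ("Steve Jobs", "Apple"), ("Alan Turing", "London")]
print(majority_vote(recalled, lookup))
```

Unlike the ICL setting in the same caption, this baseline never shows the retrieved sentences to the model. It only counts the labels attached to the recalled pairs.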