AmalREC: A Dataset for Relation Extraction and Classification Leveraging Amalgamation of Large Language Models

Mansi, Pranshu Pandya, Mahek Bhavesh Vora, Soumya Bharadwaj, Ashish Anand

TL;DR

AmalREC addresses the limited relation diversity and biases in existing RE/RC datasets by proposing a robust, multi-stage, LLM-driven data generation pipeline. It introduces a novel Sentence Evaluation Index (SEI) and SEI-Ranker to assess and blend generations from 15 methods across template, encoder-decoder, decoder-only, fusion, and ECB strategies, producing a 204,399-sentence dataset spanning 255 relation types. The framework is evaluated against SOTA RC/RE baselines and various LLMs, showing improved relational coverage and competitive performance, while also highlighting challenges for current models with high relation cardinality. The work provides a reproducible, cost-aware methodology and opens avenues for more nuanced modeling of large relation spaces in downstream NLU tasks.

Abstract

Existing datasets for relation classification and extraction often exhibit limitations such as restricted relation types and domain-specific biases. This work presents a generic framework to generate well-structured sentences from given tuples with the help of Large Language Models (LLMs). This study focuses on the following major questions: (i) how to generate sentences from relation tuples, (ii) how to compare and rank them, (iii) can we combine the strengths of individual methods and amalgamate them to generate even better-quality sentences, and (iv) how to evaluate the final dataset? For the first question, we employ a multifaceted 5-stage pipeline approach, leveraging LLMs in conjunction with template-guided generation. We introduce the Sentence Evaluation Index (SEI), which prioritizes factors like grammatical correctness, fluency, human-aligned sentiment, accuracy, and complexity, to answer the first part of the second question. To answer the second part of the second question, this work introduces an SEI-Ranker module that leverages SEI to select top candidate generations. The top sentences are then strategically amalgamated to produce the final, high-quality sentence. Finally, we evaluate our dataset on LLM-based and SOTA baselines for relation classification. The proposed dataset features 255 relation types, with 15K sentences in the test set and around 150K in the train set, significantly enhancing relational diversity and complexity. This work not only presents a new comprehensive benchmark dataset for the RE/RC task, but also compares different LLMs for generating quality sentences from relational tuples.
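
The ranking step described in the abstract can be illustrated with a minimal sketch. It assumes that SEI is a weighted sum of per-factor quality scores normalized to [0, 1]; the abstract does not state the actual formula or weights. The equal weights, the factor keys, the `Candidate` class, and the functions `sei` and `rank_top_k` are all hypothetical names introduced here for illustration, not the authors' implementation.

```python
from dataclasses import dataclass

# ASSUMPTION: equal placeholder weights. The weights actually used by the
# ranker module are not given in this text.
ASSUMED_WEIGHTS = {
    "grammar": 0.2,
    "fluency": 0.2,
    "sentiment": 0.2,
    "accuracy": 0.2,
    "complexity": 0.2,
}


@dataclass
class Candidate:
    """One generated sentence plus its per-factor quality scores (hypothetical)."""
    method: str      # name of the generation method that produced the sentence
    sentence: str
    scores: dict     # factor name -> score in [0, 1]


def sei(candidate, weights=ASSUMED_WEIGHTS):
    """Weighted sum of factor scores; an assumed form of the index."""
    return sum(w * candidate.scores[factor] for factor, w in weights.items())


def rank_top_k(candidates, k=3, weights=ASSUMED_WEIGHTS):
    """Return the k candidates with the highest assumed SEI, best first."""
    ranked = sorted(candidates, key=lambda c: sei(c, weights), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy scores, made up for illustration only.
    pool = [
        Candidate("template", "A was born in B.", {f: 0.6 for f in ASSUMED_WEIGHTS}),
        Candidate("decoder-only", "A, a native of B, was born there.", {f: 0.8 for f in ASSUMED_WEIGHTS}),
        Candidate("fusion", "Born in B, A grew up there.", {f: 0.7 for f in ASSUMED_WEIGHTS}),
    ]
    for c in rank_top_k(pool, k=2):
        print(c.method, round(sei(c), 3))
```

In this sketch the top-ranked sentences would be the ones passed on to the amalgamation stage; changing `ASSUMED_WEIGHTS` shifts which quality factors dominate the ranking.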

Paper Structure

This paper contains 38 sections, 1 equation, 5 figures, and 9 tables.

Figures (5)

  • Figure 1: The 5-stage pipeline. Abbreviations in stage 2: Encoder-decoder based models: T: TACRED, W: WebNLG based fine-tuning; Template based generation: G: Gold, S: Silver; ECB: Extended context based generation; Fusion: G: GPT-3.5, P: PaLM, L: LLaMA datasets based fine-tuning on Flan-T5
  • Figure 2: Weights given to the quality parameters in the ranker module
  • Figure 3: Average sentence length by all the generation techniques
  • Figure 4: Flan-T5 results in zero-shot, few-shot, and fine-tuning settings, demonstrating the need for fine-tuning
  • Figure 5: Fusion technique results compared against decoder-only and encoder-decoder models separately