STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models

Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung, P. Jeffrey Brantingham, Nanyun Peng, Wei Wang

TL;DR

STAR tackles data scarcity in low-resource information extraction by inverting the usual direction of data generation: it first generates target output structures $Y$, then synthesizes passages $X$ that realize them, guided by fine-grained LLM prompts and a self-refinement loop. The approach combines target-structure generation, instruction-guided passage creation, and self-refinement, and it is adapted to relation extraction as well as event extraction. On ACE05 event extraction (EE) and TACRED relation extraction (RE), STAR-generated data significantly boosts supervised model performance, often rivaling or surpassing human-curated data, and human assessment rates its quality highly. Ablation analyses identify diverse, balanced target structures and self-refinement as key drivers of the gains, offering practical guidance for using LLMs to create training data for complex IE tasks.

Abstract

Information extraction tasks such as event extraction require an in-depth understanding of the output structure and sub-task dependencies. They rely heavily on task-specific training data in the form of (passage, target structure) pairs to obtain reasonable performance. However, obtaining such data through human annotation is costly, leading to a pressing need for low-resource information extraction approaches that require minimal human labeling for real-world applications. Fine-tuning supervised models with synthesized training data would be a generalizable method, but existing data generation methods either still rely on large-scale ground-truth data or cannot be applied to complicated IE tasks due to their poor performance. To address these challenges, we propose STAR, a data generation method that leverages Large Language Models (LLMs) to synthesize data instances given limited seed demonstrations, thereby boosting low-resource information extraction performance. Our approach involves generating target structures (Y) followed by generating passages (X), all accomplished with the aid of LLMs. We design fine-grained step-by-step instructions to obtain the initial data instances. We further reduce errors and improve data quality through self-reflection error identification and self-refinement with iterative revision. Our experiments show that the data generated by STAR significantly improves the performance of low-resource event extraction and relation extraction tasks, even surpassing the effectiveness of human-curated data. Human assessment of the data quality shows that STAR-generated data exhibits higher passage quality and aligns better with the task definitions than the human-curated data.
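The structure-first ordering described in the abstract (fix a target structure Y, generate a passage X for it, then check and revise the passage) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the `EventStructure` class, the `llm` callable interface, the prompt wording, the string-matching check in `missing_mentions`, and the `max_rounds` parameter are all hypothetical names introduced here. The stub LLM simply replays two canned drafts so the example runs without any model access.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

LLM = Callable[[str], str]  # assumed interface: prompt text in, generated text out


@dataclass
class EventStructure:
    """Hypothetical target structure Y: an event type, a trigger, and role -> argument."""
    event_type: str
    trigger: str
    arguments: Dict[str, str]


def missing_mentions(passage: str, y: EventStructure) -> List[str]:
    """Stand-in for self-reflection: list structure elements absent from the passage."""
    required = [y.trigger] + list(y.arguments.values())
    return [m for m in required if m.lower() not in passage.lower()]


def generate_instance(y: EventStructure, llm: LLM, max_rounds: int = 3) -> str:
    """Generate a passage X for a fixed structure Y, then revise it iteratively."""
    prompt = (
        f"Write a passage describing a {y.event_type} event. "
        f"Use the trigger '{y.trigger}' and mention: {y.arguments}."
    )
    passage = llm(prompt)  # initial passage X_0
    for _ in range(max_rounds):
        problems = missing_mentions(passage, y)
        if not problems:
            break
        # Template-based feedback: state what the previous draft got wrong.
        feedback = f"The passage omits: {', '.join(problems)}. Revise it.\n{passage}"
        passage = llm(feedback)
    return passage


def make_stub_llm() -> LLM:
    """Replay canned drafts; the first one deliberately omits the Target argument."""
    drafts = iter([
        "Troops attacked at dawn.",
        "Troops attacked the village at dawn.",
    ])
    return lambda prompt: next(drafts)


y = EventStructure(
    event_type="Conflict.Attack",
    trigger="attacked",
    arguments={"Attacker": "Troops", "Target": "the village"},
)
x = generate_instance(y, make_stub_llm())
print(x)
```

Because Y is fixed before X is written, the label for each generated passage is known by construction, and the revision loop only has to verify that the passage realizes that label.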

Paper Structure

This paper contains 35 sections, 2 figures, and 4 tables.

Figures (2)

  • Figure 1: The STAR inverse data generation strategy, using the event extraction task as an example. We first generate target structures from valid trigger and argument candidates. Then we prompt the LLM with task instructions at different task granularities to generate the initial passage $X_0$ containing the event information in the given target structure $Y$. Finally, we create self-reflection questions that prompt the LLM to identify quality issues automatically and to refine the passage with template-based hindsight feedback.
  • Figure 2: Event extraction performance (F1, %) when the EE models are trained on $N$ augmented training instances on top of 10 data points ($k=10$) for each event type. The performance gain from STAR-generated data grows as the augmentation scales up with a larger $N$, and data generated by STAR is even more effective than human-curated data. We use the GPT-3.5 version of STAR for this set of experiments.
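The first step in Figure 1, generating target structures from valid trigger and argument candidates, might look like the following sketch. The candidate pools, the role names chosen for each event type, the `sample_structures` function, and the uniform random sampling are illustrative assumptions rather than the paper's procedure. The sketch draws the same number of structures for every event type, which is one simple way to keep the generated set balanced.

```python
import random
from collections import Counter
from typing import Dict, List

# Hypothetical candidate pools; a real setup would derive these from the
# task ontology and the seed demonstrations.
TRIGGERS: Dict[str, List[str]] = {
    "Conflict.Attack": ["attacked", "bombed", "raided"],
    "Movement.Transport": ["traveled", "shipped", "flew"],
}
ARGUMENTS: Dict[str, Dict[str, List[str]]] = {
    "Conflict.Attack": {
        "Attacker": ["the rebels", "troops"],
        "Target": ["the village", "a convoy"],
    },
    "Movement.Transport": {
        "Artifact": ["the delegation", "relief supplies"],
        "Destination": ["the capital", "the port"],
    },
}


def sample_structures(n_per_type: int, seed: int = 0) -> List[dict]:
    """Sample n_per_type target structures Y for every event type."""
    rng = random.Random(seed)  # seeded so repeated runs give the same structures
    structures = []
    for event_type, triggers in TRIGGERS.items():
        for _ in range(n_per_type):
            structures.append({
                "event_type": event_type,
                "trigger": rng.choice(triggers),
                # One argument per role, drawn from that role's valid candidates.
                "arguments": {
                    role: rng.choice(candidates)
                    for role, candidates in ARGUMENTS[event_type].items()
                },
            })
    return structures


structures = sample_structures(n_per_type=5)
type_counts = Counter(s["event_type"] for s in structures)
print(type_counts)
```

Each sampled structure could then be passed to a passage-generation prompt, so every event type contributes the same number of synthetic training instances.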