Targeted Augmentation for Low-Resource Event Extraction
Sijia Wang, Lifu Huang
TL;DR
This work addresses the scarcity of labeled data in event extraction (EE) with a targeted augmentation framework, Talor-EE, which pairs targeted augmentation with back-validation to produce event mentions that are diverse, polarity-aware, accurate, and coherent. The framework combines five components: dependent-context retrieval from external corpora, targeted generation to enrich event structures, negative augmentation to capture non-events, a back-validation loop with entailment and coherence checks, and a robust fine-tuning regime built on a DEGREE-based generative EE model. Experiments on ACE05-E and ERE under zero- and few-shot settings show consistent improvements across multiple backbones and generation agents, and diversity, measured by richer argument-role coverage, drives the largest gains. Limitations include potential over-generation of non-events and difficulty with fine-grained argument-role distinctions, which suggest avenues for further refinement and broader application.
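The back-validation step mentioned in the summary can be pictured as a filter over generated candidates. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the `Candidate` type, the `back_validate` function, the two toy scorers, the keyword table, and the 0.5 thresholds are all hypothetical names and values chosen for illustration. In a real system the scorers would presumably be an entailment model and a coherence model rather than keyword and length heuristics.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """A generated sentence plus the event type it is supposed to express (hypothetical type)."""
    text: str
    event_type: str


# A scorer maps a candidate to a score in [0, 1].
Scorer = Callable[[Candidate], float]


def back_validate(
    candidates: Iterable[Candidate],
    entailment_score: Scorer,
    coherence_score: Scorer,
    min_entailment: float = 0.5,  # assumed threshold, not from the paper
    min_coherence: float = 0.5,   # assumed threshold, not from the paper
) -> List[Candidate]:
    """Keep only candidates that pass both the entailment and the coherence check."""
    kept = []
    for cand in candidates:
        if (entailment_score(cand) >= min_entailment
                and coherence_score(cand) >= min_coherence):
            kept.append(cand)
    return kept


# Toy stand-ins for real models, for illustration only.
TOY_KEYWORDS = {"Attack": ("attacked", "bombed")}


def toy_entailment(cand: Candidate) -> float:
    """Return 1.0 if the text contains a keyword for the target event type, else 0.0."""
    lowered = cand.text.lower()
    keywords = TOY_KEYWORDS.get(cand.event_type, ())
    return 1.0 if any(k in lowered for k in keywords) else 0.0


def toy_coherence(cand: Candidate) -> float:
    """Return 1.0 if the text has at least four whitespace-separated tokens, else 0.0."""
    return 1.0 if len(cand.text.split()) >= 4 else 0.0


candidates = [
    Candidate("Rebels attacked the convoy near the border.", "Attack"),
    Candidate("The delegation signed a trade agreement.", "Attack"),  # fails entailment
    Candidate("Bombed city.", "Attack"),                              # fails coherence
]
kept = back_validate(candidates, toy_entailment, toy_coherence)
```

Passing the scorers in as plain callables keeps the filter independent of any particular model, so the toy heuristics here could be swapped for learned scorers without changing `back_validate`.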
Abstract
Low-resource information extraction remains an ongoing challenge due to the inherent information scarcity within limited training examples. Existing data augmentation methods, considered potential solutions, struggle to strike a balance between weak augmentation (e.g., synonym augmentation) and drastic augmentation (e.g., conditional generation without proper guidance). This paper introduces a novel paradigm that employs targeted augmentation and back-validation to produce augmented examples with enhanced diversity, polarity, accuracy, and coherence. Extensive experimental results demonstrate the effectiveness of the proposed paradigm. Furthermore, identified limitations are discussed, shedding light on areas for future improvement.
