Debate as Optimization: Adaptive Conformal Prediction and Diverse Retrieval for Event Extraction
Sijia Wang, Lifu Huang
TL;DR
The paper reframes event extraction (EE) as an optimization problem solved through multi-agent debate, targeting the performance gap between tuning-free LLM-based EE and supervised, tuning-based methods. The Debate-as-Optimization (DAO) framework combines four agent roles with two modules: Diverse-RAG (DRAG), which adaptively retrieves supporting information for the debate, and Adaptive Conformal Prediction (AdaCP), which progressively rejects less promising answers. Together they improve event detection (ED) and event argument extraction (EAE) without parameter tuning. On ACE05-E and CASIE, the approach narrows the gap to tuning-based methods by 18.1% and 17.8% (ACE05-E) and by 17.9% and 15.2% (CASIE) for ED and EAE, respectively. The work highlights the practicality of domain-adaptive, calibration-guided retrieval in EE and points to efficiency and scalability as directions for future work.
Abstract
We propose a multi-agent debate as optimization (DAO) system for event extraction, whose primary objective is to iteratively refine the outputs of large language models (LLMs) through debate, without parameter tuning. In DAO, we introduce two novel modules: the Diverse-RAG (DRAG) module and the Adaptive Conformal Prediction (AdaCP) module. DRAG systematically retrieves supporting information that best fits the debate discussion, while AdaCP enhances the accuracy and reliability of event extraction by effectively rejecting less promising answers. Experimental results demonstrate a significant reduction in the performance gap between supervised approaches and tuning-free LLM-based methods: 18.1% and 17.8% on ACE05, and 17.9% and 15.2% on CASIE, for event detection and argument extraction, respectively.
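To make the idea of rejecting less promising answers concrete, the following is a minimal sketch of standard split conformal prediction, the general technique that AdaCP builds on. It is not the paper's AdaCP module: the abstract does not specify how the adaptive variant works, so the fixed miscoverage level `alpha`, the nonconformity score `1 - confidence`, the function names, and all data values below are assumptions made for illustration only.

```python
import math


def conformal_threshold(calibration_scores, alpha):
    """Return the split-conformal threshold for a list of nonconformity scores.

    Each calibration score is assumed to be 1 - (confidence given to the
    correct answer) on a held-out calibration example.
    """
    n = len(calibration_scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)), capped at n.
    rank = min(n, math.ceil((n + 1) * (1 - alpha)))
    return sorted(calibration_scores)[rank - 1]


def filter_candidates(candidates, threshold):
    """Keep candidate answers whose nonconformity score is within the threshold."""
    return [
        label
        for label, confidence in candidates
        if 1.0 - confidence <= threshold
    ]


# Hypothetical calibration scores (illustrative values, not from the paper).
calibration_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.15, 0.25, 0.35, 0.45]
threshold = conformal_threshold(calibration_scores, alpha=0.25)

# Hypothetical candidate event types with model confidences.
candidates = [("Attack", 0.80), ("Transport", 0.50), ("Meet", 0.65)]
kept = filter_candidates(candidates, threshold)

print(threshold)  # 0.4
print(kept)       # ['Attack', 'Meet']
```

With nine calibration scores and `alpha=0.25`, the corrected rank is 8, so the threshold is the eighth-smallest score (0.40). The candidate with confidence 0.50 has a nonconformity score of 0.50 and is rejected. A progressive scheme could, for example, tighten `alpha` over debate rounds, but that schedule is a guess and is not described in the abstract.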
