Label Drop for Multi-Aspect Relation Modeling in Universal Information Extraction
Lu Yang, Jiajia Li, En Ci, Lefei Zhang, Zuchao Li, Ping Wang
TL;DR
LDNet tackles universal information extraction by extracting multiple relations simultaneously. It covers three relation types, $TA$, $A2A$, and $AS$, through multi-aspect relation modeling, and uses a label drop mechanism to limit the influence of irrelevant relations. The model fuses text and image representations with RoPE-based global pointers to generate relation-specific score matrices $S^r$, uses model transfer learning to propagate knowledge across datasets, and is optimized with the combined loss $L = L_{MR} + L_{LD} + L_{MT}$. It achieves state-of-the-art or competitive results on 9 tasks and 33 datasets across single-modal and multi-modal settings, including few-shot and zero-shot regimes, even with smaller pretrained backbones. Stated limitations are the scarcity of data for extensive multi-modal pre-training and potential difficulties with document-level extraction; the authors also raise ethical considerations around privacy and data usage.
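To make the RoPE-based global pointer scoring concrete, the following is a minimal sketch of how one relation-specific score matrix $S^r$ could be computed from per-token query and key vectors. It follows the generic global pointer formulation and is not taken from the LDNet code; the function names `rope` and `score_matrix`, the base of 10000, and the $\sqrt{d}$ scaling are assumptions made for illustration, and the projections that would produce the queries and keys for each relation are omitted.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles (RoPE).

    Hypothetical helper; `vec` must have an even number of entries.
    """
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        theta = pos * base ** (-i / dim)   # one frequency per pair of dimensions
        c, s = math.cos(theta), math.sin(theta)
        out += [vec[i] * c - vec[i + 1] * s,
                vec[i] * s + vec[i + 1] * c]
    return out

def score_matrix(queries, keys):
    """Score every (start token i, end token j) pair for one relation r.

    queries, keys: lists of per-token vectors, assumed to be already
    projected for relation r. Returns S^r as a nested list with
    S^r[i][j] = <RoPE(q_i, i), RoPE(k_j, j)> / sqrt(dim).
    """
    dim = len(queries[0])
    q_rot = [rope(q, i) for i, q in enumerate(queries)]
    k_rot = [rope(k, j) for j, k in enumerate(keys)]
    return [[sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in k_rot]
            for q in q_rot]
```

Because the rotation angle grows linearly with position, each score depends on the token contents and on the offset $j - i$ only, which is the property that makes RoPE a natural fit for span-style pointers. In a multi-relation setting, one such matrix would be produced per relation type, each from its own queries and keys.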
Abstract
Universal Information Extraction (UIE) has garnered significant attention due to its ability to address the model explosion problem effectively. Extractive UIE can achieve strong performance with a relatively small model, which has led to its wide adoption. Extractive UIE models generally rely on task instructions, which are either single-target or multiple-target. Single-target instruction UIE extracts only one type of relation at a time, which limits its ability to model correlations between relations and thus to extract complex relations. Multiple-target instruction UIE extracts multiple relations simultaneously, but the inclusion of irrelevant relations introduces decision complexity and harms extraction accuracy. Therefore, for multi-relation extraction, we propose LDNet, which incorporates multi-aspect relation modeling and a label drop mechanism. By assigning different relations to different levels for understanding and decision-making, we reduce decision confusion. Additionally, the label drop mechanism mitigates the impact of irrelevant relations. Experiments show that LDNet outperforms or achieves competitive performance with state-of-the-art systems on 9 tasks and 33 datasets, in single-modal and multi-modal as well as few-shot and zero-shot settings. Code is available at https://github.com/Lu-Yang666/LDNet.
