Towards Realistic Few-Shot Relation Extraction: A New Meta Dataset and Evaluation
Fahmida Alam, Md Asiful Islam, Robert Vacareanu, Mihai Surdeanu
TL;DR
This work constructs a realistic meta-dataset for few-shot relation extraction (FSRE) by combining few-shot versions of NYT29 and WIKIDATA with a few-shot TACRED variant. A standardized supervised-to-few-shot transformation ensures that test relations fall outside the background set, that training examples are scarce, and that NOTA (none of the above) serves as a unified negative label. A comprehensive evaluation of six FSRE methods on 5-way 1-shot and 5-shot episodes shows that no single approach consistently outperforms the others and that overall FSRE performance remains low, especially on WIKIDATA due to its long-tail entity distributions. Results vary significantly across datasets, which underscores the need for robust, generalizable FSRE approaches; all data versions are released to spur future research. The work also argues for more realistic benchmarks and multi-split evaluations to avoid overestimating generalization.
Abstract
We introduce a meta dataset for few-shot relation extraction, which includes two datasets derived from the existing supervised relation extraction datasets NYT29 (Takanobu et al., 2019; Nayak and Ng, 2020) and WIKIDATA (Sorokin and Gurevych, 2017), as well as a few-shot form of the TACRED dataset (Sabo et al., 2021). Importantly, all these few-shot datasets were generated under realistic assumptions: the test relations are different from any relations a model might have seen before, training data is limited, and most candidate relation mentions do not correspond to any of the relations of interest. Using this large resource, we conduct a comprehensive evaluation of six recent few-shot relation extraction methods and observe that no method comes out as a clear winner. Further, the overall performance on this task is low, indicating a substantial need for future research. We release all versions of the data, i.e., both supervised and few-shot, for future research.
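To make the episode setup concrete, the following is a minimal sketch of how an N-way K-shot episode with a NOTA label might be sampled. It is an illustration under stated assumptions, not the procedure used to build the released datasets: the function name `sample_episode`, the `(sentence, relation)` pair format, the `n_query` parameter, and the rule that maps every non-target relation to NOTA are all hypothetical.

```python
import random
from collections import defaultdict

NOTA = "NOTA"  # assumed label name for "none of the above"


def sample_episode(examples, n_way=5, k_shot=1, n_query=3, seed=0):
    """Sample one N-way K-shot episode from (sentence, relation) pairs.

    Hypothetical helper: the data format and the sampling rule are
    assumptions made for illustration only.
    """
    rng = random.Random(seed)

    # Group example indices by their relation label.
    by_relation = defaultdict(list)
    for idx, (_, relation) in enumerate(examples):
        by_relation[relation].append(idx)

    # Only relations with at least K examples can serve as targets.
    eligible = sorted(r for r, idxs in by_relation.items() if len(idxs) >= k_shot)
    if len(eligible) < n_way:
        raise ValueError("not enough relations with k_shot examples")
    targets = rng.sample(eligible, n_way)

    # Support set: K labeled sentences per target relation.
    support_idx = {r: rng.sample(by_relation[r], k_shot) for r in targets}
    used = {i for idxs in support_idx.values() for i in idxs}
    support = {r: [examples[i][0] for i in idxs] for r, idxs in support_idx.items()}

    # Query set: drawn from all remaining examples; any example whose
    # relation is not among the targets is relabeled as NOTA.
    pool = [
        (sentence, relation if relation in support else NOTA)
        for i, (sentence, relation) in enumerate(examples)
        if i not in used
    ]
    query = rng.sample(pool, min(n_query, len(pool)))
    return support, query
```

With the defaults, a call such as `sample_episode(examples)` returns a support set of five relations with one sentence each, plus a small query set whose labels are either one of those five relations or NOTA. Because non-target relations typically outnumber the targets, NOTA tends to dominate the query pool, which mirrors the preponderance of negative candidates described above.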
