Multi-label relation extraction using Transformer layers
Ngoc Luyen Le, Gildas Tagny Ngompé
TL;DR
This work tackles multi-label relation extraction in French intelligence reports, framing the task as predicting one or more relation labels for each entity pair in a text. It introduces BTransformer18, an architecture that fine-tunes CamemBERT-Large as the contextual body and attaches a Transformer-based head for relation classification, with mean pooling for aggregation. On the TextMine'25 dataset of 800 reports, CamemBERT-Large reaches a macro F1 of 0.654 against 0.620 for FlauBERT-Large, which suggests that the choice of French language model matters for complex relation extraction. The authors release their code publicly, which supports the practical case for combining language-model backbones with Transformer heads for structured information extraction in non-English texts.
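The body-plus-head design described above can be illustrated with a minimal PyTorch sketch. It is not the authors' implementation: the class name `RelationHead`, the layer and head counts, the default `num_labels=5`, and the choice to mean-pool over non-padding token positions are all assumptions made for illustration. Random tensors stand in for the hidden states that a fine-tuned backbone such as CamemBERT-Large (hidden size 1024) would produce.

```python
import torch
import torch.nn as nn


class RelationHead(nn.Module):
    """Hypothetical head: Transformer encoder layers, masked mean pooling,
    then one logit per relation label (multi-label setting)."""

    def __init__(self, hidden_size=1024, num_labels=5, num_layers=2, num_heads=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_size, nhead=num_heads, batch_first=True
        )
        # Nested tensors are disabled so the output keeps the input's padded shape.
        self.encoder = nn.TransformerEncoder(
            layer, num_layers=num_layers, enable_nested_tensor=False
        )
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states, attention_mask):
        # hidden_states: (batch, seq_len, hidden_size), as produced by the backbone.
        # attention_mask: (batch, seq_len), 1 for real tokens and 0 for padding.
        padding = attention_mask == 0
        encoded = self.encoder(hidden_states, src_key_padding_mask=padding)
        # Masked mean pooling: average only over real (non-padding) tokens.
        mask = attention_mask.unsqueeze(-1).to(encoded.dtype)
        pooled = (encoded * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.classifier(pooled)  # raw logits, one per label


if __name__ == "__main__":
    torch.manual_seed(0)
    head = RelationHead(hidden_size=32, num_labels=5, num_layers=1, num_heads=4)
    states = torch.randn(2, 6, 32)  # stand-in for backbone output
    mask = torch.tensor([[1, 1, 1, 1, 1, 1], [1, 1, 1, 0, 0, 0]])
    logits = head(states, mask)
    # Multi-label training would typically use a per-label sigmoid loss.
    targets = torch.tensor([[1.0, 0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0]])
    loss = nn.BCEWithLogitsLoss()(logits, targets)
    # At inference, each label is kept independently (0.5 is an assumed threshold).
    predicted = torch.sigmoid(logits) > 0.5
    print(logits.shape, loss.item(), predicted.shape)
```

Because each label gets its own sigmoid, an entity pair can receive zero, one, or several relation labels, which is what distinguishes the multi-label setting from single-label classification with a softmax.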
Abstract
In this article, we present the BTransformer18 model, a deep learning architecture designed for multi-label relation extraction in French texts. Our approach combines the contextual representations of pre-trained language models from the BERT family (such as BERT, RoBERTa, and their French counterparts CamemBERT and FlauBERT) with Transformer encoders that capture long-range dependencies between tokens. Experiments conducted on the dataset from the TextMine'25 challenge show that our model performs best with CamemBERT-Large, reaching a macro F1 score of 0.654 and surpassing the results obtained with FlauBERT-Large. These results demonstrate the effectiveness of our approach for the automatic extraction of complex relations in intelligence reports.
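For readers unfamiliar with the reported metric, macro F1 is the unweighted mean of the per-label F1 scores, so rare relation types count as much as frequent ones. The sketch below computes it over binary indicator rows in plain Python. The function name `macro_f1` and the toy data are illustrative, and scoring a label with no true or predicted instances as 0 is an assumption; the challenge's official scorer may handle that case differently.

```python
def macro_f1(y_true, y_pred, num_labels):
    """Unweighted mean of per-label F1 over binary indicator rows."""
    scores = []
    for k in range(num_labels):
        pairs = [(t[k], p[k]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label that never occurs scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_labels


# Toy example: 3 entity pairs, 3 hypothetical relation labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]
print(round(macro_f1(y_true, y_pred, 3), 4))  # 0.5556
```

In the toy data the per-label F1 scores are 1, 2/3, and 0, so the macro average is 5/9, even though two of the three labels are mostly predicted correctly.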
