Translation and Fusion Improves Zero-shot Cross-lingual Information Extraction

Yang Chen, Vedaant Shah, Alan Ritter

TL;DR

This paper introduces Translation-and-Fusion (TransFusion), a framework that augments instruction-tuned information extraction (IE) by translating target-language text into English with an external machine translation system and fusing the English annotations with the target-language text. Building on this framework, GoLLIE-TF, a cross-lingual instruction-tuned IE model, improves zero-shot cross-lingual transfer over the base GoLLIE model across 50 languages, with the largest gains on low-resource languages. TransFusion also improves low-resource named entity recognition for GPT-4 through prompting (+5 F1) and for fine-tuned decoder-only (+14 F1) and encoder-only (+13 F1) models. The approach relies on a small cross-lingual training set built by translating English IE data and projecting the labels, which enables cross-lingual transfer while maintaining English performance. On MasakhaNER2.0, UNER, and other multilingual benchmarks, TransFusion yields F1 gains and is robust to translation quality, offering a practical path to more inclusive IE for low-resource languages at the cost of additional inference. Overall, the work shows that translation and fusion can improve cross-lingual IE with existing multilingual LLMs, extending IE research to more linguistic communities.

Abstract

Large language models (LLMs) combined with instruction tuning have shown significant progress in information extraction (IE) tasks, exhibiting strong generalization capabilities to unseen datasets by following annotation guidelines. However, their applicability to low-resource languages remains limited due to the lack of both labeled data for fine-tuning and unlabeled text for pre-training. In this paper, we propose TransFusion, a framework in which models are fine-tuned to use English translations of low-resource language data, enabling more precise predictions through annotation fusion. Based on TransFusion, we introduce GoLLIE-TF, a cross-lingual instruction-tuned LLM for IE tasks, designed to close the performance gap between high- and low-resource languages. Our experiments across twelve multilingual IE datasets spanning 50 languages demonstrate that GoLLIE-TF achieves better zero-shot cross-lingual transfer than the base model. In addition, we show that TransFusion significantly improves low-resource language named entity recognition when applied to proprietary models such as GPT-4 (+5 F1) with a prompting approach, or when used to fine-tune different language models, including decoder-only (+14 F1) and encoder-only (+13 F1) architectures.
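
The translate, annotate, and fuse steps described above can be pictured with a minimal Python sketch. Everything in it is a hypothetical stand-in for illustration: the `Entity` type, the `transfusion` function, the lookup-table translator, the gazetteer annotator, and the exact-string-match fusion rule are assumptions, not the authors' implementation. In the paper, a fine-tuned LLM performs annotation and fusion, and an external machine translation system supplies the English text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Entity:
    text: str   # surface form of the mention
    label: str  # entity type, e.g. "PERSON" or "LOCATION"


def transfusion(
    text: str,
    translate: Callable[[str], str],
    annotate_english: Callable[[str], List[Entity]],
    fuse: Callable[[str, str, List[Entity]], List[Entity]],
) -> List[Entity]:
    """Hypothetical three-step pipeline: translate, annotate, fuse."""
    english = translate(text)                      # 1. translate to English
    english_entities = annotate_english(english)   # 2. annotate the English text
    return fuse(text, english, english_entities)   # 3. fuse into target-language output


# --- Toy stand-ins (assumptions, only here to make the sketch runnable) ---

# A lookup table in place of a real machine translation system.
TOY_TRANSLATIONS: Dict[str, str] = {
    "Amina anaishi Nairobi.": "Amina lives in Nairobi.",
}

# A tiny gazetteer in place of a real English IE model.
TOY_GAZETTEER: Dict[str, str] = {
    "Amina": "PERSON",
    "Nairobi": "LOCATION",
}


def toy_translate(text: str) -> str:
    return TOY_TRANSLATIONS[text]


def toy_annotate_english(english: str) -> List[Entity]:
    # Collect gazetteer hits and return them in order of appearance.
    hits = [
        (english.find(name), Entity(name, label))
        for name, label in TOY_GAZETTEER.items()
        if name in english
    ]
    return [entity for _, entity in sorted(hits, key=lambda hit: hit[0])]


def toy_fuse(text: str, english: str, english_entities: List[Entity]) -> List[Entity]:
    # Simplest possible fusion rule: keep an English entity only if its
    # surface form also occurs verbatim in the target-language text.
    return [entity for entity in english_entities if entity.text in text]


if __name__ == "__main__":
    print(transfusion("Amina anaishi Nairobi.", toy_translate,
                      toy_annotate_english, toy_fuse))
```

Passing the three steps in as callables keeps the pipeline independent of any particular translator or annotator, which matches the idea that the translation system is external and can be swapped. A real fusion step would also have to handle mentions whose surface form differs between English and the target language, which exact matching cannot do.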

Paper Structure

This paper contains 27 sections, 5 equations, 19 figures, 10 tables.

Figures (19)

  • Figure 1: Our TransFusion framework aims to bridge the performance gap between high and low-resource languages on information extraction tasks. (left) TransFusion reasoning includes three steps: translate, annotate, and fuse. (right) GoLLIE-TF shows superior zero-shot cross-lingual evaluation on a range of IE datasets over the base model.
  • Figure 2: TransFusion leads to larger NER F1 improvements for low-resource languages in MasakhaNER2 (right) than for high-resource languages in UNER (left).
  • Figure 3: TransFusion robustness to different translation systems.
  • Figure 4: The GPT-4 + TransFusion framework improves NER on low-resource languages from MasakhaNER2 and UNER subsets. On average, GPT-4 + TransFusion improves F1 from 53.4 to 62.
  • Figure 5: Error analysis of GoLLIE-TF's 31 incorrect predictions on MasakhaNER2 (Akan). Two common errors are categorized as English prediction error (22/31) and fusion error (12/31).
  • ...and 14 more figures