Evaluating and explaining training strategies for zero-shot cross-lingual news sentiment analysis

Luka Andrenšek, Boshko Koloski, Andraž Pelicon, Nada Lavrač, Senja Pollak, Matthew Purver

TL;DR

This work investigates zero-shot cross-lingual news sentiment detection, aiming to develop robust sentiment classifiers that can be deployed across multiple languages without target-language training data, and shows that language similarity is not in itself sufficient for predicting the success of cross-lingual transfer.

Abstract

We investigate zero-shot cross-lingual news sentiment detection, aiming to develop robust sentiment classifiers that can be deployed across multiple languages without target-language training data. We introduce novel evaluation datasets in several less-resourced languages, and experiment with a range of approaches including the use of machine translation; in-context learning with large language models; and various intermediate training regimes including a novel task objective, POA, that leverages paragraph-level information. Our results demonstrate significant improvements over the state of the art, with in-context learning generally giving the best performance, but with the novel POA approach giving a competitive alternative with much lower computational overhead. We also show that language similarity is not in itself sufficient for predicting the success of cross-lingual transfer, but that similarity in semantic content and structure can be equally important.

Paper Structure (28 sections, 9 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Diagram illustrating the selected and proposed training strategies. The first approach, PSE (Section 4.3.1), is based on paragraph-level sentiment analysis. The second approach, POA (Section 4.3.2), introduces the newly proposed method that includes additional information about sentiment positions. Alternatively, the diagram shows that one can entirely skip the intermediate training phase and focus solely on fine-tuning the base XLMR weights for the main task, which in this case is document-level sentiment training in Slovenian.
  • Figure 2: Representation of language performance of each method. For each language, the gold-coloured bars represent the best method, followed by the second-best in silver and the third-best in bronze.
  • Figure 3: Optimal transport dataset distance (OTDD) between the Slovenian dataset (source) and the target datasets (destination).
  • Figure 4: Correlation of the OTDD metric and the performance of each metric across languages.
  • Figure 5: Topic distribution in the original Slovenian datasets.
  • ...and 1 more figures