Jetsons at FinNLP 2024: Towards Understanding the ESG Impact of a News Article using Transformer-based Models
Parag Pravin Dakle, Alolika Gon, Sihan Zha, Liang Wang, SaiKrishna Rallabandi, Preethi Raghavan
TL;DR
The paper analyzes multilingual Transformer-based and traditional NLP approaches for the ML-ESG-3 task, which aims to predict the duration and type of ESG impact for news articles in English, French, Korean, and Japanese. It combines baselines (TF-IDF with classical classifiers), data-noise reduction, multilingual fine-tuning (XLM-RoBERTa, Longformer), semi-supervised learning with pseudo-labels, translation to English, and ensemble strategies. Key findings show that Transformer-based fine-tuning with data augmentation, especially self-training and translation-based methods, delivers strong results: language-specific models achieve top performance for Korean, Japanese, and English, and ensembles offer robust improvements. The work demonstrates effective multilingual strategies for ESG impact inference, with practical implications for risk assessment, stakeholder communication, and regulatory analysis in finance and journalism.
Abstract
In this paper, we describe the approaches explored by the Jetsons team for the Multi-Lingual ESG Impact Duration Inference (ML-ESG-3) shared task, which focuses on predicting the duration and type of the ESG impact of a news article. The shared task dataset consists of 2,059 news titles and articles in English, French, Korean, and Japanese. For the impact duration classification task, we fine-tuned XLM-RoBERTa with a custom fine-tuning strategy and self-training, and DeBERTa-v3 using only English translations. These models ranked first on the leaderboard individually for Korean and Japanese and, in an ensemble, for English. For the impact type classification task, our XLM-RoBERTa model fine-tuned using a custom fine-tuning strategy ranked first for English.
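
The self-training and ensembling steps mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.9 confidence threshold, the placeholder labels ("short", "medium", "long"), and the majority-vote rule are all assumptions chosen for illustration, and the abstract does not state how pseudo-labels were selected or how ensemble members were combined.

```python
from collections import Counter


def select_pseudo_labels(probabilities, threshold=0.9):
    """Pick confident predictions on unlabeled data for self-training.

    `probabilities` is a list of dicts mapping label -> predicted
    probability, one dict per unlabeled example. The threshold is a
    hypothetical value, not one taken from the paper.
    Returns (example_index, label) pairs to add to the training set.
    """
    selected = []
    for index, distribution in enumerate(probabilities):
        label, confidence = max(distribution.items(), key=lambda item: item[1])
        if confidence >= threshold:
            selected.append((index, label))
    return selected


def majority_vote(predictions):
    """Combine one label per model into a single ensemble label.

    Ties go to the label that appears first in `predictions`
    (an assumed tie-breaking rule).
    """
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Placeholder class probabilities from a hypothetical fine-tuned model.
    unlabeled_probs = [
        {"short": 0.95, "medium": 0.03, "long": 0.02},
        {"short": 0.50, "medium": 0.30, "long": 0.20},
        {"short": 0.05, "medium": 0.05, "long": 0.90},
    ]
    print(select_pseudo_labels(unlabeled_probs))
    print(majority_vote(["long", "short", "long"]))
```

In a self-training loop, the selected pairs would be appended to the labeled set and the model fine-tuned again; the ensemble function would be applied per article across the predictions of the individual models.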
