Polarity Detection of Sustainable Development Goals in News Text
Andrea Cadeddu, Alessandro Chessa, Vincenzo De Leo, Gianni Fenu, Francesco Osborne, Diego Reforgiato Recupero, Angelo Salatino, Luca Secchi
TL;DR
This work tackles SDG polarity detection by introducing SDG-POD, a benchmark combining manually annotated and synthetically generated data to assess whether news text indicates progress toward a specific SDG. It systematically evaluates six open-source LLMs in zero-shot and fine-tuned regimes, showing that fine-tuning on synthetic data enhances robustness and reduces critical misclassifications, with QWQ-32B achieving top performance in several settings. The study demonstrates that data-enrichment techniques can mitigate domain-resource constraints and provides detailed per-SDG insights, underscoring the ongoing challenge of SDG polarity understanding in real-world texts. Overall, SDG-POD offers a practical, reproducible framework for advancing polarity-aware sustainability monitoring and informs model development for policy-relevant discourse analysis.
Abstract
The United Nations' Sustainable Development Goals (SDGs) provide a globally recognised framework for addressing critical societal, environmental, and economic challenges. Recent developments in natural language processing (NLP) and large language models (LLMs) have facilitated the automatic classification of textual data according to its relevance to specific SDGs. Nevertheless, in many applications, it is equally important to determine the directionality of this relevance; that is, to assess whether the described impact is positive, neutral, or negative. To tackle this challenge, we propose the novel task of SDG polarity detection, which assesses whether a text segment indicates progress toward a specific SDG or conveys an intention to achieve such progress. To support research in this area, we introduce SDG-POD, a benchmark dataset designed specifically for this task, combining original and synthetically generated data. We perform a comprehensive evaluation using six state-of-the-art LLMs, considering both zero-shot and fine-tuned configurations. Our results suggest that the task remains challenging for the current generation of LLMs. Nevertheless, some fine-tuned models, particularly QWQ-32B, achieve good performance, especially on specific SDGs such as SDG-9 (Industry, Innovation and Infrastructure), SDG-12 (Responsible Consumption and Production), and SDG-15 (Life on Land). Furthermore, we demonstrate that augmenting the fine-tuning dataset with synthetically generated examples improves model performance on this task. This result highlights the effectiveness of data enrichment techniques in addressing the challenges of this resource-constrained domain. This work advances the methodological toolkit for sustainability monitoring and provides actionable insights into the development of efficient, high-performing polarity detection systems.
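To make the task definition concrete, the following is a minimal Python sketch of how a single polarity-detection instance and a zero-shot prompt might be represented. It is an illustration only: the names PolarityExample, build_prompt, and parse_label, the prompt wording, and the sample sentence are all hypothetical and are not taken from SDG-POD or from the evaluation setup described here. The only elements drawn from the abstract are the three polarity labels (positive, neutral, negative) and the pairing of a text segment with a specific SDG.

```python
from dataclasses import dataclass
from typing import Optional

# The three polarity classes named in the abstract.
LABELS = ("positive", "neutral", "negative")


@dataclass(frozen=True)
class PolarityExample:
    """One task instance: a text segment paired with a target SDG (hypothetical schema)."""
    text: str
    sdg: int                      # SDG number, 1 to 17
    label: Optional[str] = None   # gold label if known, else None


def build_prompt(example: PolarityExample) -> str:
    """Build an illustrative zero-shot prompt (wording is an assumption, not the paper's)."""
    return (
        f"Text: {example.text}\n"
        f"Does this text indicate progress toward SDG {example.sdg}?\n"
        f"Answer with one word: {', '.join(LABELS)}."
    )


def parse_label(response: str) -> Optional[str]:
    """Return the first polarity label found in a model response, or None if absent."""
    for token in response.lower().split():
        cleaned = token.strip(".,;:!?\"'()")
        if cleaned in LABELS:
            return cleaned
    return None


# Made-up example sentence, used only to exercise the helpers.
example = PolarityExample(
    text="The city opened a new recycling plant this year.",
    sdg=12,
)
prompt = build_prompt(example)
```

The prompt string would be sent to whichever LLM is under evaluation, and parse_label would map its free-text answer back to one of the three classes; responses that contain none of the labels are returned as None so that they can be counted separately rather than silently assigned to a class.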
