Large Language Models in Targeted Sentiment Analysis

Nicolay Rusnachenko; Anton Golubev; Natalia Loukachevitch

TL;DR

The paper addresses targeted sentiment analysis toward named entities in Russian news by comparing zero-shot, instruction-tuned LLMs and THoR-based fine-tuning of Flan-T5 on RuSentNE-2023 and its English translation. It shows that zero-shot approaches can match encoder-based baselines, while THoR fine-tuning significantly boosts performance, with Flan-T5-xl achieving the top score of $F1^{PN}=68.20$. Translation to English generally improves model performance, underscoring language familiarity in LLM reasoning. Overall, the work demonstrates the viability of decoder-based LLMs for Russian TSA and provides a publicly accessible THoR framework for sentiment analysis research.

Abstract

In this paper we investigate the use of decoder-based generative transformers for extracting sentiment towards named entities in Russian news articles. We study the sentiment analysis capabilities of instruction-tuned large language models (LLMs) on the RuSentNE-2023 dataset. The first group of experiments evaluates the zero-shot capabilities of both closed and open LLMs. The second covers fine-tuning Flan-T5 using the "chain-of-thought" (CoT) three-hop reasoning framework (THoR). We found that the results of the zero-shot approaches are similar to those achieved by baseline fine-tuned encoder-based transformers (BERT-base). Fine-tuning Flan-T5 with THoR improves the results by at least 5% for the base-size model compared to the zero-shot experiment. The best results on RuSentNE-2023 were achieved by fine-tuned Flan-T5-xl, which surpassed the previous state-of-the-art transformer-based classifiers. Our CoT application framework is publicly available: https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework
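
To make the three-hop idea concrete, the following is a minimal sketch of how a THoR-style prompt chain could be assembled: aspect, then opinion, then polarity, followed by a final label-inferring step. The prompt wording, the `thor_infer` and `normalize_label` helpers, the `ask` callback (standing in for any LLM call), and the neutral fallback are all assumptions made for illustration; they are not taken from the authors' framework.

```python
from typing import Callable

# Task classes assumed for illustration.
LABELS = ["positive", "negative", "neutral"]


def normalize_label(answer: str) -> str:
    """Map a free-form model answer to one of the task classes."""
    text = answer.lower()
    for label in LABELS:
        if label in text:
            return label
    return "neutral"  # assumed fallback when no class is mentioned


def thor_infer(sentence: str, entity: str, ask: Callable[[str], str]) -> str:
    """Run three reasoning hops plus a final label-inferring step.

    `ask` is a hypothetical callback that sends a prompt to a model
    and returns its text response.
    """
    # Hop 1: which aspect of the entity is discussed?
    p1 = (f'Given the sentence "{sentence}", '
          f'which specific aspect of {entity} is possibly mentioned?')
    aspect = ask(p1)

    # Hop 2: what opinion is expressed towards that aspect?
    p2 = (f'{p1} {aspect} Based on common sense, what is the implicit '
          f'opinion towards the mentioned aspect of {entity}, and why?')
    opinion = ask(p2)

    # Hop 3: what polarity follows from that opinion?
    p3 = (f'{p2} {opinion} Based on the opinion, what is the sentiment '
          f'polarity towards {entity}?')
    polarity = ask(p3)

    # Final label inferring: force an answer from the task classes.
    p4 = (f'{p3} {polarity} Based on these contexts, return the sentiment '
          f'polarity only, such as: {", ".join(LABELS)}.')
    return normalize_label(ask(p4))
```

Each hop appends the previous answer to the running context, so the final step sees the full reasoning chain before it commits to a label.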

Paper Structure (9 sections, 2 figures, 4 tables)

Introduction
Related Work
RuSentNE-2023 Evaluation and Dataset
Experimental Setup
LLMs Zero-shot Experiments Setup
LLMs Fine-tuning Setup
Results and Discussion
Error Analysis
Conclusion

Figures (2)

Figure 1: Inferring sentiment $s'$ using the CoT three-hop reasoning framework (THoR), including the "final label inferring" step that answers with one of the task classes (Fei et al., 2023)
Figure 2: $F_1(PN)$ (vertical axis) of Flan-T5 models of different sizes on the RuSentNE-2023 English development set (en, dev) at each epoch (horizontal axis) during fine-tuning with PROMPT (left) and the THoR technique (right)
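
The $F_1(PN)$ score plotted in Figure 2 is not defined on this page. As a hedged illustration, the sketch below assumes it is the macro-average of the per-class F1 scores for the positive and negative classes only, scaled to 0-100 to match the reported $F1^{PN}=68.20$. The label strings and the function names `f1_for_class` and `f1_pn` are hypothetical.

```python
from typing import Sequence


def f1_for_class(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """F1 score of a single class, computed from paired gold/predicted labels."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_pn(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Assumed F1(PN): mean of positive and negative class F1, on a 0-100 scale.

    The neutral class still affects the score indirectly: a neutral
    prediction for a positive or negative example counts as a miss.
    """
    positive = f1_for_class(gold, pred, "positive")
    negative = f1_for_class(gold, pred, "negative")
    return 100.0 * (positive + negative) / 2.0


if __name__ == "__main__":
    gold = ["positive", "negative", "neutral", "negative"]
    pred = ["positive", "neutral", "neutral", "negative"]
    print(round(f1_pn(gold, pred), 2))
```

Excluding the neutral class from the average keeps a dominant neutral class from inflating the score, which is the usual motivation for a metric of this shape.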
