Enhancing Multi-Label Emotion Analysis and Corresponding Intensities for Ethiopian Languages

Tadesse Destaw Belay, Dawit Ketema Gete, Abinew Ali Ayele, Olga Kolesnikova, Grigori Sidorov, Seid Muhie Yimam

TL;DR

This work tackles multi-label emotion analysis with intensity in four Ethiopian languages (Amharic, Oromo, Somali, and Tigrinya) by extending the EthioEmo dataset with per-emotion intensity annotations and benchmarking a range of multilingual PLMs and open-source LLMs on three tasks: monolingual multi-label classification, emotion intensity prediction, and cross-lingual classification. It reports that African-language–centric PLMs, especially AfroXLM-R variants, give the best performance in both within-language and cross-language settings, while large language models underperform on low-resource Ethiopian languages for intensity tasks. The study also finds that cross-lingual transfer is feasible but generally weaker than monolingual training, with script similarity aiding transfer between Amharic and Tigrinya. The intensity annotations and cross-lingual findings have practical implications for fine-grained emotion analysis in low-resource languages, including applications in mental health and social media analytics, and set the stage for annotator-level modeling and broader ethical considerations in future work.

Abstract

In this digital world, people freely express their emotions using different social media platforms. As a result, building and integrating emotion-understanding models is vital for various human-computer interaction tasks such as decision-making, product and customer feedback analysis, political promotions, marketing research, and social media monitoring. As users express different emotions simultaneously in a single instance, annotating emotions in a multi-label setting such as the EthioEmo (Belay et al., 2025) dataset effectively captures this dynamic. Additionally, incorporating intensity, or the degree of emotion, is crucial, as emotions can significantly differ in their expressive strength and impact. This intensity is significant for assessing whether further action is necessary in decision-making processes, especially concerning negative emotions in applications such as healthcare and mental health studies. To enhance the EthioEmo dataset, we include annotations for the intensity of each labeled emotion. Furthermore, we evaluate various state-of-the-art encoder-only Pretrained Language Models (PLMs) and decoder-only Large Language Models (LLMs) to provide comprehensive benchmarking.

Paper Structure

This paper contains 16 sections, 3 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Multi-label emotion in Ethiopian languages (EthioEmo) with its level of intensity. The languages of each text are in brackets.
  • Figure 2: Emotion co-occurrence across the six basic emotions and languages
  • Figure 3: Emotion intensity statistics across emotion labels, with three intensity levels (low, medium, and high) for each emotion. Instances not labeled with any of the given emotions are excluded from the statistics; the no-emotion instance counts are Amharic 1021, Oromo 1357, Somali 2156, and Tigrinya 1336.
  • Figure 4: Distribution of the number of emotion labels per instance
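
As a rough illustration of the data behind these figures, the sketch below shows one way a multi-label emotion annotation with per-emotion intensity could be represented, and how the label-count and co-occurrence statistics might be derived from it. It is not the authors' code. The emotion names, the numeric intensity coding (0 = absent, 1 = low, 2 = medium, 3 = high), the toy instances, and all function names are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations

# Six basic emotions; the exact label names are an assumption.
EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]

# Hypothetical toy annotations: emotion -> intensity (1 = low, 2 = medium, 3 = high).
# An empty dict stands for an instance with no emotion label.
instances = [
    {"anger": 3, "disgust": 2},
    {"joy": 1},
    {},
    {"fear": 2, "sadness": 3, "anger": 1},
]


def to_vectors(instance):
    """Return (binary label vector, intensity vector) in EMOTIONS order."""
    intensity = [instance.get(e, 0) for e in EMOTIONS]
    labels = [1 if v > 0 else 0 for v in intensity]
    return labels, intensity


def label_count_histogram(data):
    """Count how many instances carry 0, 1, 2, ... emotion labels."""
    return Counter(sum(to_vectors(x)[0]) for x in data)


def cooccurrence(data):
    """Count how often each unordered pair of emotions appears together."""
    pairs = Counter()
    for x in data:
        present = sorted(e for e in EMOTIONS if x.get(e, 0) > 0)
        pairs.update(combinations(present, 2))
    return pairs


if __name__ == "__main__":
    print(to_vectors(instances[0]))
    print(label_count_histogram(instances))
    print(cooccurrence(instances))
```

Keeping the binary label vector and the intensity vector separate mirrors the split between the classification and intensity prediction tasks: the first is the target for multi-label classification, the second for intensity prediction.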