AHaSIS: Shared Task on Sentiment Analysis for Arabic Dialects
Maram Alharbi, Salmane Chafik, Saad Ezzini, Ruslan Mitkov, Tharindu Ranasinghe, Hansi Hettiarachchi
TL;DR
AHaSIS 2025 tackles sentiment analysis across Arabic dialects in the hospitality domain by providing a bi-dialect hotel-review corpus in Saudi Arabic and Moroccan Darija and evaluating diverse modelling approaches. The shared task emphasizes cross-dialect generalization under limited data, with systems using external resources and transformer-based methods achieving the best performance, an F1 score of around 0.81. The results highlight the value of dialect-aware preprocessing, domain-specific embeddings, and prompt-based strategies for low-resource dialects. This work establishes a practical benchmark and guidance for dialect-sensitive Arabic NLP in customer experience analytics.
Abstract
The hospitality industry in the Arab world increasingly relies on customer feedback to shape services, driving the need for advanced Arabic sentiment analysis tools. To address this need, the Sentiment Analysis on Arabic Dialects in the Hospitality Domain shared task focuses on sentiment detection in Arabic dialects. The task uses a multi-dialect, manually curated dataset derived from hotel reviews originally written in Modern Standard Arabic (MSA) and translated into Saudi and Moroccan (Darija) dialects. The dataset consists of 538 sentiment-balanced reviews spanning positive, neutral, and negative categories. Translations were validated by native speakers to ensure dialectal accuracy and sentiment preservation. This resource supports the development of dialect-aware NLP systems for real-world applications in customer experience analysis. More than 40 teams registered for the shared task, and 12 submitted systems during the evaluation phase. The top-performing system achieved an F1 score of 0.81, demonstrating both the feasibility and the ongoing challenges of sentiment analysis across Arabic dialects.
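To make the reported metric concrete, the following is a minimal sketch of how an F1 score can be computed over the three sentiment categories. It assumes macro-averaging (the unweighted mean of per-class F1), which the abstract does not specify; the function name `macro_f1`, the label strings, and the toy predictions are illustrative assumptions, not the task's official scorer or data.

```python
from collections import Counter

# Assumed label names for the three sentiment categories.
LABELS = ("positive", "neutral", "negative")


def macro_f1(gold, pred, labels=LABELS):
    """Return the unweighted mean of per-class F1 scores.

    Assumption: macro-averaging; the official task metric may differ.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true class but was missed

    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy example, not drawn from the shared-task data.
    gold = ["positive", "neutral", "negative", "positive"]
    pred = ["positive", "negative", "negative", "neutral"]
    print(f"macro F1 = {macro_f1(gold, pred):.4f}")
```

Because the dataset is described as sentiment-balanced, macro- and weighted-averaged F1 would be expected to lie close together on it, though the two are not identical in general.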
