SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian Languages

Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Meriem Beloucif, Christine De Kock, Oumaima Hourrane, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Krishnapriya Vishnubhotla, Seid Muhie Yimam, Saif M. Mohammad

TL;DR

SemEval-2024 Task 1 presents the first shared task on semantic textual relatedness (STR), covering 14 languages spoken predominantly in Africa and Asia and addressing relations between texts that are broader than paraphrase. It curates monolingual STR datasets using Best–Worst Scaling and evaluates systems in supervised, unsupervised, and cross-lingual tracks by Spearman correlation with human judgments. Results vary strongly by language and track: data augmentation, adapters, ensembling, and cross-lingual transfer are among the effective strategies, while simple baselines remain competitive in some settings. The datasets and analyses support work on multilingual sentence representations for low-resource languages and inform future STR research and downstream NLP tasks.

Abstract

We present the first shared task on Semantic Textual Relatedness (STR). While earlier shared tasks primarily focused on semantic similarity, we instead investigate the broader phenomenon of semantic relatedness across 14 languages: Afrikaans, Algerian Arabic, Amharic, English, Hausa, Hindi, Indonesian, Kinyarwanda, Marathi, Moroccan Arabic, Modern Standard Arabic, Punjabi, Spanish, and Telugu. These languages originate from five distinct language families and are predominantly spoken in Africa and Asia -- regions characterised by the relatively limited availability of NLP resources. Each instance in the datasets is a sentence pair associated with a score that represents the degree of semantic textual relatedness between the two sentences. Participating systems were asked to rank sentence pairs by their closeness in meaning (i.e., their degree of semantic relatedness) in the 14 languages in three main tracks: (a) supervised, (b) unsupervised, and (c) crosslingual. The task attracted 163 participants. We received 70 submissions in total (across all tasks) from 51 different teams, and 38 system description papers. We report on the best-performing systems as well as the most common and the most effective approaches for the three different tracks.
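
As an illustration of the evaluation setting summarized above, where systems rank sentence pairs and are scored by Spearman correlation with human judgments, the following is a minimal sketch rather than the organizers' scoring script. The function names and the example scores are hypothetical, and the sketch assumes one gold relatedness score and one predicted score per sentence pair.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (undefined if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(gold_scores, predicted_scores):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(gold_scores) != len(predicted_scores):
        raise ValueError("gold and predicted scores must align one-to-one")
    return pearson(average_ranks(gold_scores), average_ranks(predicted_scores))


# Hypothetical scores for four sentence pairs (illustrative values only).
gold = [0.9, 0.1, 0.5, 0.7]
pred = [0.8, 0.2, 0.3, 0.6]
rho = spearman(gold, pred)  # both lists order the pairs identically
```

Because only the ordering of the pairs matters, a system does not need to reproduce the gold scores themselves; any scores that sort the pairs the same way as the human judgments yield the same correlation.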

Paper Structure (71 sections, 1 equation, 7 tables)