WolBanking77: Wolof Banking Speech Intent Classification Dataset

Abdou Karim Kandji, Frédéric Precioso, Cheikh Ba, Samba Ndiaye, Augustin Ndione

TL;DR

WolBanking77 addresses the gap in low-resource language NLP and spoken language understanding (SLU) by delivering Wolof-focused text and speech datasets for banking intents, enabling access to digital services for illiterate populations. The study translates Banking77 into Wolof and French, pairs the text data with a MINDS-14-inspired audio corpus, and benchmarks a range of intent detection (ID) and automatic speech recognition (ASR) models, including zero-shot, few-shot, and fine-tuned setups with multilingual pretrained backbones. Results show strong potential for AfroXLMR in few-shot and fine-tuning scenarios, while ASR performance is competitive with Canary-Flash and Phi-4-multimodal-instruct; nonetheless, Wolof ID remains challenging for current small language models. The dataset, its documentation, and code are released under CC BY 4.0 to spur broader research in Wolof NLP and SLU, supporting inclusive digital access in West Africa.

Abstract

Intent classification models have made significant progress in recent years. However, previous studies primarily focus on high-resource language datasets, which leaves a gap for low-resource languages and for regions with high rates of illiteracy, where languages are more often spoken than read or written. This is the case in Senegal, for example, where Wolof is spoken by around 90% of the population, while the national illiteracy rate remains at 42%. Wolof is spoken by more than 10 million people in the West African region. To address these limitations, we introduce the Wolof Banking Speech Intent Classification Dataset (WolBanking77) for academic research in intent classification. WolBanking77 currently contains 9,791 text sentences in the banking domain and more than 4 hours of spoken sentences. We run experiments on several baselines, including state-of-the-art text and speech models. The results are very promising on the current dataset. In addition, this paper presents an in-depth examination of the dataset's contents. We report baseline F1-scores for NLP models and word error rates for ASR models trained on the WolBanking77 dataset, along with comparisons between models. The dataset and code are available at: https://github.com/abdoukarim/wolbanking77.
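The two metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `word_error_rate` and `macro_f1` are hypothetical, macro averaging of F1 is an assumption (the abstract does not state which averaging is used), and the sample tokens and labels are placeholders rather than dataset entries.

```python
from collections import Counter


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def macro_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    labels = set(y_true) | set(y_pred)
    # F1 = 2*TP / (2*TP + FP + FN); the denominator is never zero here
    # because every label occurs at least once in y_true or y_pred.
    scores = [2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) for c in labels]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Placeholder tokens and labels, not taken from WolBanking77.
    print(word_error_rate("a b c d", "a x c"))
    print(macro_f1(["intent_a", "intent_a", "intent_b", "intent_b"],
                   ["intent_a", "intent_b", "intent_b", "intent_b"]))
```

In practice, established libraries would typically be preferred for both metrics; the sketch only makes the definitions concrete.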

Paper Structure

This paper contains 42 sections, 7 figures, 10 tables.

Figures (7)

  • Figure 1: A query translated into French and Wolof
  • Figure 2: Top 5 most frequent words: kont (account), xaalis (money), bëggoona (I wanted to), jëflante (operation), xam (know).
  • Figure 3: Distribution of speakers by region of Senegal
  • Figure 4: Zero-shot, few-shot and fine-tuning (FT) results (in % for F1-score) of multilingual pre-trained models on WolBanking77.
  • Figure 5: Zero-shot and fine-tuning (FT) results (in %) on multilingual pre-trained models.
  • ...and 2 more figures