New Semantic Task for the French Spoken Language Understanding MEDIA Benchmark
Nadège Alavoine, Gaëlle Laperriere, Christophe Servan, Sahar Ghannay, Sophie Rosset
TL;DR
The paper adds intent annotations to the MEDIA SLU benchmark to enable joint intent classification and slot filling in French. It uses a semi-automatic tri-training workflow to generate intents for MEDIA 2022 and reports initial SLU experiments with both cascade and end-to-end architectures using multiple French Transformer models. The best tri-training ensemble, based on CamemBERT, reaches an EMR of 92.51 and an F1 of 93.85. End-to-end models benefit from joint optimization and achieve competitive CER reductions, notably around 18.3% on MEDIA 2022 full slot filling. Overall, the work extends MEDIA to richer SLU tasks and offers insights into annotation strategies and architecture choices for French spoken language understanding.
Abstract
Intent classification and slot-filling are essential tasks of Spoken Language Understanding (SLU). In most SLU systems, those tasks are realized by independent modules. For about fifteen years, models achieving both of them jointly and exploiting their mutual enhancement have been proposed. A multilingual module using a joint model was envisioned to create a touristic dialogue system for a European project, HumanE-AI-Net. A combination of multiple datasets, including the MEDIA dataset, was suggested for training this joint model. The MEDIA SLU dataset is a French dataset distributed since 2005 by ELRA, mainly used by the French research community and free for academic research since 2020. Unfortunately, it is annotated only in slots but not intents. An enhanced version of MEDIA annotated with intents has been built to extend its use to more tasks and use cases. This paper presents the semi-automatic methodology used to obtain this enhanced version. In addition, we present the first results of SLU experiments on this enhanced dataset using joint models for intent classification and slot-filling.
