SLURP: A Spoken Language Understanding Resource Package
Emanuele Bastianelli, Andrea Vanzo, Pawel Swietojanski, Verena Rieser
TL;DR
SLURP delivers a large, multi-domain SLU dataset with 72k audio examples and three-tier semantic annotations (Scenario, Action, Entities), coupled with a novel SLU-F1 metric to better capture ASR-NLU error propagation. The authors show SLURP is more linguistically and semantically diverse than prior resources and that modular (pipeline) SLU baselines currently outperform end-to-end approaches on this data. They also demonstrate that a top-down, scenario-first decoding strategy is more robust to transcription noise than bottom-up entity-first methods. The work provides extensive linguistic analyses, introduces synthetic data augmentation, and emphasizes error analysis as a practical tool for improving SLU systems, while outlining future steps toward spontaneous speech data.
Abstract
Spoken Language Understanding infers semantic meaning directly from audio data, and thus promises to reduce error propagation and misunderstandings in end-user applications. However, publicly available SLU resources are limited. In this paper, we release SLURP, a new SLU package containing the following: (1) A new challenging dataset in English spanning 18 domains, which is substantially bigger and linguistically more diverse than existing datasets; (2) Competitive baselines based on state-of-the-art NLU and ASR systems; (3) A new transparent metric for entity labelling which enables a detailed error analysis for identifying potential areas of improvement. SLURP is available at https://github.com/pswietojanski/slurp.
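To make the three annotation tiers named in the TL;DR (Scenario, Action, Entities) concrete, the following minimal sketch shows one way a single annotated utterance could be represented. The class names, field names, the example utterance, and the convention of joining scenario and action into a single intent string are illustrative assumptions, not the schema of the released files.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """One labelled span: an entity type plus the words that fill it (hypothetical fields)."""
    type: str
    filler: str


@dataclass
class Annotation:
    """Three-tier annotation for one utterance: scenario, action, entities."""
    scenario: str
    action: str
    entities: List[Entity] = field(default_factory=list)

    @property
    def intent(self) -> str:
        # Assumption: an intent label is the scenario and action joined by "_".
        return f"{self.scenario}_{self.action}"


# Illustrative utterance: "wake me up at five am this week"
example = Annotation(
    scenario="alarm",
    action="set",
    entities=[Entity("time", "five am"), Entity("date", "this week")],
)

print(example.intent)
print([(e.type, e.filler) for e in example.entities])
```

Keeping scenario and action as separate fields lets a system be scored on each tier independently, which matches the summary's point that errors can be analysed per annotation level.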
