Speak & Improve Challenge 2025: Tasks and Baseline Systems

Mengjie Qian; Kate Knill; Stefano Banno; Siyuan Tang; Penny Karanasou; Mark J. F. Gales; Diane Nicholls

TL;DR

The paper introduces the Speak & Improve Challenge 2025 and the Speak & Improve (S&I) Corpus 2025, an annotated L2 English dataset intended to advance spoken language assessment and feedback. It defines four shared tasks, each with a closed and an open track: Automatic Speech Recognition (ASR), Spoken Language Assessment (SLA), Spoken Grammatical Error Correction (SGEC), and Spoken Grammatical Error Correction Feedback (SGECF). Baseline systems are provided to standardize comparisons: a Whisper-based ASR baseline, a cascaded SLA system with BERT-based scoring, a cascaded SGEC system built from disfluency detection (DD) and grammatical error correction (GEC) components, and an SGECF variant that evaluates correction feedback. The data resources are the S&I Corpus 2025 (approximately 315 hours of audio with holistic scores, CEFR A2–C1, including a 55-hour subset with manual transcriptions and error labels), plus external sources such as Switchboard Reannotated for disfluency detection and BEA-2019 for grammatical error correction. The work also sets out evaluation tools and metrics (e.g., SpWER, RMSE, PCC, SRC, WER, TER, M2, and ERRANT F0.5) to support progress in language-learning technology.
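As a rough illustration of three of the metrics named above, the sketch below computes RMSE, the Pearson correlation coefficient (PCC), and the Spearman rank correlation (SRC) between predicted and reference holistic scores. It is a minimal standard-library sketch, not the Challenge's official scoring tool: the function names and the example scores are hypothetical, and the official tools may differ in details such as tie handling or score clipping.

```python
import math


def rmse(pred, ref):
    """Root-mean-square error between predicted and reference scores."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (SRC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical scores for illustration only; not taken from the corpus.
    reference = [2.5, 3.0, 4.0, 5.0]
    predicted = [3.0, 3.0, 4.5, 5.0]
    print("RMSE:", rmse(predicted, reference))
    print("PCC: ", pearson(predicted, reference))
    print("SRC: ", spearman(predicted, reference))
```

Lower RMSE indicates predictions closer to the reference scores, while PCC and SRC measure linear and rank agreement respectively, so a system can rank speakers well (high SRC) and still show a sizeable RMSE if its scores are systematically offset.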

Abstract

This paper presents the "Speak & Improve Challenge 2025: Spoken Language Assessment and Feedback", a challenge associated with the ISCA SLaTE 2025 Workshop. The goal of the challenge is to advance research on spoken language assessment and feedback, with tasks covering both the underlying technology and language learning feedback. Alongside the challenge, the Speak & Improve (S&I) Corpus 2025 is being pre-released. It is a dataset of L2 learner English with holistic scores and language error annotation, collected from open (spontaneous) speaking tests on the Speak & Improve learning platform. The corpus consists of approximately 315 hours of audio data from second language English learners with holistic scores, and a 55-hour subset with manual transcriptions and error labels. The Challenge has four shared tasks: Automatic Speech Recognition (ASR), Spoken Language Assessment (SLA), Spoken Grammatical Error Correction (SGEC), and Spoken Grammatical Error Correction Feedback (SGECF). Each task has a closed track, in which only a predetermined set of models and data sources may be used, and an open track, in which any public resource may be used. Challenge participants may enter one or more of the tasks. This paper describes the challenge, the S&I Corpus 2025, and the baseline systems released for the Challenge.
