Digitally Supported Analysis of Spontaneous Speech (DigiSpon): Benchmarking NLP-Supported Language Sample Analysis of Swiss Children's Speech

Anja Ryser, Yingqiang Gao, Sarah Ebling

TL;DR

This study addresses the labor-intensive nature of language sample analysis (LSA) for diagnosing developmental language disorder (DLD) by proposing a non-LLM natural language processing (NLP) pipeline that preserves data privacy. Using data from 119 Swiss children and clinicians, the authors implement manual transcription and locally deployed ASR, plus POS tagging and morphological analysis, to build semi-automatic DLD feature profiles in Swiss German and Swiss Standard German. They assess zero-shot capabilities of ASR and POS tagging without commercial LLMs, reporting high inter-annotator agreement and reasonable automatic tagging performance, with normalization improving transcription quality. The work demonstrates the feasibility and ethical viability of integrating locally deployed NLP tools into LSA workflows and outlines a clear path for expanding datasets, dialect-aware models, and deeper syntactic analyses for semi-automatic DLD diagnosis in Switzerland.

Abstract

Language sample analysis (LSA) is a process that complements standardized psychometric tests for diagnosing, for example, developmental language disorder (DLD) in children. However, its labor-intensive nature has limited its use in speech-language pathology practice. We introduce an approach that leverages natural language processing (NLP) methods not based on commercial large language models (LLMs), applied to transcribed speech data from 119 children in the German-speaking part of Switzerland with typical and atypical language development. The study aims to identify optimal practices that support speech-language pathologists in diagnosing DLD more efficiently within a human-in-the-loop framework, without relying on potentially unethical implementations that leverage commercial LLMs. Preliminary findings underscore the potential of integrating locally deployed NLP methods into the process of semi-automatic LSA.

Paper Structure

This paper contains 45 sections, 5 figures, 20 tables.

Figures (5)

  • Figure 1: Our pipeline of LSA with non-LLM-based NLP approaches for diagnosis of DLD. (a) Spontaneous speech recording: a speech-language pathologist interacts with the child in a naturalistic setting, and the speech of both is recorded; (b) Speech-to-text transcription: the recordings are (automatically) converted into text, post-corrected by the speech-language pathologist, and then (automatically) tokenized into words; (c) DLD feature profiling and measuring: approaches such as POS tagging, dependency parsing, stemming, and lemmatization are applied to create the DLD feature profiles, from which various linguistic measures are computed to evaluate the children's language abilities. The final diagnostic decision is made by the human speech-language pathologist, taking into account the output of the pipeline as well as other criteria. The speech utterances shown are in Swiss German.
  • Figure 2: Average ASR results on Whisper transcriptions with standard deviations.
  • Figure 3: POS tagging results for the BERT-based POS tagging model and the spaCy model, measured on all transcription data for Swiss German and Swiss Standard German.
  • Figure 4: Editor of the software. After automatically transcribing the recordings, the transcript is opened in the editor. Here, the recordings can be played and the transcript can be corrected.
  • Figure 5: Analysis: after the automatic transcription has been corrected manually, the analysis can be started. Options such as which speakers to analyze and what to filter out allow a personalized analysis that reports values such as mean length of utterance, distribution of part-of-speech tags, and subject-verb agreement, along with additional files that give overviews of all verbs used and of the POS tagging.
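
Two of the measures named in the Figure 5 caption, mean length of utterance and the distribution of part-of-speech tags, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the whitespace tokenization, the word-based (rather than morpheme-based) utterance length, and the example utterances and tags are all assumptions made for illustration.

```python
from collections import Counter


def mean_length_of_utterance(utterances):
    """Mean number of whitespace-separated words per non-empty utterance.

    Assumption: length is counted in words, not morphemes.
    """
    lengths = [len(u.split()) for u in utterances if u.strip()]
    return sum(lengths) / len(lengths) if lengths else 0.0


def pos_distribution(tagged_tokens):
    """Relative frequency of each POS tag in a list of (token, tag) pairs."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}


# Hypothetical, already corrected child utterances (one string per utterance).
utterances = ["ich habe einen Hund", "der Hund schläft", "ja"]
mlu = mean_length_of_utterance(utterances)

# Hypothetical tagger output; in practice the tags would come from a POS tagger.
tagged = [("der", "DET"), ("Hund", "NOUN"), ("schläft", "VERB"), ("Hund", "NOUN")]
distribution = pos_distribution(tagged)
```

In a human-in-the-loop setting, values like these would serve only as input for the speech-language pathologist, who makes the final diagnostic decision.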