MORA: AI-Mediated Story-Based practice for Speech Sound Disorder from Clinic to Home

Sumin Hong, Xavier Briggs, Qingxiao Zheng, Yao Du, Jinjun Xiong, Toby Jia-jun Li

TL;DR

MORA addresses the need for scalable, engaging home-based SSD practice by embedding target sounds in character-driven interactive stories and coupling them with an AI-in-the-loop workflow for SLP configuration and asynchronous monitoring. The system integrates multisensory cues, feedback mechanisms, and a data-driven dashboard to bridge clinic and home while supporting generalization across linguistic levels. Expert evaluations with licensed SLPs demonstrate alignment with evidence-based SSD interventions and reveal opportunities for literacy and social-language benefits, alongside considerations of privacy and digital literacy. Overall, MORA exemplifies how AI-assisted, expert-in-the-loop design can extend therapeutic reach, reduce clinician workload, and preserve professional autonomy in pediatric speech-language therapy.

Abstract

Speech sound disorder is among the most common communication challenges in preschool children. Home-based practice is essential for effective therapy and for generalizing target sounds, yet sustaining engaging and consistent practice remains difficult. Existing story-based activities, despite their potential for sound generalization and educational benefits, are often underutilized due to limited interactivity. Moreover, many practice tools fail to sufficiently integrate speech-language pathologists (SLPs) into the process, resulting in weak alignment with clinical treatment plans. To address these limitations, we present MORA, an interactive story-based practice system. MORA introduces three key innovations. First, it embeds target sounds and vocabulary into dynamic, character-driven conversational narratives, requiring children to actively produce speech to progress the story, thereby creating natural opportunities for exposure, repetition, and generalization. Second, it provides visual cues, explicit instruction, and feedback, allowing children to practice effectively either independently or with caregivers. Third, it supports an AI-in-the-loop workflow, enabling SLPs to configure target materials, review logged speech with phoneme-level scoring, and adapt therapy plans asynchronously, bridging the gap between clinic and home practice while respecting professional expertise. A formative study with six licensed SLPs informed the system's design rationale, and an expert review with seven SLPs demonstrated strong alignment with established articulation-based treatments, as well as potential to enhance children's engagement and literacy. Furthermore, the discussion highlights design considerations for professional support and configurability and for adaptive, multimodal child interaction, as well as MORA's broader applicability across speech and language disorders.

Paper Structure

This paper contains 43 sections, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Main screen of MORA. (a:1) is the character's dialogue, which asks questions, has together, describes situations, and suggests interaction. Target words are highlighted as clickable green buttons that activate the information in both sidebars. (a:2) The Target Words sidebar contains all the words in the story, highlighting those in the current dialogue. (a:2-1) Each word card supports repeated listening and pronunciation practice. (a:3) The Mouth Shape sidebar shows, for each sound, a picture of the mouth, instructions for making the sound, and a hand gesture tip, with voice support. (a:3-1) How to shape the mouth and place the tongue is shown, and (a:3-2) how to make the phoneme hand gesture is described. (b:4-1) Instruction for children in sentence-mode. (b:4-2) Mad Libs-style fill-in activity in the child's sentence in word-mode. Once the child completes the sentence, it is spoken aloud. (b:5) If the child misses the word, feedback is delivered.
  • Figure 2: Additional features in MORA. (a:6) Children can choose the destination, which affects the narrative. Additional clickable interactions are embedded in the story. (a:7) Short instructions on how parents can support the practice. (b:8) The prompt boxes elicit production during the interaction and offer supplemental practice in the Target Words sidebar. (b:8-1) The word is suggested in word-mode, and (b:8-2) the sentence in sentence-mode.
  • Figure 3: Dashboard screen in MORA. (9) The number of pronunciations per target word. (10) Circle graph of aggregated production quality. (11) Each card contains the following information: (11-1) production quality label; (11-2) phonemic transcription for diagnosis by SLPs; (11-3) replayable audio clip.
  • Figure 4: Boxplot of three phonemic transcription models (Gemini 2.0-Pro, XLSR-TIMIT-B0, and Ginic-gender-split) on human-annotated phonetic transcription (HuTr) and model predictions.
  • Figure 5: Configuration of Stories in Expert Study
  • ...and 1 more figures