AURORA Model of Formant-to-Tongue Inversion for Didactic and Clinical Applications

Patrycja Strycharczuk, Sam Kirkham

Abstract

This paper outlines the conceptual and computational foundations of the AURORA (Acoustic Understanding and Real-time Observation of Resonant Articulations) model. AURORA predicts tongue displacement and shape in vowel sounds from the first two formant values. It is intended as a didactic aid that helps explain the relationship between formants and the underlying articulation, and as a foundation for biofeedback applications. The model is informed by ultrasound tongue imaging and acoustic data from 40 native speakers of English. We discuss the motivation for the model, the modelling objectives, and the model architecture. We provide a qualitative evaluation of the model, focusing on selected tongue features. We then present two tools developed to make the model accessible to a wider audience: a Shiny app and a prototype application for real-time tongue biofeedback. Potential users include students of phonetics, linguists in fields adjacent to phonetics, and speech and language therapy practitioners and clients.
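As a rough illustration of what a formant-to-tongue mapping can look like, the sketch below fits a plain linear multivariate regression from (F1, F2) to two tongue descriptors. It is not the AURORA implementation: the linear form, the function names (`fit`, `predict`, `solve_linear_system`), the choice of two output dimensions, the kHz scaling, and all data values are assumptions made for illustration, and the data are synthetic.

```python
# Hypothetical sketch: linear multivariate regression from (F1, F2) to
# tongue descriptors. All values below are synthetic, NOT AURORA data.

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit(formants, targets):
    """Least-squares fit of each target dimension on predictors [1, F1, F2]."""
    X = [[1.0, f1, f2] for f1, f2 in formants]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    coefs = []
    for k in range(len(targets[0])):
        Xty = [sum(r[i] * t[k] for r, t in zip(X, targets)) for i in range(3)]
        coefs.append(solve_linear_system(XtX, Xty))  # normal equations
    return coefs

def predict(coefs, f1, f2):
    """Return one predicted value per output dimension."""
    return [b0 + b1 * f1 + b2 * f2 for b0, b1, b2 in coefs]

# Synthetic formants in kHz (assumed scaling, keeps the system well conditioned).
formants = [(0.30, 2.30), (0.70, 1.20), (0.35, 0.80), (0.60, 1.90), (0.50, 1.50)]
# Made-up generating coefficients for two descriptors (e.g. height, frontness).
true_coefs = [[1.0, -2.0, 0.5], [0.2, 1.5, -0.3]]
targets = [predict(true_coefs, f1, f2) for f1, f2 in formants]

coefs = fit(formants, targets)
prediction = predict(coefs, 0.40, 2.00)
```

Because the synthetic targets are generated without noise, the fit recovers the generating coefficients up to rounding error. With real articulatory data the outputs would be noisy, and a linear form may be too restrictive.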

Paper Structure (10 sections, 1 equation, 5 figures)

Figures (5)

  • Figure 1: Stages of tongue contour reconstruction from the output of multivariate regression.
  • Figure 2: Tongue model predictions. Input $F_1$ and $F_2$ values are indicated on the strip labels. Tongue tip is on the left.
  • Figure 3: Comparison of tongue model predictions with mean tongue shapes for each item in the input data, based on the mean formant values. Tongue tip is on the left.
  • Figure 4: Screenshot of the Shiny app, showing the controls and the predicted output.
  • Figure 5: Screenshots of the acoustic/articulatory biofeedback app showing a production of /i/ (top) and /ɑ/ (bottom). The left panel shows formant tracking based on the input spectrum and the middle panel shows a predicted tongue contour. The right panel shows controls, which can be hidden by the user.