Ethics of Generating Synthetic MRI Vocal Tract Views from the Face

Muhammad Suhaib Shahid, Gleb E. Yakubov, Andrew P. French

TL;DR

The paper investigates external-to-internal correlation modeling (E2ICM) to synthesize MRI-like vocal tract views from facial data using Pix2PixGAN, aiming to reduce reliance on costly real-time MRI. It presents a dual-modal dataset and a GAN-based translation from frontal face views to sagittal vocal-tract MRI frames, with quantitative results showing moderate image similarity (FID ≈ 30.8; SSIM ≈ 0.80) and visible articulator motion but limited spatial fidelity. The study highlights significant ethical considerations—privacy, consent, potential misuse, data storage, and biases—and argues that robust validation and governance are essential before any practical deployment. Overall, the work demonstrates feasibility as a preliminary exploration while underscoring the need for rigorous methodological and ethical safeguards in synthetic medical imaging.
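For readers unfamiliar with the SSIM figure quoted above, the sketch below shows how a structural similarity score between a ground-truth frame and a generated frame can be computed. It is an illustration only, not the authors' evaluation pipeline, which this summary does not describe. It assumes 8-bit grayscale frames given as nested lists and uses a single global window, whereas standard SSIM averages the score over local sliding windows. The function name `global_ssim` and its parameters are hypothetical.

```python
def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Simplified single-window SSIM between two equally sized grayscale frames.

    Assumption: frames are 2D sequences of pixel intensities in [0, data_range].
    Standard SSIM uses local windows; this global variant is for illustration.
    """
    a = [float(v) for row in x for v in row]
    b = [float(v) for row in y for v in row]
    if not a or len(a) != len(b):
        raise ValueError("frames must be non-empty and the same size")

    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((v - mu_a) ** 2 for v in a) / n
    var_b = sum((v - mu_b) ** 2 for v in b) / n
    cov = sum((p - mu_a) * (q - mu_b) for p, q in zip(a, b)) / n

    # Stabilising constants from the usual SSIM definition.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2

    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


# Toy 2x2 "frames": a frame compared with itself and with its inverse.
real_frame = [[0, 255], [255, 0]]
inverted_frame = [[255, 0], [0, 255]]
same_score = global_ssim(real_frame, real_frame)
inverted_score = global_ssim(real_frame, inverted_frame)
```

A score near 1 indicates that the two frames share luminance, contrast, and structure; lower scores indicate structural disagreement. For real evaluation, a windowed implementation from an established image-processing library would be the more appropriate choice.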

Abstract

Forming oral models capable of capturing the complete dynamics of the oral cavity is vital across research areas such as speech correction, designing foods for the aging population, and dentistry. Magnetic resonance imaging (MRI) technologies, capable of capturing the oral data essential for creating such detailed representations, offer a powerful tool for illustrating articulatory dynamics. However, their real-time application is hindered by expense and expertise requirements. Ever-advancing generative AI methods offer a way to address this barrier by leveraging multi-modal approaches to generate pseudo-MRI views. Nonetheless, this immediately raises ethical concerns regarding the utilisation of a technology with the capability to produce MRIs from facial observations. This paper explores the ethical implications of external-to-internal correlation modeling (E2ICM). E2ICM utilises facial movements to infer internal configurations and provides a cost-effective supporting technology for MRI. In this preliminary work, we employ Pix2PixGAN to generate pseudo-MRI views from external articulatory data, demonstrating the feasibility of this approach. Ethical considerations concerning privacy, consent, and potential misuse, which are fundamental to our examination of this innovative methodology, are discussed as a result of this experimentation.

Paper Structure (10 sections, 1 figure)

Figures (1)

  • Figure 1: Sample still frames showing the external view (left), the ground-truth MRI frame (middle), and the generated frame (right).