
Automatic Speech Recognition for Documenting Endangered Languages: Case Study of Ikema Miyakoan

Chihiro Taguchi, Yukinori Takubo, David Chiang

Abstract

Language endangerment poses a major challenge to linguistic diversity worldwide, and technological advances have opened new avenues for documentation and revitalization. Among these, automatic speech recognition (ASR) has shown increasing potential to assist in the transcription of endangered language data. This study focuses on Ikema, a severely endangered Ryukyuan language spoken in Okinawa, Japan, with approximately 1,300 remaining speakers, most of whom are over 60 years old. We present an ongoing effort to develop an ASR system for Ikema based on field recordings. Specifically, we (1) construct a {\totaldatasethours}-hour speech corpus from field recordings, (2) train an ASR model that achieves a character error rate as low as 15%, and (3) evaluate the impact of ASR assistance on the efficiency of speech transcription. Our results demonstrate that ASR integration can substantially reduce transcription time and cognitive load, offering a practical pathway toward scalable, technology-supported documentation of endangered languages.
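The character error rate (CER) reported above is the standard edit-distance metric for ASR evaluation: the Levenshtein distance between the reference and hypothesis transcripts, divided by the reference length. A minimal sketch of the computation (an illustration, not the authors' implementation):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance / reference length."""
    r, h = list(reference), list(hypothesis)
    # Dynamic-programming table for edit distance.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # deleting all reference characters
    for j in range(len(h) + 1):
        d[0][j] = j  # inserting all hypothesis characters
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return d[len(r)][len(h)] / len(r)

# One substitution out of six characters -> CER of 1/6.
print(cer("kitten", "sitten"))
```

A 15% CER thus means roughly one character-level error (insertion, deletion, or substitution) for every seven characters of reference transcript.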

Paper Structure

This paper contains 16 sections, 4 figures, and 4 tables.

Figures (4)

  • Figure 1: A classification of Japonic languages and Ikema's position within it. The classification is largely based on shimoji-2008-grammar-irabu and pellard-2015-linguistic.
  • Figure 2: Map of the Miyako Islands (bottom) in Japan (top). The Ikema-speaking villages are marked with a red circle.
  • Figure 3: Segmentation and annotation in ELAN. The top transcription tier contains pause-based segments, while the bottom tier contains longer segments that concatenate multiple pause-based segments to approximate sentence-level units.
  • Figure 4: CER curves on the validation data during training.