Fotheidil: an Automatic Transcription System for the Irish Language

Liam Lonergan, Ibon Saratxaga, John Sloan, Oscar Maharog, Mengjie Qian, Neasa Ní Chiaráin, Christer Gobl, Ailbhe Ní Chasaide

TL;DR

Fotheidil addresses the scarcity of resources for Irish-language ASR with a freely available, web-based transcription platform. It combines a modular ASR system improved through semi-supervised learning (SSL), speaker diarisation, and a novel sequence-to-sequence capitalisation and punctuation restoration (C&PR) module. The work reports substantial improvements in speech recognition on out-of-domain test sets and on dialects underrepresented in the supervised training data, with particularly large gains for Ulster speakers, and shows that a transformer-based C&PR system can turn raw ASR output into readable, capitalised and punctuated transcripts. Together, these contributions offer a practical, community-driven pipeline that lowers barriers to Irish transcription and supports incremental improvement through user-corrected transcripts. The system's CPU-friendly architecture and public availability make Fotheidil a valuable resource for researchers, language communities, and digital platforms that need Irish transcription and text normalisation.

Abstract

This paper presents Fotheidil, the first web-based transcription system for the Irish language, which utilises speech-related AI technologies as part of the ABAIR initiative. The system combines off-the-shelf pre-trained voice activity detection and speaker diarisation models with models trained specifically for Irish automatic speech recognition (ASR) and for capitalisation and punctuation restoration. Semi-supervised learning is explored to improve the acoustic model of a modular TDNN-HMM ASR system, yielding substantial improvements for out-of-domain test sets and for dialects that are underrepresented in the supervised training set. A novel approach to capitalisation and punctuation restoration involving sequence-to-sequence models is compared with the conventional approach using a classification model. Here too, experimental results show substantial improvements in performance. The system will be made freely available for public use, and represents an important resource for researchers and others who transcribe Irish language materials. Human-corrected transcriptions will be collected and added to the training dataset as the system is used, which should lead to incremental improvements to the ASR model in a cyclical, community-driven fashion.
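
The two formulations of capitalisation and punctuation restoration contrasted in the abstract can be illustrated with a minimal sketch of how training examples might be derived from already-punctuated text. This is not the authors' data pipeline: the function names, the label set (`LOWER`, `CAP`, `ALL_CAPS`, `MIXED`, `O`) and the punctuation inventory are assumptions chosen for illustration. The classification view assigns each token a capitalisation label and a punctuation label, while the sequence-to-sequence view simply pairs the raw, ASR-like string with the rich target string. The example sentence contains an eclipsed place name (nGaillimh), where the capital falls on the second letter, a case that a simple first-letter label cannot express.

```python
# Illustrative only: label names and punctuation set are assumptions,
# not the scheme used in the paper.
PUNCT = ".,?!"


def split_token(token):
    """Split a token into its word part and at most one trailing punctuation mark."""
    core = token.rstrip(PUNCT)
    punct = token[len(core):][:1] or "O"  # "O" means no punctuation follows
    return core, punct


def cap_label(word):
    """Capitalisation class for one word."""
    if word[:1].isupper():
        return "ALL_CAPS" if len(word) > 1 and word.isupper() else "CAP"
    if any(ch.isupper() for ch in word):
        return "MIXED"  # e.g. Irish eclipsis: nGaillimh, mBaile
    return "LOWER"


def classification_labels(rich_text):
    """Per-token (raw_token, capitalisation, punctuation) triples for a classifier."""
    labels = []
    for token in rich_text.split():
        core, punct = split_token(token)
        if core:
            labels.append((core.lower(), cap_label(core), punct))
    return labels


def seq2seq_pair(rich_text):
    """(source, target) pair: lowercased, unpunctuated text mapped to the rich text."""
    source = " ".join(raw for raw, _, _ in classification_labels(rich_text))
    return source, rich_text
```

For the sentence "Tá mé i mo chónaí i nGaillimh.", `seq2seq_pair` gives the source "tá mé i mo chónaí i ngaillimh" with the original sentence as target, whereas `classification_labels` must fall back on the extra `MIXED` class for the final word. A sequence-to-sequence model avoids enumerating such classes because it generates the target characters directly.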

Paper Structure (22 sections, 2 figures, 11 tables)