IITKGP-ABSP Submission to LRE22: Language Recognition in Low-Resource Settings
Spandan Dey, Md Sahidullah, Goutam Saha
TL;DR
The paper addresses language identification for 14 low-resource African languages under the NIST LRE22 fixed-set constraints, avoiding pre-trained models and relying on diverse augmentations and fusion to compensate for limited data. It combines multiple end-to-end TDNN-based encoders (x-vector TDNN, ECAPA-TDNN, ResNet-TDNN) with a GMM baseline, trained on augmented data and processed through energy-based VAD, MFCC/PLP features, and chunked utterances. Through augmentation-, feature-, and classifier-fusion, the approach achieves an overall EER of $11.43\%$ and a primary cost of $0.4163$ (minCprimary $0.4138$) on the LRE22-dev development set, with per-language analyses revealing varying degrees of difficulty and dialect-level confusion. This work demonstrates that robust LID is feasible in highly resource-constrained settings, highlighting the value of diverse augmentations and system fusion for practical deployment.
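As a rough illustration of the classifier-fusion step, the sketch below combines per-language scores from several systems by weighted averaging. This summary does not state how the fusion is actually computed, so the averaging rule, the function name `fuse_scores`, and the toy scores are all assumptions made for illustration.

```python
import numpy as np

def fuse_scores(system_scores, weights=None):
    """Weighted average of per-language scores from several systems.

    system_scores: list of arrays, each of shape (n_utterances, n_languages).
    weights: optional per-system weights; equal weights if omitted.
    NOTE: hypothetical helper; the paper's actual fusion rule may differ.
    """
    # Stack into shape (n_systems, n_utterances, n_languages).
    stacked = np.stack([np.asarray(s, dtype=float) for s in system_scores])
    if weights is None:
        weights = np.ones(stacked.shape[0])
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalise so the weights sum to 1
    # Contract the system axis: result has shape (n_utterances, n_languages).
    return np.tensordot(weights, stacked, axes=1)

# Toy scores for two systems, two utterances, two languages (made-up values).
sys_a = [[1.0, 0.0], [0.0, 1.0]]
sys_b = [[0.0, 1.0], [0.0, 1.0]]
fused_equal = fuse_scores([sys_a, sys_b])
fused_weighted = fuse_scores([sys_a, sys_b], weights=[3, 1])
predicted = fused_weighted.argmax(axis=1)  # index of the top-scoring language
```

In practice the weights would be tuned on held-out data; equal weights are only the simplest starting point.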
Abstract
This is the detailed system description of the IITKGP-ABSP lab's submission to the NIST Language Recognition Evaluation (LRE) 2022. The objective of LRE22 is to recognize 14 low-resource African languages. Although NIST provided additional training and development data, we develop our systems under additional constraints of an extremely low-resource setting. Our primary fixed-set submission uses only the LRE22 development data, which contains utterances of the 14 target languages. We further restrict our system from using any pre-trained models for feature extraction or classifier fine-tuning. To address the low-resource condition, our system relies on diverse audio augmentations followed by classifier fusion. Abiding by all these constraints, the proposed methods achieve an EER of 11.43% and a cost metric of 0.41 on the LRE22 development set. For users with limited computational resources or limited storage/network capabilities, the proposed system can help achieve efficient LID performance.
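For readers unfamiliar with the EER figure quoted above, the following is a minimal sketch of how an equal error rate can be estimated from detection scores. It assumes trials have already been pooled into target and non-target score lists; it does not reproduce the official NIST scoring tool, and the function name `compute_eer` and the toy scores are hypothetical.

```python
import numpy as np

def compute_eer(target_scores, nontarget_scores):
    """Estimate the equal error rate by sweeping a decision threshold.

    A trial is accepted when its score is >= the threshold.
    NOTE: illustrative helper, not the official LRE22 scoring procedure.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    # Use every observed score as a candidate threshold.
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # Miss rate: fraction of target trials rejected at each threshold.
    p_miss = np.array([(target < t).mean() for t in thresholds])
    # False-alarm rate: fraction of non-target trials accepted.
    p_fa = np.array([(nontarget >= t).mean() for t in thresholds])
    # The EER is where the two error rates are (closest to) equal.
    idx = np.argmin(np.abs(p_miss - p_fa))
    return (p_miss[idx] + p_fa[idx]) / 2.0

# Toy example with made-up scores.
eer_overlap = compute_eer([0.9, 0.8, 0.7, 0.3], [0.6, 0.4, 0.2, 0.1])
eer_separable = compute_eer([0.9, 0.8], [0.1, 0.2])
```

With only a handful of trials the two error curves may not cross exactly, which is why the sketch averages the miss and false-alarm rates at the closest threshold.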
