Exploring Energy-Based Models for Out-of-Distribution Detection in Dialect Identification
Yaqian Hao, Chenguang Hu, Yingying Gao, Shilei Zhang, Junlan Feng
TL;DR
This work tackles out-of-distribution (OOD) detection in dialect identification, a setting where unseen dialects can degrade performance. It introduces MEJEM, a margin-enhanced joint energy model that combines a discriminative classifier, a generative energy model, and an energy-based margin loss. The model is optimized with the joint objective $L(\theta) = \log p_\theta(y|x) + \lambda_1 \log p_\theta(x) + \lambda_2 L_{e}$ and trained with Sharpness-Aware Minimization. Measured by AUROC and $\text{FPR}_{95}$ across dialect datasets, energy-based OOD scores outperform softmax scores, and combining the generative component with the margin loss yields the strongest OOD performance. Stochastic Gradient Langevin Dynamics (SGLD) sampling is used to support density estimation. Overall, MEJEM makes dialect OOD detection with energy-based models (EBMs) more robust, which is relevant to real-world speech systems and motivates applying energy-based approaches to other language and dialect tasks.
Abstract
The diverse nature of dialects presents challenges for models trained on specific linguistic patterns, rendering them susceptible to errors when confronted with unseen or out-of-distribution (OOD) data. This study introduces a novel margin-enhanced joint energy model (MEJEM) tailored for OOD detection in dialects. By integrating a generative model and the energy margin loss, our approach aims to enhance the robustness of dialect identification systems. Furthermore, we explore two OOD scores for OOD dialect detection, the energy score and the softmax score, and our findings demonstrate that the energy score outperforms the softmax score. We use Sharpness-Aware Minimization to train the joint model, which enhances generalization by minimizing both the loss and its sharpness. Experiments conducted on dialect identification tasks validate the efficacy of energy-based models and provide insights into their performance.
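
The two OOD scores compared above can be illustrated with a minimal sketch. It assumes the formulation common in energy-based OOD detection, where the energy score is the temperature-scaled log-sum-exp of the classifier logits and the softmax score is the maximum softmax probability. The function names, the `temperature` parameter, the sign convention (higher means more in-distribution), and the example logits are illustrative assumptions, not details taken from the paper.

```python
import math


def logsumexp(logits, temperature=1.0):
    """Numerically stable log(sum(exp(z / T))) over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    return m + math.log(sum(math.exp(s - m) for s in scaled))


def energy_score(logits, temperature=1.0):
    """Negative free energy, T * logsumexp(f(x) / T).

    Assumed convention: a higher value suggests in-distribution input.
    """
    return temperature * logsumexp(logits, temperature)


def softmax_score(logits):
    """Maximum softmax probability over the classes."""
    return math.exp(max(logits) - logsumexp(logits))


# Hypothetical logits for a 3-dialect classifier.
flat_high = [5.0, 5.0, 5.0]     # uniformly large logits
flat_low = [-5.0, -5.0, -5.0]   # uniformly small logits

for name, logits in [("flat_high", flat_high), ("flat_low", flat_low)]:
    print(name, softmax_score(logits), energy_score(logits))
```

Both inputs yield the same softmax score, because softmax is invariant to adding a constant to every logit, while their energy scores differ by the size of the shift. This is one intuition for why an energy score can separate inputs that a softmax score cannot.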
