Lamer-SSL: Layer-aware Mixture of LoRA Experts for Continual Multilingual Expansion of Self-supervised Models without Forgetting

Jing Xu, Minglin Wu, Xueyuan Chen, Xixin Wu, Helen Meng

TL;DR

Lamer-SSL is a parameter-efficient framework that integrates a Layer-Aware MixturE of LoRA Experts (Lamer) module with a replay strategy. The replay strategy retains prior knowledge using minimal data, mitigating forgetting during continual training.

Abstract

Despite their impressive performance, self-supervised speech models often struggle to generalize to new languages and tend to forget previously acquired knowledge during continual training. To address this, we propose Lamer-SSL, a parameter-efficient framework that integrates a Layer-Aware MixturE of LoRA Experts (Lamer) module with a replay strategy. The Lamer module enables flexible balancing between shared and language-specific representations, while layer-aware expert allocation assigns more experts to deeper layers, where semantic information is richer. Meanwhile, the replay strategy retains prior knowledge using minimal data, mitigating forgetting during continual training. Experiments on automatic speech recognition (ASR) and language identification (LID) demonstrate that Lamer-SSL extends self-supervised models to new languages effectively while maintaining strong performance on previously learned languages, with only 2.14% of the parameters trainable.
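The two ideas named in the abstract, layer-aware expert allocation and replay with minimal data, can be illustrated with a small sketch. The abstract does not give the allocation rule or the replay sampling procedure, so everything here is an assumption made for illustration: the linear schedule in `allocate_experts`, the uniform random sampling in `build_replay_mix`, and all function and parameter names are hypothetical and are not taken from the paper.

```python
import random


def allocate_experts(num_layers, min_experts, max_experts):
    """Assign an expert count to each layer, growing with depth.

    Assumed schedule: a linear ramp from min_experts (first layer)
    to max_experts (last layer), rounded to whole experts.
    """
    if num_layers == 1:
        return [max_experts]
    step = (max_experts - min_experts) / (num_layers - 1)
    return [round(min_experts + step * i) for i in range(num_layers)]


def build_replay_mix(new_data, old_data, replay_size, seed=0):
    """Mix all new-language samples with a small subset of old-language samples.

    Assumed strategy: draw replay_size old samples uniformly at random,
    then shuffle them together with the new samples.
    """
    rng = random.Random(seed)
    k = min(replay_size, len(old_data))
    replay = rng.sample(old_data, k)
    mixed = list(new_data) + replay
    rng.shuffle(mixed)
    return mixed


# Usage: 12 Transformer layers, 2 experts at the bottom, 8 at the top.
experts_per_layer = allocate_experts(num_layers=12, min_experts=2, max_experts=8)

# Usage: 20 new-language utterances plus 5 replayed old-language utterances.
new_utts = [("new", i) for i in range(20)]
old_utts = [("old", i) for i in range(100)]
train_set = build_replay_mix(new_utts, old_utts, replay_size=5)
```

Keeping the replay budget as an explicit count makes the "minimal data" trade-off easy to vary; a fraction of the old dataset would work equally well.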

Paper Structure

This paper contains 18 sections, 8 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Overview of Lamer-SSL. (a) Architecture of HuBERT-based SSL models. (b) Transformer block with the Lamer module. (c) Architecture of the Lamer module: the router selects the Top-K experts based on the input. Only the LoRA experts and the router are trainable.
  • Figure 2: Expert activation weights across languages at four layers.
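
The routing described in the Figure 1(c) caption can be sketched as a frozen linear layer with several LoRA experts and a router that activates only the Top-K of them per input. This is a minimal pure-Python illustration, not the authors' implementation. The class names (`LoRAExpert`, `MoLoRALinear`), the softmax over the selected logits, the initialization values, and the omission of any shared experts are all assumptions.

```python
import math
import random


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LoRAExpert:
    """Low-rank update B @ A; B starts at zero, so the expert is a no-op at init."""

    def __init__(self, d_in, d_out, rank, rng, scale=1.0):
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = scale

    def __call__(self, x):
        return [self.scale * v for v in matvec(self.B, matvec(self.A, x))]


class MoLoRALinear:
    """Frozen weight plus a Top-K mixture of LoRA experts (assumed gating)."""

    def __init__(self, frozen_weight, num_experts, rank, top_k, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(frozen_weight), len(frozen_weight[0])
        self.W = frozen_weight  # pretrained weight, never updated
        # Trainable parts: the router and the experts.
        self.router = [[rng.gauss(0.0, 0.02) for _ in range(d_in)]
                       for _ in range(num_experts)]
        self.experts = [LoRAExpert(d_in, d_out, rank, rng)
                        for _ in range(num_experts)]
        self.top_k = top_k

    def route(self, x):
        """Return (expert_index, weight) pairs for the Top-K experts."""
        logits = matvec(self.router, x)
        top = sorted(range(len(logits)), key=lambda i: logits[i],
                     reverse=True)[:self.top_k]
        weights = softmax([logits[i] for i in top])  # renormalize over the Top-K
        return list(zip(top, weights))

    def forward(self, x):
        y = matvec(self.W, x)  # frozen path
        for idx, weight in self.route(x):
            delta = self.experts[idx](x)
            y = [yi + weight * di for yi, di in zip(y, delta)]
        return y


# Usage: a 3 -> 2 frozen projection with 4 experts, 2 active per input.
W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
layer = MoLoRALinear(W, num_experts=4, rank=2, top_k=2)
x = [0.5, -1.0, 2.0]
routes = layer.route(x)
y = layer.forward(x)
```

Because each expert's `B` matrix starts at zero, the layer reproduces the frozen model's output before any training, which is the usual LoRA starting point. The per-layer `num_experts` is where a layer-aware allocation would plug in.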