Noise-Robustness Through Noise: A Framework combining Asymmetric LoRA with Poisoning MoE

Zhaokun Wang, Jinyu Guo, Jingwen Pu, Lingfeng Chen, Hongli Pu, Jie Ou, Libo Qin, Wenhong Tian

TL;DR

Drawing inspiration from the mixture-of-experts architecture, LoPE strategically integrates a dedicated poisoning expert in an asymmetric LoRA configuration and performs noise injection on the poisoning expert during fine-tuning to enhance its noise discrimination and processing ability.

Abstract

Current parameter-efficient fine-tuning methods for adapting pre-trained language models to downstream tasks are susceptible to interference from noisy data. Conventional noise-handling approaches either rely on laborious data pre-processing or employ model architecture modifications that are prone to error accumulation. In contrast to existing noise-handling paradigms, we propose a noise-robust adaptation method via asymmetric LoRA poisoning experts (LoPE), a novel framework that enhances model robustness to noise using only generated noisy data. Drawing inspiration from the mixture-of-experts architecture, LoPE strategically integrates a dedicated poisoning expert into an asymmetric LoRA configuration. Through a two-stage paradigm, LoPE injects noise into the poisoning expert during fine-tuning to enhance its ability to discriminate and process noise. During inference, we selectively mask the dedicated poisoning expert so that the purified knowledge acquired by the normal experts produces noise-robust output. Extensive experiments demonstrate that LoPE achieves strong performance and robustness purely through low-cost noise injection, which completely eliminates the need for data cleaning.
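The inference-time masking described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class name `AsymmetricLoRASketch`, the softmax router, the random initialization, and the choice to renormalize gate weights over the remaining experts after masking are all assumptions made for illustration. The sketch only shows the general shape of an asymmetric LoRA layer (one shared down-projection `A`, several expert up-projections `B`) in which one expert is designated as the poisoning expert and excluded at inference.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax; entries set to -inf receive weight 0."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class AsymmetricLoRASketch:
    """Hypothetical layer: frozen W, one shared A, several expert B matrices."""

    def __init__(self, d_in, d_out, rank, n_experts, poison_idx, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(d_in, d_out))                    # frozen pre-trained weight
        self.A = rng.normal(size=(d_in, rank)) * 0.1               # shared down-projection
        self.B = rng.normal(size=(n_experts, rank, d_out)) * 0.1   # one up-projection per expert
        self.router = rng.normal(size=(d_in, n_experts)) * 0.1     # assumed softmax router
        self.poison_idx = poison_idx                               # index of the poisoning expert

    def forward(self, x, mask_poison=False):
        logits = x @ self.router                                   # (batch, n_experts)
        if mask_poison:
            # Assumption: masking removes the poisoning expert from routing,
            # and the softmax renormalizes over the remaining experts.
            logits = logits.copy()
            logits[:, self.poison_idx] = -np.inf
        gates = softmax(logits)                                    # (batch, n_experts)
        h = x @ self.A                                             # (batch, rank)
        expert_out = np.einsum("br,erd->bed", h, self.B)           # (batch, n_experts, d_out)
        delta = np.einsum("be,bed->bd", gates, expert_out)         # gated low-rank update
        return x @ self.W + delta


layer = AsymmetricLoRASketch(d_in=8, d_out=4, rank=2, n_experts=3, poison_idx=2)
x = np.random.default_rng(1).normal(size=(5, 8))
y_train_like = layer.forward(x)                    # all experts contribute
y_inference = layer.forward(x, mask_poison=True)   # poisoning expert masked out
```

With `mask_poison=True`, the output no longer depends on the poisoning expert's weights `B[poison_idx]`, which is the property the masking step relies on.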

Paper Structure

This paper contains 39 sections, 4 equations, 6 figures, 15 tables.

Figures (6)

  • Figure 1: The configuration of the asymmetric LoRA architecture in our approach. The grey area on the right represents the mask applied to the poisoning expert, which removes the noise-affected knowledge learned by that expert.
  • Figure 2: The LoPE pipeline consists of a fine-tuning stage and an inference stage. Fine-tuning Stage I: HyNoIse is applied to enhance the poisoning expert's understanding of noise. Fine-tuning Stage II: The poisoning expert is frozen while the remaining experts and the shared matrix are fine-tuned. Inference stage: The poisoning expert, which holds noise-affected knowledge, is masked so that the remaining experts generate outputs that are relatively robust to noise.
  • Figure 3: The relationship between the number of experts involved in the DyCompEnSate method and the average accuracy (%) on the PIQA and SIQA datasets.
  • Figure 4: Average accuracy at different HyNoIse ratios on the PIQA and SIQA datasets.
  • Figure 5: Comparison of the average parameter changes of matrices $A$ and $B$ during first-stage fine-tuning, with and without HyNoIse. Specifically, $A$ and $B$ denote the matrix $A$ and the poisoning expert $B_D$ trained without HyNoIse, while $A^\prime$ and $B^\prime$ denote the matrix $A$ and the poisoning expert $B_D$ trained with HyNoIse.
  • ...and 1 more figures