A Contrastive Learning Approach to Mitigate Bias in Speech Models

Alkis Koudounas, Flavio Giobergia, Eliana Pastor, Elena Baralis

TL;DR

A three-level contrastive learning technique guides the model to apply the contrastive loss at three scopes (the task, the subgroups, and the errors within subgroups), reducing model bias and improving performance.

Abstract

Speech models may be affected by performance imbalance in different population subgroups, raising concerns about fair treatment across these groups. Prior attempts to mitigate unfairness either focus on user-defined subgroups, potentially overlooking other affected subgroups, or do not explicitly improve the internal representation at the subgroup level. This paper proposes the first adoption of contrastive learning to mitigate speech model bias in underperforming subgroups. We employ a three-level learning technique that guides the model in focusing on different scopes for the contrastive loss, i.e., task, subgroup, and the errors within subgroups. The experiments on two spoken language understanding datasets and two languages demonstrate that our approach improves internal subgroup representations, thus reducing model bias and enhancing performance.
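
To make the three scopes concrete, the following is a minimal sketch, not the authors' implementation. It assumes a standard supervised contrastive loss in which samples sharing a key are treated as positives, and it applies that loss three times with different keys: the task label, the subgroup id, and the (subgroup, correct/incorrect) pair. The exact pair definitions, the weighting of the terms, the temperature, and the names `supcon_loss` and `three_level_loss` are all illustrative assumptions that do not come from the abstract.

```python
import numpy as np


def supcon_loss(z, keys, temperature=0.1):
    """Supervised contrastive loss; samples that share a key are positives.

    z: (n, d) array of embeddings, n >= 2. keys: (n,) integer array.
    Anchors without any positive in the batch are skipped.
    """
    z = np.asarray(z, dtype=float)
    keys = np.asarray(keys)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    n = len(z)
    not_self = ~np.eye(n, dtype=bool)
    pos = (keys[:, None] == keys[None, :]) & not_self

    # Log-softmax over all other samples in the batch (self excluded).
    sim = np.where(not_self, z @ z.T / temperature, -np.inf)
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - m - np.log(np.exp(sim - m).sum(axis=1, keepdims=True))

    n_pos = pos.sum(axis=1)
    has_pos = n_pos > 0
    if not has_pos.any():
        return 0.0
    pos_log_prob = np.where(pos, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[has_pos] / n_pos[has_pos]).mean())


def three_level_loss(z, task, subgroup, correct,
                     weights=(1.0, 1.0, 1.0), temperature=0.1):
    """Weighted sum of three contrastive terms (assumed pair definitions)."""
    task = np.asarray(task)
    subgroup = np.asarray(subgroup)
    correct = np.asarray(correct).astype(int)
    levels = (
        task,                    # task scope: same label
        subgroup,                # subgroup scope: same subgroup
        subgroup * 2 + correct,  # error scope: same subgroup and same outcome
    )
    return sum(w * supcon_loss(z, k, temperature)
               for w, k in zip(weights, levels))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(8, 16))            # hypothetical batch of embeddings
    task = rng.integers(0, 3, size=8)       # hypothetical intent labels
    subgroup = rng.integers(0, 2, size=8)   # hypothetical subgroup ids
    correct = rng.integers(0, 2, size=8)    # 1 if the model predicted correctly
    print(three_level_loss(z, task, subgroup, correct))
```

In this sketch, each term pulls together embeddings that share a key and pushes apart the rest, so changing the key is all that changes the scope of the loss.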

Paper Structure

This paper contains 8 sections, 2 equations, 2 figures, and 2 tables.

Figures (2)

  • Figure 1: Summary of the action of the three contrastive loss terms on a toy example comprising 2 subgroups (A, B) and a binary classification task (square/triangle).
  • Figure 2: FSC. The 5 most frequent intents (main color) and the 3 most frequent subgroups (shades of the same color). t-SNE visualization of the original model (left) and CLUES (right). Correct samples are shown as circles, incorrect ones as crosses. Best viewed in color.