
Dynamic Switch Layers For Unsupervised Learning

Haiguang Li, Usama Pervaiz, Michał Matuszak, Robert Kamara, Gilles Roux, Trausti Thormundsson, Joseph Antognini

TL;DR

The Dynamic Switch Layer (DSL) addresses the power/compute bottleneck of on-device learning by enabling unsupervised, data-driven path routing through a lightweight decoder. By generalizing Gated Compression layers, the DSL induces activation sparsity and dynamic routing, maintaining accuracy while dramatically reducing model size and computation. Integrated into SoundStream, the DSL routes up to 80% of samples to a lightweight path, achieving a 20.9x reduction in parameters and a 12.3x reduction in compute, with up to 26.5% lower latency and 21.4% better power efficiency, without sacrificing downstream performance. These results demonstrate practical impact for real-world, energy-constrained applications and establish the DSL as a versatile building block for efficient on-device unsupervised learning.
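To give a rough feel for how the routing fraction and the per-pass saving might combine, here is a back-of-the-envelope sketch. It assumes, which the summary does not state outright, that the 12.3x figure describes the lightweight path relative to the full path, and it ignores the cost of the Switch itself. The function name `blended_cost_fraction` is hypothetical.

```python
def blended_cost_fraction(light_fraction: float, light_speedup: float) -> float:
    """Average compute per sample, as a fraction of always running the full path.

    Assumptions (not taken from the paper): the full path costs 1 unit, the
    lightweight path costs 1 / light_speedup units, and the routing decision
    itself is free.
    """
    if not 0.0 <= light_fraction <= 1.0:
        raise ValueError("light_fraction must be in [0, 1]")
    if light_speedup <= 0.0:
        raise ValueError("light_speedup must be positive")
    full_share = 1.0 - light_fraction
    light_share = light_fraction / light_speedup
    return full_share + light_share


if __name__ == "__main__":
    # 80% of samples on a path assumed to be 12.3x cheaper than the full path.
    fraction = blended_cost_fraction(0.8, 12.3)
    print(f"average compute relative to full path: {fraction:.3f}")
```

Under these assumptions the average cost is dominated by the 20% of samples that still take the full path, so the routing fraction matters more than further shrinking the lightweight path.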

Abstract

On-device machine learning (ODML) enables intelligent applications on resource-constrained devices. However, power consumption poses a major challenge, forcing a trade-off between model accuracy and power efficiency that often limits model complexity. The previously established Gated Compression (GC) layers offer a solution, enabling power efficiency without sacrificing model performance by selectively gating samples that lack signals of interest. However, their reliance on ground-truth labels limits GC layers to supervised tasks. This work introduces the Dynamic Switch Layer (DSL), which extends the benefits of GC layers to unsupervised learning scenarios while maintaining power efficiency without the need for labeled data. The DSL builds upon the GC architecture, using dynamic pathway selection to adapt model complexity to the innate structure of the data. We integrate the DSL into the SoundStream architecture and demonstrate that, by routing up to 80% of samples through a lightweight pass, we achieve a 12.3x reduction in the amount of computation performed and a 20.9x reduction in model size. This reduces on-device inference latency by up to 26.5% and improves power efficiency by up to 21.4% without impacting model performance.
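The routing idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the logistic Switch, the 0.5 threshold, and the convention that a high score selects the full pass are all assumptions made for illustration, and training is not shown.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def apply_mask(features: Sequence[float], mask: Sequence[float]) -> Vector:
    """Zero out masked activations (mask entries are 0.0 or 1.0)."""
    return [f * m for f, m in zip(features, mask)]


def switch_score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Hypothetical Switch: a logistic score in (0, 1) computed from the activations."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def dsl_forward(
    features: Sequence[float],
    mask: Sequence[float],
    weights: Sequence[float],
    bias: float,
    full_path: Callable[[Vector], Vector],
    light_decoder: Callable[[Vector], Vector],
    threshold: float = 0.5,  # assumed decision threshold
) -> Tuple[str, Vector]:
    """Route one sample through either the full pass or the lightweight pass."""
    masked = apply_mask(features, mask)
    score = switch_score(masked, weights, bias)
    if score >= threshold:
        return "full", full_path(masked)
    return "light", light_decoder(masked)


if __name__ == "__main__":
    # Stand-ins for the expensive remaining layers and the lightweight decoder.
    full = lambda x: [2.0 * v for v in x]
    light = lambda x: [sum(x)]
    sample, mask, weights = [1.0, 2.0, 3.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0]
    print(dsl_forward(sample, mask, weights, 0.0, full, light))
    print(dsl_forward(sample, mask, weights, -10.0, full, light))
```

The point of the sketch is only that the decision is made per sample from the data itself, so no labels are needed at inference time to choose between the two passes.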

Paper Structure

This paper contains 23 sections, 5 equations, 8 figures, and 1 table.

Figures (8)

  • Figure 1: The proposed Dynamic Switch Layer (DSL): (1) Mask for activation sparsity, (2) Switch for path switching/routing, and (3) Lightweight Decoder for generating output from the lightweight branch.
  • Figure 2: Power-Efficiency Enhancement in SoundStream Encoder with DSL Integration. The diagram illustrates the reduction in parameter count and computational load when employing the lightweight pass.
  • Figure 3: Switch Performance for Dynamic Routing: (a) Switch predicted values on speech data and silence data, (b) Switch performance by routing speech samples to different passes, and (c) Switch performance by routing silence samples to different passes.
  • Figure 4: Dynamic Routing of Audio Segments in DSL. Each example shows the Switch's predictive behaviour over time (left) and the corresponding spectrograms with ViSQOL scores for the full, lightweight, and mixed pass outputs (right).
  • Figure 5: Performance for Downstream Use-Cases in Keyword and Speaker Detection. Performance comparison across original full pass, lightweight pass, and the DSL's mixed pass, highlighting the efficiency of dynamic routing in maintaining or enhancing audio quality.
  • ...and 3 more figures