On-Device Domain Learning for Keyword Spotting on Low-Power Extreme Edge Embedded Systems
Cristian Cioflan, Lukas Cavigelli, Manuele Rusci, Miguel de Prado, Luca Benini
TL;DR
This work presents a fully on-device domain adaptation approach for keyword spotting on ultra-low-power edge devices. By adapting a pretrained, noise-robust backbone through on-site, noise-aware fine-tuning of a small learnable classifier, it achieves accuracy gains of up to 14% in previously unseen noisy environments while using less than 10 kB of memory. The authors propose resource-aware strategies (partial freezing, data-subset training) and demonstrate the method on the GAP9 platform, completing on-site adaptation in about 14 s with as little as 806 mJ of energy. This enables private, low-power, always-on keyword spotting that remains robust to real-world noise. The work contributes a practical TinyML workflow for on-device learning and domain adaptation, backed by concrete hardware measurements.
Abstract
Keyword spotting accuracy degrades when neural networks are exposed to noisy environments. On-site adaptation to previously unseen noise is crucial to recovering the lost accuracy, and on-device learning is required to ensure that the adaptation process happens entirely on the edge device. In this work, we propose a fully on-device domain adaptation system achieving accuracy gains of up to 14% over already-robust keyword spotting models. We enable on-device learning with less than 10 kB of memory, using only 100 labeled utterances to recover 5% accuracy after adapting to complex speech noise. We demonstrate that domain adaptation can be achieved on ultra-low-power microcontrollers with as little as 806 mJ in only 14 s on always-on, battery-operated devices.
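
The idea of keeping a pretrained backbone frozen and updating only a small classifier on a handful of labeled samples can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic features stand in for the outputs of a frozen backbone on noisy utterances, and the feature dimension (16), the number of classes (4), the learning rate, the epoch count, and all function names are assumptions chosen for illustration. Only the sample count of 100 is taken from the abstract.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_for(x, W, b):
    """Linear classifier: W holds one weight row per class, b one bias per class."""
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]


def mean_loss(feats, labels, W, b):
    """Average cross-entropy of the classifier on (feats, labels)."""
    total = 0.0
    for x, y in zip(feats, labels):
        total -= math.log(softmax(logits_for(x, W, b))[y] + 1e-12)
    return total / len(feats)


def adapt_classifier(feats, labels, W, b, lr=0.05, epochs=20, seed=0):
    """Per-sample SGD on the classifier only; the backbone is never touched.

    Returns updated copies of W and b and leaves the inputs unchanged.
    """
    rng = random.Random(seed)
    W = [row[:] for row in W]
    b = b[:]
    order = list(range(len(feats)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = feats[i], labels[i]
            probs = softmax(logits_for(x, W, b))
            for k, p in enumerate(probs):
                grad = p - (1.0 if k == y else 0.0)  # dLoss/dlogit_k
                b[k] -= lr * grad
                row = W[k]
                for j, xj in enumerate(x):
                    row[j] -= lr * grad * xj
    return W, b


def make_synthetic_features(n=100, dim=16, classes=4, seed=1):
    """Hypothetical stand-in for frozen-backbone features of noisy utterances."""
    rng = random.Random(seed)
    means = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(classes)]
    labels = [rng.randrange(classes) for _ in range(n)]
    feats = [[m + rng.gauss(0, 0.5) for m in means[y]] for y in labels]
    return feats, labels


if __name__ == "__main__":
    feats, labels = make_synthetic_features()
    W0 = [[0.0] * 16 for _ in range(4)]
    b0 = [0.0] * 4
    W1, b1 = adapt_classifier(feats, labels, W0, b0)
    n_params = sum(len(row) for row in W1) + len(b1)
    print("loss before:", mean_loss(feats, labels, W0, b0))
    print("loss after: ", mean_loss(feats, labels, W1, b1))
    print("trainable parameters:", n_params)
```

Because only the classifier weights and biases are updated, the state that must be kept writable during adaptation scales with the classifier size rather than with the backbone, which is the general reason such a scheme can fit a tight memory budget. The actual memory and energy figures reported above depend on the authors' model and the GAP9 hardware and are not reproduced by this sketch.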
