Efficient Eye-based Emotion Recognition via Neural Architecture Search of Time-to-First-Spike-Coded Spiking Neural Networks
Qianhui Liu, Jing Yang, Miao Yu, Trevor E. Carlson, Gang Pan, Haizhou Li, Zhumin Chen
TL;DR
This work tackles energy-efficient eye-based emotion recognition for wearables by optimizing time-to-first-spike (TTFS)-coded spiking neural networks (SNNs) with a dedicated neural architecture search framework, TNAS-ER. TNAS-ER introduces an ANN-assisted search: a ReLU ANN counterpart that shares an identity mapping with the TTFS SNN stabilizes training and guides an evolutionary search whose fitness jointly considers weighted and unweighted average recall (WAR and UAR). The framework achieves state-of-the-art accuracy with substantially fewer parameters and operations, and it runs on neuromorphic hardware (YOSO) with a latency of 48 ms and an energy consumption of 0.05 J. These results position architecture-level optimization of TTFS SNNs as a promising route to efficient, real-time, edge-based affective computing on resource-constrained wearables.
Abstract
Eye-based emotion recognition enables eyewear devices to perceive users' emotional states and support emotion-aware interaction, yet deploying such functionality on their resource-limited embedded hardware remains challenging. Time-to-first-spike (TTFS)-coded spiking neural networks (SNNs) offer a promising solution, as each neuron emits at most one binary spike, resulting in extremely sparse and energy-efficient computation. While prior works have primarily focused on improving TTFS SNN training algorithms, the impact of network architecture has been largely overlooked. In this paper, we propose TNAS-ER, the first neural architecture search (NAS) framework tailored to TTFS SNNs for eye-based emotion recognition. TNAS-ER presents a novel ANN-assisted search strategy that leverages a ReLU-based ANN counterpart sharing an identity mapping with the TTFS SNN to guide architecture optimization. TNAS-ER employs an evolutionary algorithm, with weighted and unweighted average recall jointly defined as fitness objectives for emotion recognition. Extensive experiments demonstrate that TNAS-ER achieves high recognition performance with significantly improved efficiency. Furthermore, when deployed on neuromorphic hardware, TNAS-ER attains a low latency of 48 ms and an energy consumption of 0.05 J, confirming its superior energy efficiency and strong potential for practical applications.
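To make the two fitness objectives concrete, the following minimal sketch computes weighted average recall (WAR) and unweighted average recall (UAR) from predicted and true labels, using their standard definitions: WAR weights each class's recall by its sample count, which equals overall accuracy, while UAR is the plain mean of per-class recalls. The function names `war_uar` and `fitness` are hypothetical, and returning the two scores as a tuple is an assumption; the abstract does not state how TNAS-ER combines them inside the evolutionary algorithm.

```python
from collections import Counter


def war_uar(y_true, y_pred):
    """Return (WAR, UAR) for two equal-length sequences of class labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    support = Counter(y_true)   # number of samples per true class
    correct = Counter()         # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            correct[t] += 1

    # WAR: per-class recall weighted by class support (overall accuracy).
    war = sum(correct.values()) / len(y_true)
    # UAR: unweighted mean of per-class recall, so rare classes count equally.
    uar = sum(correct[c] / support[c] for c in support) / len(support)
    return war, uar


def fitness(y_true, y_pred):
    """Hypothetical two-objective fitness: the (WAR, UAR) pair of a candidate."""
    # Assumption: the objectives are kept separate; the source does not say
    # whether they are summed, weighted, or compared by dominance.
    return war_uar(y_true, y_pred)


if __name__ == "__main__":
    # Imbalanced toy data: four samples of class 0, two of class 1.
    y_true = [0, 0, 0, 0, 1, 1]
    y_pred = [0, 0, 0, 0, 1, 0]
    war, uar = fitness(y_true, y_pred)
    print(f"WAR={war:.3f}, UAR={uar:.3f}")
```

On imbalanced emotion classes the two scores diverge: in the toy data the majority class is fully recalled while the minority class is only half recalled, so UAR falls below WAR. This is one reason to track both scores when ranking candidate architectures.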
