Polymorph: Energy-Efficient Multi-Label Classification for Video Streams on Embedded Devices
Saeid Ghafouri, Mohsen Fayyaz, Xiangchen Li, Deepu John, Bo Ji, Dimitrios Nikolopoulos, Hans Vandierendonck
TL;DR
Polymorph tackles on-device, real-time multi-label video classification by exploiting three structural properties of video streams: label sparsity, temporal continuity, and label co-occurrence. It deploys context-aware LoRA adapters on a shared backbone and works in two stages: training-time co-occurrence clustering groups labels into compact contexts, and inference-time greedy context detection activates a minimal set of adapters per frame that still covers every active label. By applying LoRA only to the final layers and running the adapters in a parallel, composable forward pass without merging them into the base weights, Polymorph achieves substantial energy-efficiency and accuracy gains on the TAO benchmark (approximately $40\%$ energy reduction and $+9$ mAP points) while maintaining real-time latency on embedded hardware. The approach offers a scalable, flexible framework for edge video analytics that handles large label spaces without full-model switching or duplication.
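The inference-time selection step can be read as a greedy set cover over label contexts. The sketch below is illustrative only and is not taken from the Polymorph codebase: the function name `greedy_context_cover`, the example contexts, and the assumption that the set of active labels is already known for the frame are all hypothetical. Note that a greedy cover is small but not guaranteed to be the smallest possible one.

```python
from typing import Dict, List, Set


def greedy_context_cover(active_labels: Set[str],
                         contexts: Dict[str, Set[str]]) -> List[str]:
    """Return context names whose label sets together cover active_labels.

    Hypothetical helper: each context stands for one adapter, and its value
    is the set of labels that adapter specializes in.
    """
    uncovered = set(active_labels)
    chosen: List[str] = []
    while uncovered:
        if not contexts:
            raise ValueError("no contexts available")
        # Pick the context covering the most still-uncovered labels.
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(contexts),
                   key=lambda name: len(contexts[name] & uncovered))
        gained = contexts[best] & uncovered
        if not gained:
            # Some active label belongs to no context at all.
            raise ValueError(f"no context covers labels: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Made-up contexts, as might result from co-occurrence clustering.
example_contexts = {
    "road": {"car", "bus", "person"},
    "indoor": {"person", "chair", "cup"},
    "animals": {"dog", "cat"},
}
selected = greedy_context_cover({"car", "person", "cup"}, example_contexts)
```

In this example two of the three adapters are enough to cover the frame, so the third stays inactive. A frame with no active labels yields an empty selection.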
Abstract
Real-time multi-label video classification on embedded devices is constrained by limited compute and energy budgets. Yet video streams exhibit structural properties, such as label sparsity, temporal continuity, and label co-occurrence, that can be leveraged for more efficient inference. We introduce Polymorph, a context-aware framework that activates a minimal set of lightweight Low-Rank Adapters (LoRA) per frame. Each adapter specializes in a subset of classes derived from co-occurrence patterns and is implemented as a set of LoRA weights over a shared backbone. At runtime, Polymorph dynamically selects and composes only the adapters needed to cover the active labels, avoiding full-model switching and weight merging. This modular strategy improves scalability while reducing latency and energy overhead. Polymorph achieves 40% lower energy consumption and improves mAP by 9 points over strong baselines on the TAO dataset. Polymorph is open source at https://github.com/inference-serving/polymorph/.
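To make "composing adapters without weight merging" concrete, the following minimal sketch computes one linear layer's output as the base projection plus the sum of the low-rank updates from the active adapters, leaving the base weight untouched. It is an assumption-laden illustration, not the authors' implementation: the names `matvec` and `lora_forward`, the `scale` parameter, and the tiny matrices are hypothetical, and pure-Python lists stand in for the tensor library a real system would use.

```python
from typing import Dict, List, Sequence, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Sequence[float]) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(x: Vector,
                 base_w: Matrix,
                 adapters: Dict[str, Tuple[Matrix, Matrix]],
                 active: Sequence[str],
                 scale: float = 1.0) -> Vector:
    """Compute y = W x + scale * sum_i B_i (A_i x) over the active adapters.

    base_w is only read, never modified, so switching adapters between
    frames needs no merge or un-merge of the backbone weights.
    Each adapter is (A, B) with A of shape (r, d_in) and B of shape (d_out, r).
    """
    y = matvec(base_w, x)
    for name in active:
        a, b = adapters[name]
        low_rank = matvec(a, x)        # project down to rank r
        delta = matvec(b, low_rank)    # project back up to d_out
        y = [yi + scale * di for yi, di in zip(y, delta)]
    return y


# Tiny made-up example: a 2x2 identity backbone and one rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
example_adapters = {"road": ([[1.0, 1.0]], [[2.0], [0.0]])}
x = [1.0, 2.0]
y_base = lora_forward(x, W, example_adapters, active=[])
y_road = lora_forward(x, W, example_adapters, active=["road"])
```

With no adapter active the layer reduces to the plain backbone output. Because each adapter contributes an additive term, several adapters can be evaluated side by side and summed, which is what makes per-frame composition cheap compared with swapping or merging full weight matrices.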
