Efficient Feature Compression for Machines with Global Statistics Preservation
Md Eimran Hossain Eimon, Hyomin Choi, Fabien Racapé, Mateen Ulhaq, Velibor Adzic, Hari Kalva, Borko Furht
TL;DR
This work addresses the challenge of compressing intermediate features in split-inference architectures without sacrificing task accuracy. It introduces a Z-score normalization–based scaling mechanism that preserves global statistics by signaling per-frame feature statistics (with periodic refresh), integrated into MPEG's Feature Coding for Machines framework. A simplified signaling variant further reduces overhead by modeling the sum of features as Gaussian and signaling a single parameter set in bf16. Across CTTC benchmarks, the approach achieves substantial bitrate reductions (average ~13–17% depending on variant) while maintaining or improving end-task performance, with especially large gains in object tracking tasks.
Abstract
The split-inference paradigm divides an artificial intelligence (AI) model into two parts, which requires transferring intermediate feature data between them. Effective compression of this feature data is therefore vital. In this paper, we employ Z-score normalization to efficiently recover the compressed feature data at the decoder side. To examine its efficacy, we integrate the proposed method into the latest Feature Coding for Machines (FCM) codec standard under development by the Moving Picture Experts Group (MPEG), where it supersedes the existing scaling method. It both reduces the overhead bits and improves the end-task accuracy. To further reduce the overhead in certain circumstances, we also propose a simplified method. Experiments show that the proposed method reduces bitrate by 17.09% on average across different tasks, and by up to 65.69% for object tracking, without sacrificing task accuracy.
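The idea of restoring global statistics with Z-score normalization can be illustrated with a minimal sketch. It is not the FCM implementation: the function names (`to_bf16`, `feature_stats`, `restore_statistics`), the use of a single mean and standard deviation per feature tensor, and the bf16 conversion by truncation are all assumptions made for illustration.

```python
import numpy as np

def to_bf16(x: float) -> float:
    """Reduce a value to bfloat16 precision (hypothetical helper).

    Keeps the upper 16 bits of the float32 representation, i.e. plain
    truncation rather than round-to-nearest.
    """
    bits = np.array([x], dtype=np.float32).view(np.uint32)
    return float((bits & np.uint32(0xFFFF0000)).view(np.float32)[0])

def feature_stats(features: np.ndarray) -> tuple[float, float]:
    """Encoder side: global mean and standard deviation to be signaled.

    Assumption: one (mean, std) pair per feature tensor, stored in bf16.
    """
    return to_bf16(float(features.mean())), to_bf16(float(features.std()))

def restore_statistics(decoded: np.ndarray, mean: float, std: float,
                       eps: float = 1e-6) -> np.ndarray:
    """Decoder side: Z-score the decoded features, then rescale them so
    that their global mean and std match the signaled values."""
    z = (decoded - decoded.mean()) / (decoded.std() + eps)
    return z * std + mean

# Toy example: compression is modeled as a scale and shift of the features.
rng = np.random.default_rng(0)
original = rng.normal(2.0, 3.0, size=(4, 8, 8))
mean, std = feature_stats(original)
decoded = 0.8 * original + 0.3
restored = restore_statistics(decoded, mean, std)
```

In this toy setting the restored tensor has the signaled mean and standard deviation up to numerical error, while the side information is only two 16-bit values per tensor.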
