Efficient Feature Compression for Machines with Global Statistics Preservation

Md Eimran Hossain Eimon, Hyomin Choi, Fabien Racapé, Mateen Ulhaq, Velibor Adzic, Hari Kalva, Borko Furht

TL;DR

This work addresses the challenge of compressing intermediate features in split-inference architectures without sacrificing task accuracy. It introduces a Z-score normalization–based scaling mechanism that preserves global statistics by signaling per-frame feature statistics (with periodic refresh), integrated into MPEG's Feature Coding for Machines framework. A simplified signaling variant further reduces overhead by modeling the sum of features as Gaussian and signaling a single parameter set in bf16. Across CTTC benchmarks, the approach achieves substantial bitrate reductions (average ~13–17% depending on variant) while maintaining or improving end-task performance, with especially large gains in object tracking tasks.
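
As a rough illustration of the signaling cost in the simplified variant, the sketch below packs two statistics as bf16 values, for 4 bytes of overhead in total. The function names and the choice of a (mean, standard deviation) pair as the "parameter set" are assumptions made for this example; the actual FCM bitstream syntax is not described in this summary.

```python
import struct

def float_to_bf16_bits(value: float) -> int:
    """Return the 16-bit bfloat16 pattern of a float (round-to-nearest-even)."""
    if value != value:  # NaN: return a canonical quiet-NaN pattern
        return 0x7FC0
    bits32 = struct.unpack(">I", struct.pack(">f", value))[0]
    # bf16 keeps the upper 16 bits of float32; add a bias so the
    # discarded lower 16 bits round to nearest, ties to even.
    rounding = 0x7FFF + ((bits32 >> 16) & 1)
    return ((bits32 + rounding) >> 16) & 0xFFFF

def bf16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    return struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))[0]

def pack_stats(mean: float, std: float) -> bytes:
    """Hypothetical payload: mean and std as two big-endian bf16 values."""
    return struct.pack(">HH", float_to_bf16_bits(mean), float_to_bf16_bits(std))

def unpack_stats(payload: bytes):
    """Inverse of pack_stats; returns (mean, std) at bf16 precision."""
    mean_bits, std_bits = struct.unpack(">HH", payload)
    return bf16_bits_to_float(mean_bits), bf16_bits_to_float(std_bits)
```

Because bf16 keeps the float32 exponent range but only 8 bits of significand precision, the recovered statistics carry a relative error of at most about 2^-8, which is the trade-off accepted in exchange for halving the size of each signaled value.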

Abstract

The split-inference paradigm divides an artificial intelligence (AI) model into two parts, which necessitates the transfer of intermediate feature data between the two halves. Effective compression of the feature data is therefore vital. In this paper, we employ Z-score normalization to efficiently recover the compressed feature data at the decoder side. To examine its efficacy, we integrate the proposed method into the latest Feature Coding for Machines (FCM) codec standard under development by the Moving Picture Experts Group (MPEG). Our method supersedes the scaling method used by the current draft of the standard: it both reduces the overhead bits and improves the end-task accuracy. To further reduce the overhead in certain circumstances, we also propose a simplified method. Experiments show that our proposed method reduces bitrate by 17.09% on average across different tasks, and by up to 65.69% for object tracking, without sacrificing task accuracy.
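
The decoder-side idea can be sketched as follows, assuming that "recovering" the features means Z-scoring the decoded values and rescaling them to the signaled statistics of the original features. This is only one plausible reading of the abstract; the function names are hypothetical and the paper's own equations are not reproduced here.

```python
from statistics import fmean, pstdev

def global_stats(features):
    """Mean and population standard deviation over all feature values."""
    return fmean(features), pstdev(features)

def restore_global_stats(decoded, target_mean, target_std, eps=1e-12):
    """Z-score the decoded features, then rescale to the signaled statistics.

    Assumed form: x_restored = (x_dec - mean_dec) / std_dec * target_std + target_mean
    """
    dec_mean, dec_std = global_stats(decoded)
    scale = target_std / max(dec_std, eps)  # eps guards against constant input
    return [(v - dec_mean) * scale + target_mean for v in decoded]

# Encoder side: measure the statistics of the original features and signal them.
original = [0.5, -1.2, 3.0, 0.1, 2.4]
signaled_mean, signaled_std = global_stats(original)

# Toy stand-in for codec distortion: a gain and offset applied to the features.
decoded = [0.8 * v + 0.3 for v in original]

# Decoder side: restore the global mean and standard deviation.
restored = restore_global_stats(decoded, signaled_mean, signaled_std)
```

In this toy case the distortion is purely affine, so the restored values match the originals up to floating-point error. Real codec distortion is not affine, and in that case only the global mean and standard deviation are guaranteed to match.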

Paper Structure

This paper contains 12 sections, 3 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: An example of the split-inference approach for segmentation with feature coding for machines (FCM) encoder and decoder.
  • Figure 2: A brief overview of the FCM codec pipeline.
  • Figure 3: Modified FCM decoder with our method for global statistics preservation. Colored blocks are those newly added or changed by the proposed method.
  • Figure 4: Comparison of rate-accuracy curves between various coding methods on different tasks. Note that the legend is shared.
  • Figure 5: Visual comparison of inference results overlaid on the input frame. The first column shows the ground truth. The second and third columns show the inference results obtained from features compressed with FCTM-3.2 and with our method, respectively. The first two rows present tracking outputs on TVD-03 and HiEve-13, respectively, at the sample frame whose number is given in brackets. The last row shows the bus detected in the Traffic video sequence. The overall coding performance for each sequence is given as [kbps, MOTA/mAP].