Enabling Next-Generation Consumer Experience with Feature Coding for Machines

Md Eimran Hossain Eimon, Juan Merlos, Ashan Perera, Hari Kalva, Velibor Adzic, Borko Furht

TL;DR

The paper addresses the inefficiency of remote inference for large neural networks on resource-constrained devices by introducing MPEG's Feature Coding for Machines (FCM), a standard for compressing, transmitting, and restoring intermediate neural features in split inference. It presents a complete toolchain for offloading neural computations efficiently: feature reduction (including temporal down-sampling and multi-scale fusion with FENet), feature conversion (packing and quantization), feature encoding/decoding via VVC, and feature restoration (DRNet with multi-scale branches and temporal up-sampling). Experimental results show a BD-rate reduction of 75.90% compared to remote inference across multiple tasks and datasets, while maintaining end-task accuracy. The study also analyzes computational complexity and finds encoder costs to be high relative to edge CNNs, which motivates future work toward task-agnostic, unified reduction/restoration to maximize offloading benefits.
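To make the feature conversion step (packing and quantization) more concrete, the following is a minimal sketch, not the normative FCM procedure. It assumes a row-major tiling of channels into one monochrome frame and a min-max uniform quantizer at 10 bits; the function names, the tiling order, and the bit depth are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def pack_channels(feat, cols):
    """Tile a (C, H, W) feature tensor into one 2D frame (assumed row-major tiling)."""
    c, h, w = feat.shape
    rows = -(-c // cols)  # ceiling division: number of tile rows
    padded = np.zeros((rows * cols, h, w), dtype=feat.dtype)
    padded[:c] = feat     # unused tiles stay zero
    return (padded.reshape(rows, cols, h, w)
                  .transpose(0, 2, 1, 3)
                  .reshape(rows * h, cols * w))

def unpack_channels(frame, c, h, w, cols):
    """Inverse of pack_channels: recover the (C, H, W) tensor from the frame."""
    rows = frame.shape[0] // h
    tiles = (frame.reshape(rows, h, cols, w)
                  .transpose(0, 2, 1, 3)
                  .reshape(rows * cols, h, w))
    return tiles[:c]

def quantize(frame, bits=10):
    """Min-max uniform quantization to unsigned integers (assumed scheme)."""
    lo, hi = float(frame.min()), float(frame.max())
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = np.round((frame - lo) / scale).astype(np.uint16)
    return q, lo, scale

def dequantize(q, lo, scale):
    """Map integer samples back to floating-point feature values."""
    return q.astype(np.float32) * scale + lo

# Hypothetical usage: 6 channels of 4x5 features packed 4 tiles per row.
feat = np.random.default_rng(0).standard_normal((6, 4, 5)).astype(np.float32)
frame = pack_channels(feat, cols=4)   # single-plane frame (luma only)
q, lo, scale = quantize(frame)        # integer samples for a video encoder
restored = unpack_channels(dequantize(q, lo, scale), 6, 4, 5, cols=4)
```

In this sketch the quantized frame is what would be handed to the video codec as a luma-only (YUV 4:0:0) picture, and `lo` and `scale` would need to travel as side information so the decoder side can invert the mapping.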

Abstract

As consumer devices become increasingly intelligent and interconnected, efficient data transfer solutions for machine tasks have become essential. This paper presents an overview of the latest Feature Coding for Machines (FCM) standard, part of MPEG-AI and developed by the Moving Picture Experts Group (MPEG). FCM supports AI-driven applications by enabling the efficient extraction, compression, and transmission of intermediate neural network features. By offloading computationally intensive operations to base servers with high computing resources, FCM allows low-powered devices to leverage large deep learning models. Experimental results indicate that the FCM standard maintains the same level of accuracy while reducing bitrate requirements by 75.90% compared to remote inference.

Paper Structure

This paper contains 21 sections, 5 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: (a) Remote Inference (b) Split Inference
  • Figure 2: Overview of Feature Coding for Machines (FCM)
  • Figure 3: (a) Input Image (b) Corresponding packed fused feature maps in YUV 4:0:0