MCUCoder: Adaptive Bitrate Learned Video Compression for IoT Devices

Ali Hojjat, Janek Haberer, Olaf Landsiedel

TL;DR

The paper addresses efficient video compression for IoT edge devices with severe memory and bandwidth constraints. It proposes MCUCoder, an ultra-lightweight adaptive bitrate encoder that orders latent channels by importance via biased dropout training and transmits only the first k channels, with k chosen according to the available bandwidth. The model is paired with INT8 quantization for CMSIS-NN acceleration. Relative to M-JPEG, the approach reduces bitrate by 55.65% on MCL-JCV and 55.59% on UVG (BD-rate, measured in MS-SSIM) at comparable energy consumption, and it supports progressive bitrate streaming suitable for fluctuating networks. The work demonstrates the feasibility of real-time, adaptive video transmission on MCU-class hardware and provides open-source code for deployment on edge devices.
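The idea behind biased dropout can be illustrated with a small sketch: at each training step a cutoff k is drawn and every channel after the first k is zeroed, so early channels survive far more often than late ones and are pushed to carry the most important information. The function names, the uniform choice of k, and the channel count of 12 are assumptions made for illustration; the paper's actual sampling scheme may differ.

```python
import random

def tail_dropout_mask(num_channels, rng):
    """Keep the first k channels and drop the rest, with k drawn at random.

    Assumption: k is uniform on 1..num_channels (illustrative only).
    """
    k = rng.randint(1, num_channels)  # randint is inclusive on both ends
    return [1.0] * k + [0.0] * (num_channels - k)

def keep_frequency(num_channels, trials, seed=0):
    """Estimate how often each channel survives across many training steps."""
    rng = random.Random(seed)
    counts = [0] * num_channels
    for _ in range(trials):
        mask = tail_dropout_mask(num_channels, rng)
        for i, m in enumerate(mask):
            counts[i] += int(m)
    return [c / trials for c in counts]

# Hypothetical setting: 12 latent channels, 10,000 simulated steps.
freq = keep_frequency(num_channels=12, trials=10_000)
for i, f in enumerate(freq):
    print(f"channel {i:2d} kept in {f:.2%} of steps")
```

Because every mask is a prefix of ones, the survival frequency can only decrease with the channel index, which is the bias that produces an importance ordering.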

Abstract

The rapid growth of camera-based IoT devices demands efficient video compression, particularly for edge applications where devices face hardware constraints, often with only 1 or 2 MB of RAM and unstable internet connections. Traditional and deep video compression methods are designed for high-end hardware and exceed the capabilities of these constrained devices. Consequently, video compression in these scenarios is often limited to M-JPEG due to its high hardware efficiency and low complexity. This paper introduces MCUCoder, an open-source adaptive bitrate video compression model tailored for resource-limited IoT settings. MCUCoder features an ultra-lightweight encoder with only 10.5K parameters and a minimal 350KB memory footprint, making it well-suited for edge devices and MCUs. While MCUCoder uses a similar amount of energy as M-JPEG, it reduces bitrate by 55.65% on the MCL-JCV dataset and 55.59% on the UVG dataset, measured in MS-SSIM. Moreover, MCUCoder supports adaptive bitrate streaming by generating a latent representation that is sorted by importance, allowing transmission based on available bandwidth. This ensures smooth real-time video transmission even under fluctuating network conditions on low-resource devices. Source code available at https://github.com/ds-kiel/MCUCoder.
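Since the latent is sorted by importance, adapting to bandwidth reduces to choosing how many leading channels to send for each frame. The following is a minimal sketch of that selection, not the authors' implementation: `channels_for_budget`, the per-channel sizes, and the frame rate are all hypothetical.

```python
def channels_for_budget(channel_bits, budget_bits):
    """Return how many leading channels fit within budget_bits.

    channel_bits[i] is the encoded size of channel i in bits (hypothetical
    input; a real encoder would measure this after quantization and coding).
    """
    used = 0
    k = 0
    for bits in channel_bits:
        if used + bits > budget_bits:
            break  # channels are ordered, so stop at the first that does not fit
        used += bits
        k += 1
    return k

# Hypothetical numbers: 12 channels of 2,000 bits each, streamed at 10 fps.
channel_bits = [2000] * 12
fps = 10
for kbps in (30, 120, 400):
    budget = kbps * 1000 // fps  # bits available per frame
    k = channels_for_budget(channel_bits, budget)
    print(f"{kbps} kbit/s -> send first {k} channels")
```

With these illustrative numbers the sender falls back to a single channel on the slowest link and sends all twelve on the fastest, without re-encoding the frame.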

Paper Structure

This paper contains 10 sections, 1 equation, 15 figures, 2 tables.

Figures (15)

  • Figure 1: Qualitative comparison of MCUCoder and M-JPEG across various compression rates on two videos from the MCL-JCV [mcl_jcv] and UVG [mercat2020uvg] datasets. MCUCoder offers a significantly better MS-SSIM/bpp trade-off. For instance, at 0.15 bpp in the left example, the person's face is visible with MCUCoder, whereas M-JPEG needs at least 0.34 bpp to make out the face. Note that the images in each column do not necessarily have the same bitrate. More examples are reported in Appendix D.
  • Figure 1: MCUCoder (Quantized) BD-rate results. The anchor is M-JPEG.
  • Figure 2: Number of parameters of MCUCoder and other learned image compression models [balle2018variational, toderici2017full, lee2022dpict, jeon2023context, Liu_2023_CVPR, xie2021enhanced, zhu2022transformer] and video compression models [agustsson2020scale, lu2019dvc].
  • Figure 3: Overview of MCUCoder architecture. The encoder compresses the input frame into a sorted latent space. Afterward, channels are independently quantized and transmitted based on available bandwidth. The decoder reconstructs the frame by zeroing out missing channels.
  • Figure 4: MCUCoder latent channels: Early channels (important ones) capture low-frequency features, while later channels capture high-frequency features, similar to the DCT in JPEG.
  • ...and 10 more figures
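
The pipeline described in the Figure 3 caption (independent per-channel quantization, partial transmission, zero-filling on the decoder side) can be sketched as follows. This is an illustrative stand-in under stated assumptions: it uses a generic symmetric INT8 scheme with one scale per channel, and the helper names `quantize_int8`, `dequantize`, and `receive` are hypothetical rather than taken from the MCUCoder code.

```python
def quantize_int8(channel):
    """Symmetric per-channel INT8 quantization: returns (int values, scale)."""
    peak = max(abs(v) for v in channel)
    scale = peak / 127 if peak > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in channel]
    return q, scale

def dequantize(q, scale):
    """Map INT8 values back to floats using the channel's scale."""
    return [v * scale for v in q]

def receive(packets, num_channels, channel_len):
    """Rebuild the full latent, zero-filling channels that were not sent."""
    latent = [[0.0] * channel_len for _ in range(num_channels)]
    for i, (q, scale) in enumerate(packets):
        latent[i] = dequantize(q, scale)
    return latent

# Toy latent with 3 channels of 3 values each (made-up numbers).
latent = [[0.5, -1.0, 0.25], [0.1, 0.0, -0.2], [0.01, 0.02, -0.03]]
packets = [quantize_int8(ch) for ch in latent[:2]]  # only the first k=2 are sent
restored = receive(packets, num_channels=3, channel_len=3)
print(restored)
```

Quantizing each channel on its own means a dropped channel never affects how the transmitted ones are decoded, which is what lets the decoder simply substitute zeros for anything missing.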