TinySeg: Model Optimizing Framework for Image Segmentation on Tiny Embedded Systems

Byungchul Chae, Jiae Kim, Seonyeong Heo

TL;DR

TinySeg addresses the memory bottleneck of image segmentation on tiny embedded systems with a model optimizing framework composed of a Cold Range Analyzer and a Graph Transformer, which together perform tensor spilling, fetching, and fusion. The accompanying runtime implements dynamic tensor compression, asynchronous block operations, and temporary tensor quantization to minimize peak memory and data-transfer overhead. Empirical results show up to a 39.3% reduction in peak memory usage on a Tiny U-Net model. Latency overhead varies with the spill option, but deployment on a device with 1 MB of memory remains feasible. The work argues that memory-aware optimization is essential for enabling smarter, low-power embedded segmentation and demonstrates practical integration with TensorFlow Lite for Microcontrollers. The framework can be combined with existing model compression techniques and with different storage options, which broadens its applicability.

Abstract

Image segmentation is one of the major computer vision tasks, which is applicable in a variety of domains, such as autonomous navigation of an unmanned aerial vehicle. However, image segmentation cannot easily materialize on tiny embedded systems because image segmentation models generally have high peak memory usage due to their architectural characteristics. This work finds that image segmentation models unnecessarily require large memory space with an existing tiny machine learning framework. That is, the existing framework cannot effectively manage the memory space for the image segmentation models. This work proposes TinySeg, a new model optimizing framework that enables memory-efficient image segmentation for tiny embedded systems. TinySeg analyzes the lifetimes of tensors in the target model and identifies long-living tensors. Then, TinySeg optimizes the memory usage of the target model mainly with two methods: (i) tensor spilling into local or remote storage and (ii) fused fetching of spilled tensors. This work implements TinySeg on top of the existing tiny machine learning framework and demonstrates that TinySeg can reduce the peak memory usage of an image segmentation model by 39.3% for tiny embedded systems.
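
The idea of finding long-living tensors and spilling them can be illustrated with a toy calculation. The sketch below is not the paper's Cold Range Analyzer: the `Tensor` class, the helper functions, and the tensor sizes are all hypothetical, and the model ignores transfer buffers, latency, and storage limits. It only shows why a skip connection that sits idle between an encoder and a decoder can inflate peak memory, and how removing it from memory during that idle span lowers the peak.

```python
from dataclasses import dataclass


@dataclass
class Tensor:
    name: str
    size: int            # bytes occupied while resident in memory
    produced_at: int     # index of the operation that writes the tensor
    used_at: list[int]   # indices of the operations that read the tensor


def live_sizes(tensors, num_ops, spilled=frozenset()):
    """Return the memory in use at each operation (toy model)."""
    usage = [0] * num_ops
    for t in tensors:
        if t.name in spilled:
            # Assumption: a spilled tensor is resident only at the steps
            # that produce or consume it, and lives in storage otherwise.
            steps = {t.produced_at, *t.used_at}
        else:
            # Otherwise it stays resident from production to last use.
            steps = range(t.produced_at, max(t.used_at) + 1)
        for s in steps:
            usage[s] += t.size
    return usage


def longest_idle_gap(t):
    """Longest run of operations between two consecutive accesses."""
    points = sorted([t.produced_at, *t.used_at])
    return max(b - a for a, b in zip(points, points[1:]))


# Hypothetical U-Net-like schedule: "skip" is produced early and is
# needed again only by the final concatenation in the decoder.
tensors = [
    Tensor("skip", 100, 0, [1, 5]),
    Tensor("a", 50, 1, [2]),
    Tensor("b", 120, 2, [3]),
    Tensor("c", 120, 3, [4]),
    Tensor("d", 50, 4, [5]),
    Tensor("out", 100, 5, [5]),
]

coldest = max(tensors, key=longest_idle_gap)
baseline_peak = max(live_sizes(tensors, 6))
spilled_peak = max(live_sizes(tensors, 6, spilled={coldest.name}))

print(coldest.name, baseline_peak, spilled_peak)  # skip 340 250
```

In this toy schedule the peak occurs at operation 3, where the skip tensor is resident but unused, so spilling it moves the peak to the final concatenation. The numbers are illustrative only and are unrelated to the 39.3% figure reported in the paper.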

Paper Structure

This paper contains 25 sections, 2 equations, 15 figures, 5 tables, 3 algorithms.

Figures (15)

  • Figure 1: U-Net Network Architecture (Ronneberger et al., 2015). Sample Images are Taken from the Cityscapes Dataset (Cordts et al., 2016).
  • Figure 2: Neural Network Model in a Graph Representation.
  • Figure 3: Memory Usage of an Image Segmentation Model.
  • Figure 4: Overview of the TinySeg Optimizing Framework.
  • Figure 5: Example of Cold Range Analysis.
  • ...and 10 more figures