YOLIC: An Efficient Method for Object Localization and Classification on Edge Devices

Kai Su, Yoichi Tomioka, Qiangfu Zhao, Yong Liu

TL;DR

YOLIC is a bounding-box-free object localization and classification method built on Cells of Interest (CoIs) and optimized for edge devices, blending semantic segmentation with lightweight detection. By using predefined CoIs and a multi-label head per CoI, it delivers real-time performance with competitive accuracy while avoiding bounding-box regression and NMS. The approach is validated on outdoor, indoor, and urban datasets, including Cityscapes, demonstrating robust generalization and substantial speed advantages on Raspberry Pi hardware, with quantization-aware training further improving efficiency. The work highlights practical benefits for IoT and autonomous systems, offering configurable CoI layouts, transferable backbones, and publicly available resources to facilitate deployment and reproduction.

Abstract

In the realm of Tiny AI, we introduce "You Only Look at Interested Cells" (YOLIC), an efficient method for object localization and classification on edge devices. By blending the strengths of semantic segmentation and object detection, YOLIC offers superior computational efficiency and precision. By classifying Cells of Interest instead of individual pixels, YOLIC encapsulates relevant information, reduces computational load, and enables rough object shape inference. Importantly, bounding box regression is not needed, because the predetermined cell configuration already provides information about potential object location, size, and shape. To overcome the limitations of single-label classification, a multi-label classification approach is applied to each cell so that overlapping or closely situated objects can be recognized. This paper presents extensive experiments on multiple datasets to demonstrate that YOLIC achieves detection performance comparable to the state-of-the-art YOLO algorithms while surpassing them in speed, exceeding 30 fps on a Raspberry Pi 4B CPU. All resources related to this study, including datasets, cell designer, image annotation tool, and source code, have been made publicly available on our project website at https://kai3316.github.io/yolic.github.io
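
As a rough illustration of the per-cell multi-label idea described in the abstract, the sketch below decodes a flat vector of logits into the class labels active in each Cell of Interest. The function name `decode_cells`, the logit layout (cells by classes, row-major), the 0.5 threshold, and the class names are all assumptions made for illustration; this is not the authors' implementation.

```python
import math
from typing import Dict, List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function, applied independently to each logit."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_cells(
    logits: Sequence[float],
    num_cells: int,
    class_names: Sequence[str],
    threshold: float = 0.5,
) -> Dict[int, List[str]]:
    """Map a flat logit vector to the labels active in each cell.

    Assumed (hypothetical) layout: logits[c * K + k] scores class k in cell c.
    Each class gets its own sigmoid, so one cell may carry several labels,
    and no box regression or NMS step is involved.
    """
    k = len(class_names)
    if len(logits) != num_cells * k:
        raise ValueError("expected num_cells * len(class_names) logits")
    result: Dict[int, List[str]] = {}
    for cell in range(num_cells):
        scores = logits[cell * k:(cell + 1) * k]
        labels = [name for name, s in zip(class_names, scores)
                  if sigmoid(s) >= threshold]
        if labels:  # cells with no active class are omitted
            result[cell] = labels
    return result


# Toy example: 3 cells, 2 hypothetical classes.
example = decode_cells(
    [2.0, -3.0,    # cell 0: "bump" only
     1.5, 0.7,     # cell 1: both classes overlap in this cell
     -4.0, -2.0],  # cell 2: nothing detected
    num_cells=3,
    class_names=["bump", "person"],
)
print(example)  # {0: ['bump'], 1: ['bump', 'person']}
```

Because the cell positions are fixed in advance, the cell index alone says where an object is; the network only has to decide which classes are present in each cell.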

Paper Structure (19 sections, 2 equations, 9 figures, 9 tables)

Figures (9)

  • Figure 1: YOLIC's adaptability is showcased in diverse applications such as intelligent driving, industrial manufacturing, and smart parking, where it identifies objects within various cells of interest. Unlike conventional object detection algorithms, which laboriously search the entire image for objects, the proposed method passively waits for an object to appear in the predefined cells of interest. This flexibility enables customized cell configurations for unique scenarios, allowing precise object detection and analysis across different tasks.
  • Figure 2: A detailed diagram of the YOLIC model, showcasing its two main components: the feature extraction module and the multi-label classification head. The figure also demonstrates the output of the model, highlighting the position of each CoI on an input image. The overall structure emphasizes YOLIC's lightweight and efficient design, which makes it well-suited for edge devices with limited computational resources.
  • Figure 3: Cell configuration designed for outdoor risk detection on electric scooters, illustrating the distribution of 96 CoIs for road hazards within a 0-6 meter range and eight additional CoIs for traffic sign localization.
  • Figure 4: We compare the detection results of our proposed models with those of YOLO by visualizing the output on sample outdoor images. The four models with the highest F1-scores are selected for the comparison.
  • Figure 5: Cell configuration for indoor obstacle detection experiment, showcasing 30 irregularly-shaped CoIs distributed across the video frame.
  • ...and 4 more figures
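
The cell configurations in the captions above can be pictured as a fixed list of regions defined before training. The sketch below builds such a list and reproduces only the counts given for Figure 3 (96 road-hazard CoIs plus eight traffic-sign CoIs). The `Cell` type, the `grid_cells` helper, the rectangular 8 x 12 tiling, and the coordinates are hypothetical; the actual layout in the paper may be arranged differently, and Figure 5 uses irregular shapes.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in normalized image coordinates


@dataclass(frozen=True)
class Cell:
    """One predefined Cell of Interest: a fixed polygon with a purpose tag."""
    polygon: Tuple[Point, ...]
    group: str  # e.g. "road" or "sign"; these tags are hypothetical


def grid_cells(rows: int, cols: int,
               box: Tuple[float, float, float, float],
               group: str) -> List[Cell]:
    """Tile the region (x0, y0, x1, y1) into rows * cols rectangular cells."""
    x0, y0, x1, y1 = box
    w, h = (x1 - x0) / cols, (y1 - y0) / rows
    cells = []
    for r in range(rows):
        for c in range(cols):
            left, top = x0 + c * w, y0 + r * h
            corners = ((left, top), (left + w, top),
                       (left + w, top + h), (left, top + h))
            cells.append(Cell(corners, group))
    return cells


# Hypothetical layout matching only the counts in the Figure 3 caption:
# 96 road-hazard cells plus 8 traffic-sign cells.
layout = (grid_cells(8, 12, (0.0, 0.4, 1.0, 1.0), "road")
          + grid_cells(1, 8, (0.0, 0.0, 1.0, 0.2), "sign"))

print(len(layout))  # 104
```

A detected class in cell `i` is then localized by looking up `layout[i].polygon`, which is how a fixed configuration can stand in for box regression.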