Mondrian: On-Device High-Performance Video Analytics with Compressive Packed Inference

Changmin Jeon; Seonjun Kim; Juheon Yi; Youngki Lee

TL;DR

Mondrian is an edge system that enables high-performance object detection on high-resolution video streams. It introduces Compressive Packed Inference, which minimizes per-pixel processing costs by selectively determining the pixels that need processing and combining them to maximize processing parallelism.

Abstract

In this paper, we present Mondrian, an edge system that enables high-performance object detection on high-resolution video streams. Many lightweight models and system optimization techniques have been proposed for resource-constrained devices, but they do not fully utilize the potential of accelerators on dynamic, high-resolution videos. To enable such capability, we devise a novel Compressive Packed Inference that minimizes per-pixel processing costs by selectively determining the necessary pixels to process and combining them to maximize processing parallelism. In particular, our system quickly extracts ROIs and dynamically shrinks them, reflecting the fast-changing characteristics of objects and scenes. It then combines the scaled ROIs into large canvases to maximize the utilization of inference accelerators such as GPUs. Evaluation across various datasets, models, and devices shows that Mondrian outperforms state-of-the-art baselines (e.g., input rescaling, ROI extraction, ROI extraction+batching) by 15.0-19.7% in accuracy, leading to 6.65$\times$ higher throughput than frame-wise inference when processing various 1080p video streams. We will release the code after the paper review.
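The scale-then-pack step described in the abstract can be illustrated with a minimal sketch. This is not Mondrian's packing algorithm, which the abstract does not specify; it substitutes a simple shelf (row-based) packing heuristic. The names `ROI`, `Placement`, `scaled_size`, and `pack_rois`, and the per-ROI `scale` field, are hypothetical and introduced only for illustration. The 1280-pixel canvas side follows the Figure 1 caption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ROI:
    """A region of interest cropped from a frame (hypothetical structure)."""
    frame_id: int
    w: int        # original width in pixels
    h: int        # original height in pixels
    scale: float  # assumed per-ROI shrink factor in (0, 1]


@dataclass
class Placement:
    """Where a scaled ROI lands on the canvas."""
    roi: ROI
    x: int
    y: int
    w: int
    h: int


def scaled_size(roi: ROI) -> Tuple[int, int]:
    """Size of the ROI after shrinking, never smaller than 1x1."""
    return max(1, round(roi.w * roi.scale)), max(1, round(roi.h * roi.scale))


def pack_rois(rois: List[ROI], canvas: int = 1280) -> Tuple[List[Placement], List[ROI]]:
    """Shelf packing: tallest ROIs first, filling rows from left to right.

    Returns the placements plus the ROIs that did not fit on this canvas.
    """
    placed: List[Placement] = []
    leftover: List[ROI] = []
    x = y = shelf_h = 0
    for roi in sorted(rois, key=lambda r: scaled_size(r)[1], reverse=True):
        w, h = scaled_size(roi)
        if w > canvas or h > canvas:
            leftover.append(roi)  # can never fit on a canvas of this size
            continue
        if x + w > canvas:        # current row is full: open a new shelf below
            y += shelf_h
            x = 0
            shelf_h = 0
        if y + h > canvas:        # no vertical room left for this ROI
            leftover.append(roi)
            continue
        placed.append(Placement(roi, x, y, w, h))
        x += w
        shelf_h = max(shelf_h, h)
    return placed, leftover
```

Calling `pack_rois(rois)` returns the canvas placements and any ROIs left over for a second canvas. Shelf packing is chosen here only because it is short and easy to check; a real system would likely weigh packing density against packing latency.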

Paper Structure (32 sections, 1 equation, 17 figures, 4 tables)

Introduction
Motivation
Target Scenarios
Design Goals
Limitations of Prior Works
Lightweight Object Detectors
System Optimization Techniques
Mondrian Overview
Approach
Challenges
System Architecture
Mondrian Inference Pipeline
ROI Extractor
Hybrid ROI Scale Estimator
Operational Flow
...and 17 more sections

Figures (17)

Figure 1: Concept of Mondrian's Compressive Packed Inference. We extract ROIs from 20 FHD frames, scale and pack them into a single 1280$\times$1280 canvas without accuracy drop.
Figure 2: Example scenario of Mondrian: a four-way surveillance camera capturing a crowded public square.
Figure 3: Effect of input size on processing throughput. Pixel throughput is the number of pixels processed per second.
Figure 4: Overview of Compressive Packed Inference.
Figure 5: Motivational study on the spatio-temporal variation of Safe area (MTA dataset).
...and 12 more figures
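As a worked illustration of the Figure 1 caption, the arithmetic below compares the raw pixel count of 20 FHD frames with that of a single 1280$\times$1280 canvas. It only counts input pixels; it is not a throughput measurement and does not reproduce the 6.65$\times$ figure reported in the abstract. The function name `pixel_reduction` is hypothetical.

```python
FHD_W, FHD_H = 1920, 1080   # FHD (1080p) frame size
NUM_FRAMES = 20             # frames packed per canvas, per the Figure 1 caption
CANVAS_SIDE = 1280          # canvas is 1280 x 1280, per the Figure 1 caption


def pixel_reduction(num_frames: int, frame_w: int, frame_h: int, canvas_side: int):
    """Return (raw pixels, canvas pixels, raw/canvas ratio)."""
    raw = num_frames * frame_w * frame_h
    packed = canvas_side * canvas_side
    return raw, packed, raw / packed


raw, packed, ratio = pixel_reduction(NUM_FRAMES, FHD_W, FHD_H, CANVAS_SIDE)
print(f"raw pixels:    {raw:,}")       # 41,472,000
print(f"canvas pixels: {packed:,}")    # 1,638,400
print(f"reduction:     {ratio:.2f}x")  # 25.31x
```

Under these numbers the detector sees roughly 25 times fewer input pixels than frame-wise inference over the same 20 frames.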
