
Covariance-Guided Resource Adaptive Learning for Efficient Edge Inference

Ahmad N. L. Nabhaan, Zaki Sukma, Rakandhiya D. Rachmanto, Muhammad Husni Santriaji, Byungjin Cho, Arief Setyanto, In Kee Kim

Abstract

For deep learning inference on edge devices, hardware configurations achieving the same throughput can differ by 2$\times$ in power consumption, yet operators often struggle to find the efficient ones without exhaustive profiling. Existing approaches often rely on inefficient static presets or require expensive offline profiling that must be repeated for each new model or device. To address this problem, we present CORAL, an online optimization method that discovers near-optimal configurations without offline profiling. CORAL leverages distance covariance to statistically capture the non-linear dependencies between hardware settings, e.g., DVFS and concurrency levels, and performance metrics. Unlike prior work, we explicitly formulate the challenge as a throughput-power co-optimization problem to satisfy power budgets and throughput targets simultaneously. We evaluate CORAL on two NVIDIA Jetson devices across three object detection models ranging from lightweight to heavyweight. In single-target scenarios, CORAL achieves 96%–100% of the optimal performance found by exhaustive search. In strict dual-constraint scenarios where baselines fail or exceed power budgets, CORAL consistently finds proper configurations online with minimal exploration.
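To illustrate why distance covariance suits non-linear dependencies, the following is a minimal sketch of the standard sample distance correlation (the normalized form of distance covariance), written with NumPy. It is not CORAL's implementation: the function names `_centered_dist` and `distance_correlation` and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np


def _centered_dist(v):
    """Pairwise Euclidean distance matrix of v, double-centered."""
    v = np.asarray(v, dtype=float)
    v = v.reshape(len(v), -1)  # treat 1-D input as n samples of dimension 1
    d = np.linalg.norm(v[:, None, :] - v[None, :, :], axis=-1)
    # Subtract row and column means, then add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation in [0, 1] (hypothetical helper, not from the paper)."""
    a, b = _centered_dist(x), _centered_dist(y)
    dcov2 = (a * b).mean()   # squared sample distance covariance
    dvar_x = (a * a).mean()  # squared sample distance variance of x
    dvar_y = (b * b).mean()  # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Synthetic example (not measured data): a symmetric quadratic relationship.
x = np.linspace(-1.0, 1.0, 41)
y_quadratic = x ** 2
y_linear = 2.0 * x + 1.0

pearson_quadratic = float(np.corrcoef(x, y_quadratic)[0, 1])
dcor_quadratic = distance_correlation(x, y_quadratic)
dcor_linear = distance_correlation(x, y_linear)
```

On this synthetic data the Pearson correlation between `x` and `x ** 2` is essentially zero, while the distance correlation is clearly positive. A dependence of this kind, for instance a setting whose effect on a metric is not monotonic, would be missed by a purely linear statistic.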

Paper Structure

This paper contains 19 sections, 10 equations, 10 figures, 4 tables, 2 algorithms.

Figures (10)

  • Figure 1: Power-throughput trade-off of YOLOv5-N on Xavier NX and Orin Nano. The red box (Xavier NX) highlights configurations with similar throughput but varying power consumption; the blue box (Orin Nano) shows configurations achieving different throughputs at similar power levels.
  • Figure 2: Experimental evaluation loop for CORAL and baselines.
  • Figure 3: Power-throughput trade-off in single-constraint scenario (YOLO).
  • Figure 4: Single-constraint (throughput) comparison results (YOLO).
  • Figure 5: Dual-constraint optimization results (YOLO).
  • ...and 5 more figures