Profiling AI Models: Towards Efficient Computation Offloading in Heterogeneous Edge AI Systems

Juan Marcelo Parra-Ullauri, Oscar Dilley, Hari Madhukumar, Dimitra Simeonidou

TL;DR

This work proposes a research roadmap for profiling AI models: capturing data about model types, hyperparameters, and underlying hardware to predict resource utilisation and task completion time. Initial experiments show promise in optimising resource allocation and enhancing Edge AI performance.

Abstract

The rapid growth of end-user AI applications, such as computer vision and generative AI, has led to immense data and processing demands that often exceed the capabilities of user devices. Edge AI addresses this by offloading computation to the network edge, a capability crucial for future services in 6G networks. However, Edge AI faces challenges such as limited resources during simultaneous offloads and the unrealistic assumption of a homogeneous system architecture. To address these, we propose a research roadmap focused on profiling AI models, capturing data about model types, hyperparameters, and underlying hardware to predict resource utilisation and task completion time. Initial experiments with over 3,000 runs show promise in optimising resource allocation and enhancing Edge AI performance.

Paper Structure

This paper contains 10 sections, 3 figures, 1 table.

Figures (3)

  • Figure 1: Research roadmap for profiling-based computation offloading.
  • Figure 2: Comparing error performance of different models for AI profiling.
  • Figure 3: Demonstrating the performance of the XGBoost model (max depth = 12, subsample = 0.8) in predicting FLOPS, MACs, and total time.