Building AI Service Repositories for On-Demand Service Orchestration in 6G AI-RAN

Yun Tang, Mengbang Zou, Udhaya Chandhar Srinivasan, Obumneme Umealor, Dennis Kevogo, Benjamin James Scott, Weisi Guo

TL;DR

The paper addresses the challenge of orchestrating AI services in 6G AI-RAN by proposing a comprehensive, open-source, LLM-assisted toolchain to automate packaging, deployment, and runtime profiling of AI services. It provides a taxonomy of orchestration attributes (Functionality, Resource, Latency, Flexibility, Trustworthiness, Billing) and demonstrates how the toolchain extracts metadata, generates deployment-ready code, and profiles runtime behavior. The Cranfield AI Service Repository serves as a proof-of-concept, showing significant reductions in manual coding and the need for infrastructure-aware profiling. The work enables more practical, scalable on-demand AI service orchestration in 6G environments and lays groundwork for future extensions including safety, security, privacy, and federated AI workflows.
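As an illustration of how the six attribute categories might be used at orchestration time, the sketch below stores one record per service and filters candidates against a runtime context. It is a minimal sketch, not the paper's toolchain: the class `ServiceProfile`, the function `select`, and all field names and values are hypothetical, and only the six category names come from the taxonomy above.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceProfile:
    """Hypothetical metadata record; one field per taxonomy category."""
    name: str
    functionality: str                                    # Functionality: task the service performs
    resource: dict = field(default_factory=dict)          # Resource: e.g. {"gpu": True}
    latency_ms: float = 0.0                               # Latency: profiled response time
    flexibility: list = field(default_factory=list)       # Flexibility: e.g. supported input formats
    trustworthiness: list = field(default_factory=list)   # Trustworthiness: e.g. compatible XAI techniques
    billing_per_call: float = 0.0                         # Billing: assumed per-call cost


def select(profiles, task, max_latency_ms, gpu_available):
    """Return names of services that fit the task and constraints, fastest first."""
    candidates = [
        p for p in profiles
        if p.functionality == task
        and p.latency_ms <= max_latency_ms
        and (gpu_available or not p.resource.get("gpu", False))
    ]
    return [p.name for p in sorted(candidates, key=lambda p: p.latency_ms)]
```

Because the latency and resource fields depend on where a service was profiled, a record like this would need to be kept per infrastructure rather than per service alone.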

Abstract

Efficient orchestration of AI services in 6G AI-RAN requires well-structured, ready-to-deploy AI service repositories combined with orchestration methods adaptive to diverse runtime contexts across radio access, edge, and cloud layers. Current literature lacks comprehensive frameworks for constructing such repositories and generally overlooks key practical orchestration factors. This paper systematically identifies and categorizes critical attributes influencing AI service orchestration in 6G networks and introduces an open-source, LLM-assisted toolchain that automates service packaging, deployment, and runtime profiling. We validate the proposed toolchain through the Cranfield AI Service Repository case study, demonstrating significant automation benefits, reduced manual coding effort, and the necessity of infrastructure-specific profiling, paving the way for more practical orchestration frameworks.

Paper Structure

This paper contains 17 sections, 3 figures, 1 table.

Figures (3)

  • Figure 1: LLM-assisted AI service repository toolchain overview.
  • Figure 2: Lines-of-code comparison for the first nine AI services (image classification) in the Cranfield AI Service Repository. The pipeline was developed using the model microsoft/resnet-50 as a reference, hence zero manual revisions are counted for that model.
  • Figure 3: Resource profile results of the first nine AI services. Panels (a-e) are profiled on a CPU-only laptop connected to Wi-Fi and panels (f-j) on an RTX A4000 GPU-powered workstation with a faster Ethernet connection. The candlestick plots for XAI response time are generated from all compatible XAI techniques for each AI service.