AI Sessions for Network-Exposed AI-as-a-Service

Mohaned Chraiti; Merve Saimler

TL;DR

This paper proposes Network-Exposed AI-as-a-Service (NE-AIaaS) built around a new service primitive: the AI Session (AIS), a contractual object that binds model identity, execution placement, transport Quality-of-Service (QoS), and consent/charging scope into a single lifecycle with explicit failure semantics.

Abstract

Cloud-based Artificial Intelligence (AI) inference is increasingly latency- and context-sensitive, yet today's AI-as-a-Service is typically consumed as an application-chosen endpoint, leaving the network to provide only best-effort transport. This decoupling prevents enforceable tail-latency guarantees, compute-aware admission control, and continuity under mobility. This paper proposes Network-Exposed AI-as-a-Service (NE-AIaaS) built around a new service primitive: the AI Session (AIS), a contractual object that binds model identity, execution placement, transport Quality-of-Service (QoS), and consent/charging scope into a single lifecycle with explicit failure semantics. We introduce the AI Service Profile (ASP), a compact contract that expresses task modality and measurable service objectives (e.g., time-to-first-response/token, p99 latency, success probability) alongside privacy and mobility constraints. On this basis, we specify protocol-grade procedures for (i) DISCOVER (model/site discovery), (ii) AI PAGING (context-aware selection of the execution anchor), (iii) two-phase PREPARE/COMMIT, which atomically co-reserves compute and QoS resources, and (iv) make-before-break MIGRATION for session continuity. The design maps onto existing standards: Common API Framework (CAPIF)-style northbound exposure, ETSI Multi-access Edge Computing (MEC) execution substrates, 5G QoS flows for transport enforcement, and Network Data Analytics Function (NWDAF)-style analytics for closed-loop paging/migration triggers.
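As a rough illustration of the two-phase PREPARE/COMMIT idea named in the abstract, the sketch below co-reserves two toy capacity pools so that a session is admitted only if both the compute and the QoS reservation succeed. It is not the paper's protocol: the `ResourcePool` class, the `admit` function, the unit counts, and the session identifiers are all hypothetical names chosen for this example.

```python
class ResourcePool:
    """Toy capacity pool with tentative holds (stands in for compute or QoS).

    Hypothetical helper for illustration; not an API from the paper.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.held = {}       # session_id -> units tentatively held by PREPARE
        self.committed = {}  # session_id -> units fixed by COMMIT

    def free(self) -> int:
        used = sum(self.held.values()) + sum(self.committed.values())
        return self.capacity - used

    def prepare(self, session_id: str, units: int) -> bool:
        # PREPARE: place a tentative hold only if capacity allows it.
        if units > self.free():
            return False
        self.held[session_id] = units
        return True

    def commit(self, session_id: str) -> None:
        # COMMIT: turn the tentative hold into a firm reservation.
        self.committed[session_id] = self.held.pop(session_id)

    def abort(self, session_id: str) -> None:
        # Release a tentative hold; harmless if none exists.
        self.held.pop(session_id, None)


def admit(session_id, compute, qos, compute_units, qos_units) -> bool:
    """Commit both reservations or neither (all-or-nothing admission)."""
    prepared = []
    for pool, units in ((compute, compute_units), (qos, qos_units)):
        if pool.prepare(session_id, units):
            prepared.append(pool)
        else:
            # One side refused: roll back every hold taken so far.
            for p in prepared:
                p.abort(session_id)
            return False
    for pool in prepared:
        pool.commit(session_id)
    return True


# Assumed capacities: 4 compute units, 1 QoS flow.
compute = ResourcePool(capacity=4)
qos = ResourcePool(capacity=1)
first = admit("ais-1", compute, qos, compute_units=2, qos_units=1)
second = admit("ais-2", compute, qos, compute_units=2, qos_units=1)
```

In this toy run the second session is refused because the QoS pool is exhausted, and its compute hold is released rather than leaked, which is the property an atomic co-reservation is meant to provide.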

Paper Structure (19 sections, 17 equations, 4 figures, 1 table)

Introduction
Paper Outline
System Context and Design Constraints
Standard-Mappable Control Primitives
Gap to an Enforceable Contract: Requirements
AI Service Profile and AI Session Semantics
AI Service Profile: What Can Be Measured and What Is Admissible
AI Session: Binding the ASP to Enforceable Commitments
Derived Semantics: Well-Posed Admission, Falsifiable Compliance, and Continuity
NE-AIaaS Architecture and Protocol Procedures
Architecture as a Composition of Enforceable Planes
Procedures as Derived Transactions
Standardization and Deployment Roadmap
Standards Mapping and Interoperable Core
Deployment Path and Hard Open Problems
...and 4 more sections

Figures (4)

Figure 1: NE-AIaaS end-to-end workflow: discover and anchor a (model,site) binding, atomically co-reserve compute and QoS (PREPARE/COMMIT), serve, and migrate (make-before-break) under risk triggers.
Figure 2: p99 end-to-end latency vs. offered load: NE-AIaaS delays tail collapse via joint compute admission and QoS.
Figure 3: ASP violation probability vs. offered load: NE-AIaaS served-and-failed over admitted sessions.
Figure 4: Interruption probability vs. user speed: make-before-break migration preserves continuity versus teardown.

Theorems & Definitions (2)

Remark 1
Remark 2
