Towards Fair and Firm Real-Time Scheduling in DNN Multi-Tenant Multi-Accelerator Systems via Reinforcement Learning

Enrico Russo; Francesco Giulio Blanco; Maurizio Palesi; Giuseppe Ascia; Davide Patti; Vincenzo Catania

TL;DR

A novel online scheduling algorithm for Deep Neural Networks in multi-accelerator systems is proposed, with a focus on guaranteeing tenant-wise, model-specific QoS levels while considering real-time constraints.

Abstract

This paper addresses the critical challenge of managing Quality of Service (QoS) in cloud services, focusing on the nuances of individual tenant expectations and varying Service Level Indicators (SLIs). It introduces a novel approach utilizing Deep Reinforcement Learning for tenant-specific QoS management in multi-tenant, multi-accelerator cloud environments. The chosen SLI, deadline hit rate, allows clients to tailor QoS for each service request. A novel online scheduling algorithm for Deep Neural Networks in multi-accelerator systems is proposed, with a focus on guaranteeing tenant-wise, model-specific QoS levels while considering real-time constraints.
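As a rough illustration of the deadline hit rate SLI, the sketch below computes a per-tenant hit rate from completed requests and compares it with a per-tenant target. The request tuple layout, the function names, and the sample numbers are assumptions made for this example and are not taken from the paper. The margin is taken as actual minus target, which matches the sign convention in the Figure 3 caption (a positive value means the tenant's SLA was upheld).

```python
from collections import defaultdict

def deadline_hit_rates(requests):
    """Return the fraction of requests per tenant that finished by their deadline.

    `requests` is an iterable of (tenant, finish_time, deadline) tuples;
    this layout is a hypothetical one chosen for the example.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for tenant, finish_time, deadline in requests:
        totals[tenant] += 1
        if finish_time <= deadline:
            hits[tenant] += 1
    return {tenant: hits[tenant] / totals[tenant] for tenant in totals}

def slo_margins(rates, targets):
    """Actual hit rate minus target hit rate, per tenant (positive = target met)."""
    return {tenant: rates[tenant] - targets[tenant] for tenant in rates}

# Made-up sample data: tenant A misses one of two deadlines, tenant B misses none.
requests = [("A", 8, 10), ("A", 12, 10), ("B", 5, 6), ("B", 4, 6)]
targets = {"A": 0.75, "B": 0.9}

rates = deadline_hit_rates(requests)
margins = slo_margins(rates, targets)
for tenant in sorted(margins):
    print(tenant, rates[tenant], round(margins[tenant], 3))
```

Here tenant A falls short of its target while tenant B exceeds it, which is the kind of per-tenant gap a tenant-aware scheduler would aim to close.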

Paper Structure (8 sections, 3 figures)

Introduction
Related work
Problem formulation and proposed solution
Experiments
Use Case 1: Fairness
Use Case 2: Towards Firm Real Time Execution
Energy Overhead
Conclusion

Figures (3)

Figure 1: Overview of the proposed approach
Figure 2: Box plot representing the SLO Achievement Rate distribution among the tenants with different scheduling approaches.
Figure 3: Swarm plot of the differences between the target and actual SLO achievement rate for each tenant. A positive difference indicates that the tenant's SLA was upheld.
