SGPRS: Seamless GPU Partitioning Real-Time Scheduler for Periodic Deep Learning Workloads
Amir Fakhim Babaei, Thidapat Chantem
TL;DR
This work analyzes GPU speedup gain and proposes SGPRS, the first real-time GPU scheduler that supports zero-configuration partition switching. SGPRS not only meets more deadlines for parallel tasks but also sustains overall performance beyond the pivot point.
Abstract
Deep Neural Networks (DNNs) are useful in many applications, including transportation, healthcare, and speech recognition. Despite various efforts to improve accuracy, few works have studied DNNs in the context of real-time requirements. Coarse-grained resource allocation and sequential execution in existing frameworks result in resource underutilization. In this work, we analyze GPU speedup gain and propose SGPRS, the first real-time GPU scheduler that supports zero-configuration partition switching. The proposed scheduler not only meets more deadlines for parallel tasks but also sustains overall performance beyond the pivot point.
