RTGPU: Real-Time Computing with Graphics Processing Units
Atiyeh Gheibi-Fetrat, Amirsaeed Ahmadi-Tonekaboni, Farzam Koohi-Ronaghi, Pariya Hajipour, Sana Babayan-Vanestan, Fatemeh Fotouhi, Elahe Mortazavian-Farsani, Pouria Khajehpour-Dezfouli, Sepideh Safari, Shaahin Hessabi, Hamid Sarbazi-Azad
TL;DR
This work surveys the integration of GPUs into real-time and latency-sensitive systems, identifying non-preemptive execution, execution-time variability, resource contention, power, and context-switch overhead as key challenges. It categorizes scheduling strategies into explicit real-time approaches (non-preemptive and preemptive, spanning software, OS, hardware, and hybrid methods) and implicit real-time scheduling, and it reviews architectural and software techniques such as spatial multitasking, MPS, and specialized schedulers. The analysis highlights representative methods (TimeGraph, Gdev, STGM, EffiSha, HeteroSync) and discusses open directions for preemption, resource management, and energy efficiency. The findings underscore the practical importance of developing adaptive, hardware-aware strategies to make GPUs reliably predictable in safety-critical and latency-sensitive domains like autonomous systems, robotics, and automotive applications.
Abstract
In this work, we survey the role of GPUs in real-time systems. Originally designed for parallel graphics workloads, GPUs are now widely used in time-critical applications such as machine learning, autonomous vehicles, and robotics due to their high computational throughput. Their parallel architecture is well-suited for accelerating complex tasks under strict timing constraints. However, their integration into real-time systems presents several challenges, including non-preemptive execution, execution-time variability, and resource contention, all of which can lead to unpredictable delays and deadline violations. We examine existing solutions that address these challenges, including scheduling algorithms, resource management techniques, and synchronization methods, and highlight open research directions to improve GPU predictability and performance in real-time environments.
