Confidential Computing on Heterogeneous CPU-GPU Systems: Survey and Future Directions

Qifan Wang, David Oswald

TL;DR

This paper surveys confidential computing on heterogeneous CPU-GPU systems, focusing on GPU TEEs, attack surfaces, and future directions. It covers a spectrum of GPU TEE designs across x86, Arm, and vendor-enabled platforms (e.g., NVIDIA H100), analyzes their threat models, hardware modifications, TCB size, and overheads, and maps CPU TEE attacks to GPU contexts. The authors catalog architectural, microarchitectural, physical, and fault-injection attacks on GPUs, synthesize countermeasures, and discuss how design choices (e.g., memory encryption, VM-based isolation, or software-based verification) influence security and practicality. The work emphasizes the need for architecture-aware defenses, rigorous analysis of GPU memory and interconnect vulnerabilities, and continued exploration of secure GPU design to enable secure high-performance computing in cloud and data-center contexts.

Abstract

In recent years, widespread informatization and the rapid growth of data have increased the demand for high-performance heterogeneous systems that integrate multiple kinds of computing units, such as CPUs, Graphics Processing Units (GPUs), Application-Specific Integrated Circuits (ASICs), and Field-Programmable Gate Arrays (FPGAs). The combination of CPU and GPU is particularly popular due to its versatility. However, these heterogeneous systems face significant security and privacy risks. Advances in privacy-preserving techniques, especially hardware-based Trusted Execution Environments (TEEs), offer effective protection for GPU applications. Nonetheless, the security risks involved in extending TEEs to GPUs in heterogeneous systems remain unclear and need further investigation. To investigate these risks in depth, we study popular existing GPU TEE designs and summarize and compare their key implications. Additionally, we review powerful existing attacks on GPUs and on traditional TEEs deployed on CPUs, along with the efforts to mitigate these threats. We identify potential attack surfaces introduced by GPU TEEs and provide insights into key considerations for designing secure GPU TEEs. This survey is timely because new TEEs for heterogeneous systems, particularly GPUs, are being developed, which highlights the need to understand potential security threats and to build systems that are both efficient and secure.

Paper Structure

This paper contains 68 sections, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Current state of security research on heterogeneous computing systems with GPUs (over the past 15 years). GPU Crypto refers to papers that protect the privacy of GPU-based applications using cryptographic techniques such as HE and MPC. GPU attacks refers to architectural attacks (AA), microarchitectural side-channel/covert-channel attacks (MSCA/MCCA), physical side-channel attacks (PSCA), and software-based fault injection attacks (SFIA).
  • Figure 2: Timeline of significant events in secure heterogeneous computing systems.
  • Figure 3: Overview of confidential computing on heterogeneous CPU-GPU systems.
  • Figure 4: Attacker models for inferring NN models on a single GPU.
  • Figure 5: Key steps of covert-channel attacks on a single GPU.
  • ...and 3 more figures