Prime+Retouch: When Cache is Locked and Leaked

Jaehyuk Lee, Fan Sang, Taesoo Kim

TL;DR

Prime+Retouch introduces a metadata-based cache side-channel attack that bypasses traditional detection and prefetch-locking defenses by exploiting the Tree-PLRU eviction policy to infer victim access patterns without evictions. The authors demonstrate the attack on Intel x86 and Apple M1, including leakage of AES T-Table accesses under Cloak and SGX and in environments with coarse timers. They reverse-engineer Tree-PLRU on both platforms (using TSX on Intel and undocumented PMUs on M1) and develop PLRU Aware Retouch to achieve precise leakage despite synchronization challenges. The work underscores that hardware-level cache replacement policies, not just eviction events, can enable leakage and motivates revisiting defense strategies for L1 caches across architectures like x86 and Apple silicon.

Abstract

Caches on modern commodity CPUs have become one of the major sources of side-channel leakage and have been abused as a new attack vector. To thwart cache-based side-channel attacks, two types of countermeasures have been proposed: detection-based ones that limit the amount of microarchitectural traces an attacker can leave, and cache prefetching-and-locking techniques that claim to prevent such leakage by disallowing evictions of sensitive data. In this paper, we present the Prime+Retouch attack, which completely bypasses these defense schemes by accurately inferring cache activity from the metadata of the cache replacement policy. Prime+Retouch has three notable properties: 1) it incurs no eviction of the victim's data, allowing us to bypass the two known mitigation schemes; 2) it requires minimal synchronization of only one memory access to the attacker's pre-primed cache lines; and 3) it leaks data via non-shared memory, because the underlying eviction metadata is shared. We demonstrate Prime+Retouch on two architectures: the predominant Intel x86 and the emerging Apple M1. We show how Prime+Retouch can break the T-table implementation of AES protected by robust cache side-channel mitigations such as Cloak, in both normal and SGX-protected environments. We also demonstrate the feasibility of the Prime+Retouch attack on the M1 platform, which imposes more restrictions: precise measurement tools such as the core clock cycle timer and performance counters are inaccessible to the attacker. Furthermore, we are the first to demystify the undisclosed cache architecture and eviction policy of the L1 data cache on the Apple M1 architecture. We also devise a user-space, noise-free cache monitoring tool by repurposing Intel TSX.
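The core idea in the abstract, that replacement metadata alone can reveal a victim access without evicting the victim's line, can be illustrated with a toy simulation. The sketch below is not the authors' implementation: it assumes a single 8-way set, a textbook Tree-PLRU (one bit per tree node, pointing away from the most recently used side), and fill-first-empty-way allocation. The names `TreePLRUSet`, `run_round`, and the tags `A0`..`A6`, `V`, and `N` are hypothetical, and `is_cached` is a model-only inspection standing in for the timed probe a real attacker would need.

```python
class TreePLRUSet:
    """Toy model of one cache set with a textbook Tree-PLRU policy (assumed)."""

    def __init__(self, ways=8):
        assert ways >= 2 and ways & (ways - 1) == 0, "ways must be a power of two"
        self.ways = ways
        self.bits = [0] * ways      # heap-indexed tree nodes 1..ways-1 (index 0 unused)
        self.lines = [None] * ways  # tag stored in each way

    def _touch(self, way):
        # Walk from the leaf to the root, pointing every node away from `way`.
        node = way + self.ways
        while node > 1:
            parent = node // 2
            self.bits[parent] = 1 if node % 2 == 0 else 0  # 0 = left, 1 = right
            node = parent

    def eviction_candidate(self):
        # Follow the bits from the root down to the pseudo-LRU way.
        node = 1
        while node < self.ways:
            node = 2 * node + self.bits[node]
        return node - self.ways

    def access(self, tag):
        """Load `tag`; return True on a hit, False on a miss."""
        if tag in self.lines:
            way, hit = self.lines.index(tag), True
        elif None in self.lines:
            way, hit = self.lines.index(None), False  # assumption: fill empty ways first
        else:
            way, hit = self.eviction_candidate(), False
        self.lines[way] = tag
        self._touch(way)
        return hit

    def is_cached(self, tag):
        # Model-only check; does not update the replacement metadata.
        return tag in self.lines


def run_round(victim_accesses):
    """Return (attacker lines evicted by the retouch, whether V is still cached)."""
    s = TreePLRUSet(ways=8)
    primed = [f"A{i}" for i in range(7)]
    for tag in primed:          # attacker primes seven ways
        s.access(tag)
    s.access("V")               # defense prefetches the victim's line
    s.access("A0")              # single synchronizing access to a primed line
    if victim_accesses:
        assert s.access("V")    # victim access is a hit, so nothing is evicted
    s.access("N")               # new attacker line evicts the PLRU way
    evicted = [t for t in primed if not s.is_cached(t)]
    return evicted, s.is_cached("V")


if __name__ == "__main__":
    print("no victim access:", run_round(False))
    print("victim access:   ", run_round(True))
```

In this model the attacker's own evicted line differs between the two runs while `V` stays cached in both, which is the property that defeats eviction-based defenses. Real L1 caches differ in allocation order and probing cost, so the specific ways shown here should not be read as hardware behavior.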

Paper Structure

This paper contains 28 sections, 3 equations, 13 figures, 2 tables, 1 algorithm.

Figures (13)

  • Figure 1: Tree-PLRU in action. It illustrates how the internal metadata of Tree-PLRU changes on a cache miss and hit on a cache line of the same associative set.
  • Figure 2: L1 hit and miss latency for the Icestorm and Firestorm cores. The measured latency includes the memory barrier instruction's latency, which is 51 cycles on the Icestorm core and 56 cycles on the Firestorm core. The actual L1 hit latency is therefore 2 cycles and 4 cycles, and the L1 miss latency is around 12 cycles and 16 cycles, respectively. Note that each core runs at a different clock frequency.
  • Figure 3: An overview of Prime+Retouch against the L1 cache with the Tree-PLRU policy. Changes on Tree-PLRU are demonstrated following the timeline. Depending on the presence of the victim's access following the prefetching, two different PLRU cache ways will be produced.
  • Figure 4: When Retouch is applied, identical Tree-PLRU metadata is produced by two different operation sequences: synchronized (left) and unsynchronized (right).
  • Figure 5: When PLRU Aware Retouch is applied to the false-positive case in Figure 4, the attacker can distinguish the two operation sequences by probing the PLRU cache way represented in the final tree states.
  • ...and 8 more figures
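
The latency correction described in the Figure 2 caption can be written out as a short worked calculation. This assumes the barrier overhead is purely additive; the "measured" values below are implied by that assumption and the caption's numbers, not read from the figure itself.

```latex
\begin{align*}
t_{\text{measured}} &\approx t_{\text{barrier}} + t_{\text{L1}} \\[4pt]
\text{Icestorm hit:}\quad  & 51 + 2  = 53 \text{ cycles} \\
\text{Icestorm miss:}\quad & 51 + 12 = 63 \text{ cycles} \\
\text{Firestorm hit:}\quad  & 56 + 4  = 60 \text{ cycles} \\
\text{Firestorm miss:}\quad & 56 + 16 = 72 \text{ cycles} \\[4pt]
\Delta_{\text{Icestorm}} &= 12 - 2 = 10 \text{ cycles}, \qquad
\Delta_{\text{Firestorm}} = 16 - 4 = 12 \text{ cycles}
\end{align*}
```

Under this reading, the hit/miss gap that an attacker must resolve is on the order of 10 to 12 cycles, small relative to the barrier overhead itself.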