The HitchHiker's Guide to High-Assurance System Observability Protection with Efficient Permission Switches

Chuqi Zhang, Jun Zeng, Yiming Zhang, Adil Ahmad, Fengwei Zhang, Hai Jin, Zhenkai Liang

TL;DR

HitchHiker protects observability logs from a compromised operating system by securing them in memory within a configurable real-time deadline. It does so through hardware permission switching (GPT/S2PT) in a minimal two-domain design, where the HitchHiker Monitor (HkM) and Daemon (HkD) run in a protected domain separate from the OS, together with a debloated storage driver and a protected log management daemon that supports secure remote retrieval. Compared to the state of the art, the approach shortens the protection window by $93.3$--$99.3\%$ and shrinks the TCB by $9.4$--$26.9\times$, while keeping performance near native on real workloads (average overheads of roughly $1$--$10\%$, depending on configuration and workload). These results suggest practical value for secure forensics and incident response in enterprise environments, with a flexible deployment model on ARM-based systems and potential extension to other architectures.

Abstract

Protecting system observability records (logs) from compromised OSs has gained significant traction in recent times, with several noteworthy approaches proposed. Unfortunately, none of the proposed approaches achieves high performance with tiny log protection delays. They also rely on risky environments for protection (e.g., many use general-purpose hypervisors or TrustZone, which have large TCBs and attack surfaces). HitchHiker is an attempt to rectify this problem. The system is designed to ensure (a) in-memory protection of batched logs within a short and configurable real-time deadline by efficient hardware permission switching, and (b) an end-to-end high-assurance environment built upon hardware protection primitives with debloating strategies for secure log protection, persistence, and management. Security evaluations and validations show that HitchHiker reduces log protection delay by 93.3--99.3% compared to the state-of-the-art, while reducing TCB by 9.4--26.9×. Performance evaluations show HitchHiker incurs a geometric mean of less than 6% overhead on diverse real-world programs, improving on the state-of-the-art approach by 61.9--77.5%.
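To make point (a) concrete, the following is a minimal toy model of deadline-bounded batch protection. It is only an illustration under stated assumptions, not HitchHiker's implementation: the class `LogBatcher`, its fields, and the timestamps are hypothetical, sealing a batch into an immutable tuple merely stands in for the hardware permission switch, and the deadline is checked lazily on each append rather than by a real timer.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LogBatcher:
    """Toy model: logs accumulate in a writable batch that is sealed every T_p."""
    deadline_us: int                 # protection timer T_p, in microseconds (assumed unit)
    batch_start_us: int = 0          # start time of the current open batch
    open_batch: List[str] = field(default_factory=list)
    sealed: List[Tuple[str, ...]] = field(default_factory=list)

    def _seal_if_due(self, now_us: int) -> None:
        # Close every deadline window that has elapsed. Converting the batch
        # to a tuple stands in for revoking the OS domain's write permission.
        while now_us - self.batch_start_us >= self.deadline_us:
            if self.open_batch:
                self.sealed.append(tuple(self.open_batch))
                self.open_batch = []
            self.batch_start_us += self.deadline_us

    def append(self, now_us: int, record: str) -> None:
        # Hypothetical entry point: seal overdue batches, then log the record.
        self._seal_if_due(now_us)
        self.open_batch.append(record)

    def exposed(self) -> List[str]:
        # Records still writable, i.e. at risk if the OS is compromised now.
        return list(self.open_batch)


batcher = LogBatcher(deadline_us=500)
for t, rec in [(0, "open"), (120, "read"), (510, "write"), (1300, "close")]:
    batcher.append(t, rec)

print(batcher.sealed)     # batches already protected
print(batcher.exposed())  # records still inside the current window
```

In this model, a shorter `deadline_us` shrinks the set returned by `exposed()`, which mirrors the trade-off the abstract describes between protection delay and the cost of switching permissions more often.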

Paper Structure (34 sections, 5 figures, 7 tables)

This paper contains 34 sections, 5 figures, 7 tables.

Figures (5)

  • Figure 1: Overviews of (a) conventional observability systems and (b-d) observability protection systems, with trusted components marked.
  • Figure 2: HitchHiker memory layout and permission views (left) and workflow (right). The OS and applications are inside the OS domain (UT). The HitchHiker Monitor (HkM) and Daemon (HkD) are inside the protected domain (TR). The memory views of the two domains are maintained by separate hardware primitive configurations (i.e., two S2PTs/GPTs).
  • Figure 3: CDF of HitchHiker's actual in-memory protection window (delay) under different protection timer $T_p$ deadline settings ($1ms$, $500\mu s$, and $100\mu s$).
  • Figure 4: Overhead of HitchHiker on LMBench [larry1996lmbench].
  • Figure 5: Real-world workload overheads. From left to right, the log throughput for each program is: 1907, 589, 32358, 7234, 31434; 10999, 78776, 78537, 93928, 101962 logs/sec.