The HitchHiker's Guide to High-Assurance System Observability Protection with Efficient Permission Switches

Chuqi Zhang; Jun Zeng; Yiming Zhang; Adil Ahmad; Fengwei Zhang; Hai Jin; Zhenkai Liang

TL;DR

HitchHiker tackles the problem of protecting observability logs from compromised operating systems by delivering an in-memory protection system with real-time, configurable deadlines. It achieves this through hardware permission switching (GPT/S2PT) within a minimal two-domain design (HkM/HkD), a debloated storage driver, and a protected log management daemon, enabling secure remote retrieval. The approach yields a dramatic reduction in protection latency ($93.3-99.3\%$ shorter windows) and a substantially smaller TCB ($9.4-26.9\times$), while incurring near-native performance on real workloads (average overheads around $1-10\%$ depending on configuration and workload). These results indicate strong practical impact for secure forensics and incident response in enterprise environments, with a flexible deployment model across ARM-based systems and potential extension to other architectures.

Abstract

Protecting system observability records (logs) from compromised OSs has gained significant traction in recent times, with several noteworthy approaches proposed. Unfortunately, none of the proposed approaches achieves high performance with tiny log protection delays. They also rely on risky environments for protection (e.g., many use general-purpose hypervisors or TrustZone, which have large TCBs and attack surfaces). HitchHiker is an attempt to rectify this problem. The system is designed to ensure (a) in-memory protection of batched logs within a short and configurable real-time deadline by efficient hardware permission switching, and (b) an end-to-end high-assurance environment built upon hardware protection primitives with debloating strategies for secure log protection, persistence, and management. Security evaluations and validations show that HitchHiker reduces log protection delay by 93.3--99.3% compared to the state-of-the-art, while reducing TCB by 9.4--26.9×. Performance evaluations show HitchHiker incurs a geometric mean of less than 6% overhead on diverse real-world programs, improving on the state-of-the-art approach by 61.9--77.5%.
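To make the idea of a configurable protection deadline concrete, the following is a minimal toy model and not HitchHiker's implementation. It assumes that a periodic timer with period $T_p$ seals every log batched since the previous tick, and it ignores the cost of the permission switch itself. The function name `protection_windows` and the sample arrival times are hypothetical.

```python
def protection_windows(arrival_times_us, timer_period_us):
    """Toy model: time each log stays unprotected before the next timer tick.

    Assumption (not from the paper): ticks occur at exact multiples of
    timer_period_us, and a tick seals all logs that arrived before it.
    """
    windows = []
    for t in arrival_times_us:
        # First tick strictly after the arrival time t.
        next_tick = (t // timer_period_us + 1) * timer_period_us
        windows.append(next_tick - t)
    return windows


# Hypothetical log arrival times in microseconds, with T_p = 100 us.
arrivals = [0, 30, 99, 100, 250, 499]
windows = protection_windows(arrivals, timer_period_us=100)
print(windows)
```

Under these assumptions, no log waits longer than $T_p$ before being sealed, which is why a shorter timer setting directly shortens the worst-case exposure window in this model.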

Paper Structure (34 sections, 5 figures, 7 tables)

Background
System Observability
Threat Model and Assumptions
Motivation
Observability Protection
Limitations of Current Protection Solutions
HitchHiker Overview
High Assurance Log Environment Design
Deadline-Enforced Log Permission Switch
System Deployment Model
HitchHiker Design
Secure Environment Software Stack
Protection domain bootstrap and enforcement
Debloated driver synthesis and protection
Protected userspace daemon execution
...and 19 more sections

Figures (5)

Figure 1: Overviews of (a) conventional observability systems and (b-d) observability protection systems (trusted components are marked in the figure).
Figure 2: HitchHiker memory layout and permission views (left) and workflow (right). The OS and applications are inside the OS domain (UT). HitchHikerMonitor (HkM) and Daemon (HkD) are inside the protected domain (TR). Memory views of the two domains are maintained by their separate hardware primitive configurations (i.e., two S2PTs/GPTs).
Figure 3: CDF shows the relation between different protection timer $T_p$ ($1ms$, $500\mu s$, and $100\mu s$) deadline settings and HitchHiker's actual in-memory protection window (delay).
Figure 4: Overhead of HitchHiker on LMBench.
Figure 5: Real-world workload overheads. From left to right, the log throughput for each program is: 1907, 589, 32358, 7234, 31434; 10999, 78776, 78537, 93928, 101962 logs/sec.
