SecScale: A Scalable and Secure Trusted Execution Environment for Servers
Ani Sunny, Nivedita Shrivastava, Smruti R. Sarangi
TL;DR
SecScale addresses the scalability gap in trusted execution environments by combining a read-first, verify-later execution model with a novel MAC forest for integrity and full memory encryption for confidentiality. It replaces the traditional Merkle-tree approach with a hierarchical MAC forest, enabling scalable integrity checks for large secure memories (eEPC) while keeping the critical path non-blocking through concurrent MAC verification. The design includes an evaporable page-transfer mechanism, an Eviction Status Holding Register, and targeted optimizations that reduce MAC accesses and DRAM traffic. Evaluation against SGX-Client, DFP, and Penglai shows SecScale achieving up to 57% performance improvement over DFP and 10% over Penglai, with substantial storage efficiency and preserved ACIF security properties for large-scale server workloads.
Abstract
Trusted execution environments (TEEs) are an integral part of modern secure processors. They ensure that their application and code pages are confidential, tamper-proof and immune to diverse types of attacks. In 2021, Intel suddenly announced its plans to deprecate its most trustworthy enclave, SGX, on its 11th and 12th generation processors. The reason was that it was difficult to scale the enclaves (sandboxes) beyond 256 MB because the hardware overheads outweighed the benefits. Competing solutions by Intel and other vendors are much more scalable, but do not provide many key security guarantees that SGX used to provide, notably replay-attack protection. In the last three years, to the best of our knowledge, no proposal from industry or academia has been able to provide both scalability (with a modest slowdown) and replay protection on generic hardware. We solve this problem by proposing SecScale, which uses new ideas centered around speculative execution (read first, verify later), a forest of MACs (instead of a tree of counters) and complete memory encryption (no generic unsecure regions). We show that SecScale is 10% faster than the nearest competing alternative.
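
The read-first, verify-later idea named in the abstract can be illustrated with a minimal software sketch. This is only a toy under stated assumptions, not the SecScale hardware design: the class `SpeculativeMemory`, the helpers `compute_mac` and `speculative_load`, and the key `KEY` are hypothetical names, HMAC-SHA256 stands in for whatever MAC function the hardware uses, and the sketch models neither the MAC forest nor replay protection. It shows only that a read can return data immediately while the MAC check runs concurrently, with the result committed only after the check passes.

```python
import hashlib
import hmac
from concurrent.futures import ThreadPoolExecutor

KEY = b"hypothetical-demo-key"  # assumption: stands in for a hardware-held secret


def compute_mac(address: int, data: bytes) -> bytes:
    """MAC over (address, data); binding the address prevents block relocation."""
    msg = address.to_bytes(8, "little") + data
    return hmac.new(KEY, msg, hashlib.sha256).digest()


class SpeculativeMemory:
    """Toy memory: reads return at once, MAC checks run in the background."""

    def __init__(self):
        self.data = {}  # address -> stored bytes (encryption is not modeled)
        self.macs = {}  # address -> stored MAC
        self.pool = ThreadPoolExecutor(max_workers=2)

    def write(self, address: int, data: bytes) -> None:
        self.data[address] = data
        self.macs[address] = compute_mac(address, data)

    def read(self, address: int):
        """Return the data plus a future that resolves to the MAC check result."""
        data = self.data[address]
        stored = self.macs[address]
        check = self.pool.submit(
            lambda: hmac.compare_digest(stored, compute_mac(address, data))
        )
        return data, check


def speculative_load(mem: SpeculativeMemory, address: int) -> bytes:
    """Work on the value speculatively; commit only if verification succeeds."""
    value, check = mem.read(address)
    result = value.upper()  # speculative work overlaps with verification
    if not check.result():
        raise RuntimeError("integrity check failed; speculative result discarded")
    return result
```

In this sketch, tampering with `mem.data` after a write causes `speculative_load` to raise instead of returning the speculative result. An attacker who restores an old data and MAC pair together would still pass the check, which is exactly the replay attack that this toy does not address.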
