The NIC should be part of the OS
Pengcheng Xu, Timothy Roscoe
TL;DR
This work reframes the NIC as a trusted, OS-integrated component rather than a detached device or a kernel-bypass agent. By leveraging cache-coherent interconnects and OS-state sharing, the proposed Lauberhorn prototype enables near-zero-overhead RPC dispatch and dynamic core scheduling for RPC-centric workloads, potentially outperforming traditional kernel-bypass stacks while maintaining flexibility for dynamic workloads. The approach challenges long-standing NIC/OS division principles, offering a concrete OS-centric architecture, a detailed receive-path design, and scheduling-state protocols, with formal verification and practical non-functional concerns identified as future work. If broadly feasible, this could reduce CPU cycles per RPC and improve energy efficiency for cloud microservices and serverless functions by tightly coupling NIC processing with OS scheduling and state awareness.
Abstract
The network interface adapter (NIC) is a critical component of a cloud server and occupies a unique position. Not only is network performance vital to efficient operation of the machine, but unlike compute accelerators such as GPUs, the network subsystem must react to unpredictable events like the arrival of a network packet and communicate with the appropriate application end point with minimal latency. Current approaches to server stacks navigate a trade-off between flexibility, efficiency, and performance: the fastest kernel-bypass approaches dedicate cores to applications, busy-wait on receive queues, and so on, while more flexible approaches suited to more dynamic workload mixes incur much greater software overhead on the data path. We reject this trade-off, which we ascribe to an arbitrary (and sub-optimal) split in system state between the OS and the NIC. Instead, by exploiting the properties of cache-coherent interconnects and integrating the NIC closely with the OS kernel, we can achieve something surprising: performance for RPC workloads better than the fastest kernel-bypass approaches, without sacrificing the robustness and dynamic adaptation of kernel-based network subsystems.
