Meili: Enabling SmartNIC as a Service in the Cloud
Qiang Su, Shaofeng Wu, Zhixiong Niu, Ran Shu, Peng Cheng, Yongqiang Xiong, Zaoxing Liu, Hong Xu
TL;DR
Meili tackles the inefficiency of current SmartNIC usage in data centers by pooling heterogeneous NIC resources and exposing a unified one-NIC abstraction to application developers. It introduces a modular programming model with user-customized functions, a data plane that partially replicates pipelines across NICs, and a locality-aware control plane that allocates resources and adapts to changing demand. Evaluated on NVIDIA BlueField and AMD Pensando NICs, the system achieves scalable throughput with at most 8 μs of latency overhead and improves resource efficiency by 3.07x over standalone deployments and 1.44x over state-of-the-art microservice deployments, while also improving resource availability. This approach makes SmartNIC as a Service practical in the cloud, offering host CPU savings and better elasticity for networked workloads such as UPFs in 5G and security functions in data centers.
Abstract
SmartNICs are touted as an attractive substrate for network application offloading, offering benefits in programmability, host resource saving, and energy efficiency. Current usage, however, restricts offloading to local hosts and confines SmartNIC ownership to individual application teams, resulting in poor resource efficiency and scalability. This paper presents Meili, a novel system that realizes SmartNIC as a service to address these issues. Meili organizes heterogeneous SmartNIC resources as a pool and offers a unified one-NIC abstraction to application developers. This allows developers to focus solely on the application logic while the system dynamically optimizes for their performance needs. Our evaluation on NVIDIA BlueField series and AMD Pensando SmartNICs demonstrates that Meili achieves scalable single-flow throughput with a maximum 8 μs latency overhead and enhances resource efficiency by 3.07× compared to standalone deployments and 1.44× compared to state-of-the-art microservice deployments.
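To make the pooling idea concrete, the following is a minimal sketch of how an application's pipeline stages might be placed onto a shared pool of SmartNICs with a preference for nearby devices. It is an illustration only and not Meili's actual allocator: the names `Nic`, `Stage`, and `place_pipeline`, the use of CPU cores as the sole resource, and the greedy rack-first policy are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Nic:
    """A SmartNIC in the shared pool (hypothetical model: cores only)."""
    name: str
    rack: str
    free_cores: int


@dataclass
class Stage:
    """One stage of an application pipeline and its core demand."""
    name: str
    cores: int


def place_pipeline(stages, pool, home_rack):
    """Greedy, locality-first placement (assumed policy, for illustration).

    Returns a dict mapping stage name to NIC name, or None if the pool
    cannot host every stage. The pool is only updated on success.
    """
    # Try NICs in the application's home rack first; within the same
    # locality class, try the NIC with the most free cores first.
    order = sorted(pool, key=lambda n: (n.rack != home_rack, -n.free_cores))
    free = {n.name: n.free_cores for n in pool}
    placement = {}
    for stage in stages:
        for nic in order:
            if free[nic.name] >= stage.cores:
                free[nic.name] -= stage.cores
                placement[stage.name] = nic.name
                break
        else:
            return None  # no NIC can fit this stage; leave the pool untouched
    # Commit the reservation only once every stage has been placed.
    for nic in pool:
        nic.free_cores = free[nic.name]
    return placement
```

In this sketch the developer only describes stages and their demands, which mirrors the one-NIC abstraction: the caller never names a device. A stage that does not fit on a local NIC spills over to a remote one, so a pipeline can span several NICs in the pool.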
