ONCache: A Cache-Based Low-Overhead Container Overlay Network

Shengkai Lin, Shizhen Zhao, Peirui Cao, Xinchi Han, Quan Tian, Wenfeng Liu, Qi Wu, Donghai Han, Xinbing Wang

TL;DR

ONCache narrows the performance gap between container overlay networks and bare-metal networking. It exploits an invariance property of overlay overhead (the extra processing yields repetitive results across packets) and adds a cross-layer, cache-based fast path. Implemented in 524 lines of eBPF code and integrated as a plugin for Antrea, ONCache maintains three per-host caches (egress, ingress, filter) to bypass repetitive processing while preserving flexibility and compatibility. In microbenchmarks, it improves throughput and request-response (RR) transaction rate over standard overlay networks by 12% and 36% for TCP (20% and 34% for UDP) and reduces per-packet CPU overhead, approaching bare-metal performance; applications such as Memcached, PostgreSQL, and Nginx also benefit. Optional enhancements improve performance further at the cost of kernel or protocol changes. The approach is designed to be drop-in compatible with standard CNIs and service meshes, and the implementation is open source.

Abstract

Recent years have witnessed a widespread adoption of containers. While containers simplify and accelerate application development, existing container network technologies either incur significant overhead, which hurts performance for distributed applications, or lose flexibility or compatibility, which hinders the widespread deployment in production. We carefully analyze the kernel data path of an overlay network, quantifying the time consumed by each segment of the data path and identifying the extra overhead in an overlay network compared to bare metal. We observe that this extra overhead generates repetitive results among packets, which inspires us to introduce caches within an overlay network. We design and implement ONCache (Overlay Network Cache), a cache-based container overlay network, to eliminate the extra overhead while maintaining flexibility and compatibility. We implement ONCache using the extended Berkeley Packet Filter (eBPF) with only 524 lines of code, and integrate it as a plugin of Antrea. With ONCache, containers attain networking performance akin to that of bare metal. Compared to the standard overlay networks, ONCache improves throughput and request-response transaction rate by 12% and 36% for TCP (20% and 34% for UDP), respectively, while significantly reducing per-packet CPU overhead. Popular distributed applications also benefit from ONCache.
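
The caching idea can be illustrated with a toy model. The sketch below is not ONCache's eBPF code; it is a minimal Python simulation, assuming a filter cache keyed by the flow 5-tuple and an egress cache keyed by the destination container IP. The class name EgressFastPath, the callbacks slow_filter and slow_route, and the cache layouts are all hypothetical. The first packet of a flow takes the slow path and fills the caches; later packets reuse the cached results.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

# (src IP, dst IP, src port, dst port, protocol); this layout is an assumption.
FlowKey = Tuple[str, str, int, int, str]


@dataclass
class EgressFastPath:
    """Toy model of a per-host egress fast path backed by two caches."""

    # Hypothetical callbacks standing in for the full overlay slow path.
    slow_filter: Callable[[FlowKey], bool]   # policy decision for a flow
    slow_route: Callable[[str], str]         # container dst IP -> remote host IP
    filter_cache: Dict[FlowKey, bool] = field(default_factory=dict)
    egress_cache: Dict[str, str] = field(default_factory=dict)
    slow_path_calls: int = 0

    def send(self, flow: FlowKey) -> Optional[str]:
        """Return the remote host IP for the packet, or None if it is dropped."""
        dst_ip = flow[1]
        # Fast path: both caches hit, so the repetitive work is skipped.
        if flow in self.filter_cache and dst_ip in self.egress_cache:
            return self.egress_cache[dst_ip]
        # Slow path: run the full processing once.
        self.slow_path_calls += 1
        if not self.slow_filter(flow):
            return None  # denied flows are never cached
        host_ip = self.slow_route(dst_ip)
        # Remember the results for later packets of the same flow.
        self.filter_cache[flow] = True
        self.egress_cache[dst_ip] = host_ip
        return host_ip


if __name__ == "__main__":
    routes = {"10.0.1.5": "192.168.0.2"}  # made-up addresses
    path = EgressFastPath(
        slow_filter=lambda f: f[3] != 22,
        slow_route=lambda ip: routes[ip],
    )
    flow = ("10.0.0.3", "10.0.1.5", 40000, 80, "tcp")
    for _ in range(3):
        path.send(flow)
    print(path.slow_path_calls)
```

In this model only the first of the three packets reaches the slow path. An ingress cache on the receiving host would follow the same pattern; it is omitted here to keep the sketch short.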

Paper Structure (37 sections, 11 figures, 4 tables)

Figures (11)

  • Figure 1: Architecture of ONCache. ONCache consists of 4 eBPF programs in the data path, 3 eBPF maps, and 1 user space program. The components with the eBPF logo ("bee") represent eBPF programs or eBPF maps. dIP is short for destination IP address.
  • Figure 2: The cache initialization process of ONCache. EI-Prog/II-Prog update the entries (green shaded) when the initialization requirements are met.
  • Figure 3: Journey of an overlay packet in ONCache. EI-Prog/II-Prog are skipped over. The dashed lines denote the redirect path.
  • Figure 4: The fast path in (a) ONCache without rpeer; (b) ONCache with rpeer.
  • Figure 5: TCP and UDP microbenchmark results of bare metal, Slim (only supports TCP), Falcon (Linux kernel v5.4), ONCache, Antrea and Cilium. Both Cilium and Antrea provide standard overlay networks. All data is the average of a single flow. CPU utilization is measured on the receiver host, normalized by throughput or RR, and scaled to Antrea's throughput or RR.
  • ...and 6 more figures