Just-in-Time Packet State Prefetching
Hamid Ghasemirahni, Alireza Farshin, Dejan Kostic, Marco Chiesa
TL;DR
The paper tackles the per-flow state bottleneck in high-speed, CPU-based packet processing by proposing Nostradamus, a system that supplies hints about upcoming packets so that the state needed to process them can be prefetched into caches just in time. It shows that careful timing and placement of prefetches can recover substantial throughput, with measurements showing improvements of up to 50% for a stateful L4 load balancer, along with fewer cache misses. The authors discuss the design space for providing hints (end hosts vs. network devices) and for prefetching (in-application vs. NIC-assisted), outline the open challenges, and chart future directions across applications, hardware accelerators, and data structures. Overall, the work highlights a promising way to align networking requirements with cache hierarchies, potentially enabling higher throughput at multi-hundred-Gbps rates.
Abstract
Could information about future incoming packets be used to build more efficient CPU-based packet processors? Can such information be obtained accurately? This paper studies novel packet processing architectures that receive external hints about which packets are soon to arrive, thus enabling prefetching into fast cache memories of the state needed to process them, just-in-time for the packets' arrival. We explore possible approaches to (i) obtain such hints either from network devices or the end hosts in the communication and (ii) use these hints to better utilize cache memories. We show that such information (if accurate) can improve packet processing throughput by at least 50%.
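The core idea, that a hint about a soon-to-arrive packet lets its per-flow state be pulled into the cache just before it is needed, can be illustrated with a toy simulation. The sketch below is not the authors' implementation: it models the cache as a small LRU set of flow IDs, assumes hints are perfectly accurate, treats prefetches as free, and ignores memory latency. The names `LRUCache`, `count_misses`, and `lookahead` are hypothetical and chosen only for this illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Toy model of a CPU cache holding per-flow state, keyed by flow ID."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def touch(self, flow_id):
        """Bring flow_id into the cache; return True if it was already cached."""
        hit = flow_id in self.lines
        if hit:
            self.lines.move_to_end(flow_id)  # mark as most recently used
        else:
            self.lines[flow_id] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict least recently used
        return hit


def count_misses(packets, capacity, lookahead=0):
    """Count demand misses while processing `packets` (a list of flow IDs).

    With lookahead > 0, a (hypothetical, perfectly accurate) hint names the
    flow of the packet `lookahead` positions ahead, and its state is
    prefetched before the current packet is processed.
    """
    cache = LRUCache(capacity)
    misses = 0
    for i, flow_id in enumerate(packets):
        if lookahead and i + lookahead < len(packets):
            cache.touch(packets[i + lookahead])  # prefetch; not counted as a miss
        if not cache.touch(flow_id):  # demand access by the packet processor
            misses += 1
    return misses
```

For a trace that cycles through 8 flows four times (`list(range(8)) * 4`) with room for only 4 flows in the cache, `count_misses(trace, capacity=4)` reports a miss on all 32 packets, while `count_misses(trace, capacity=4, lookahead=1)` reports a single miss, for the first packet. A much larger lookahead on the same trace (for example 6) prefetches state too early, so it is evicted before use and misses return; this is the timing sensitivity that "just-in-time" refers to, under this model's simplifying assumptions.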
