CPU-Limits kill Performance: Time to rethink Resource Control
Chirag Shetty, Sarthak Chakraborty, Hubertus Franke, Larisa Shwartz, Chandra Narayanaswami, Indranil Gupta, Saurabh Jha
TL;DR
This paper challenges the conventional reliance on CPU-Limits ($c.lim$) for latency-sensitive cloud workloads, presenting empirical evidence that throttling degrades tail latency and can inflate costs. It argues that CPU-Requests ($c.req$) suffice to guarantee CPU share under CFS fairness, and that the standard limit-based control paradigm hurts performance and reliability. The authors propose a limit-free design with redesigned autoscalers and a performance-based billing model, exemplified by the Yet Another Autoscaler (YAAS) prototype, which achieves substantial resource savings and more predictable performance. They also provide a pragmatic view on when $c.lim$ might still be useful (e.g., background jobs) and offer a roadmap for practical deployment and further research into limit-free resource control.
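The claim that requests alone guarantee a CPU share can be illustrated with a small model of weighted, work-conserving fair sharing. This is a minimal sketch, not the authors' code or the kernel's scheduler: the function name `fair_allocation`, the use of requests (in cores) as weights, and the example workloads are all assumptions made for illustration. It ignores scheduling granularity and multi-core placement.

```python
def fair_allocation(demands, weights, capacity):
    """Weighted max-min (work-conserving) sharing of `capacity` CPU cores.

    demands:  cores each container wants right now (hypothetical inputs)
    weights:  relative shares; here we assume they equal the CPU requests
    Returns the cores each container receives.
    """
    alloc = {name: 0.0 for name in demands}
    active = {name for name in demands if demands[name] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[n] for n in active)
        # Containers whose demand fits inside their weighted slice are satisfied.
        satisfied = {n for n in active
                     if demands[n] <= remaining * weights[n] / total_w}
        if not satisfied:
            # Everyone wants more than their slice: split strictly by weight.
            for n in active:
                alloc[n] = remaining * weights[n] / total_w
            break
        for n in satisfied:
            alloc[n] = float(demands[n])
            remaining -= demands[n]
        active -= satisfied  # leftover capacity is re-offered to the rest
    return alloc


requests = {"A": 1, "B": 3}          # assumed CPU requests on a 4-core node

# B is mostly idle, so A can burst past its request with no limit set.
print(fair_allocation({"A": 3, "B": 0.5}, requests, 4))

# Both are saturated: each still receives at least its request.
print(fair_allocation({"A": 3, "B": 4}, requests, 4))
```

In this model, a limit would act as a cap on a container's demand before the sharing step. Capping A at 1 core in the first scenario would leave 2.5 cores idle while A's work waits.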
Abstract
Research in compute resource management for cloud-native applications is dominated by the problem of setting optimal CPU-limits, a fundamental OS mechanism that strictly restricts a container's CPU usage to a specified cap. Rightsizing and autoscaling works have innovated on allocation and scaling policies while assuming the ubiquity and necessity of CPU-limits. We question this assumption. Practical experiences of cloud users indicate that CPU-limits harm application performance and cost more than they help. These observations contradict the conventional wisdom presented in both academic research and industry best practices. We argue that this indiscriminate adoption of CPU-limits is driven by the erroneous belief that CPU-limits are essential for operational and safety purposes. We provide empirical evidence that makes a case for eschewing CPU-limits completely in latency-sensitive applications. This prompts a fundamental rethinking of auto-scaling and billing paradigms and opens new research avenues. Finally, we highlight specific scenarios where CPU-limits can be beneficial if used in a well-reasoned way (e.g., background jobs).
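To see why a hard cap can hurt latency, consider how a quota-per-period limit stretches a single request. The sketch below is a simplified model under stated assumptions, not a measurement from the paper: the function name `completion_time_ms` is hypothetical, the request is single-threaded, it arrives exactly at the start of an enforcement period, and nothing else in the container consumes quota. The 100 ms default mirrors the common default enforcement period for CFS bandwidth control.

```python
import math


def completion_time_ms(cpu_work_ms, quota_ms, period_ms=100):
    """Wall-clock time to finish `cpu_work_ms` of CPU work under a quota.

    Each period the task may run for at most `quota_ms`; once the quota is
    used up it is throttled until the next period begins.
    """
    if quota_ms >= period_ms or cpu_work_ms <= quota_ms:
        # A single thread never exhausts the quota: no throttling occurs.
        return cpu_work_ms
    # Periods in which the whole quota is consumed before the final burst.
    full_periods = math.ceil(cpu_work_ms / quota_ms) - 1
    leftover_ms = cpu_work_ms - full_periods * quota_ms
    return full_periods * period_ms + leftover_ms


# A limit of 0.2 cores corresponds to 20 ms of quota per 100 ms period.
# 50 ms of CPU work: run 20, wait 80, run 20, wait 80, run 10.
print(completion_time_ms(50, 20))   # 210

# The same work without an effective cap finishes in 50 ms.
print(completion_time_ms(50, 100))  # 50
```

In this model the request's latency grows from 50 ms to 210 ms even if the node has idle cores, which is the kind of throttling-induced inflation the abstract refers to.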
