CPU-Limits kill Performance: Time to rethink Resource Control

Chirag Shetty; Sarthak Chakraborty; Hubertus Franke; Larisa Shwartz; Chandra Narayanaswami; Indranil Gupta; Saurabh Jha

TL;DR

This paper challenges the conventional reliance on CPU-Limits ($c.lim$) for latency-sensitive cloud workloads, presenting empirical evidence that throttling degrades tail latency and can inflate costs. It argues that CPU-Requests ($c.req$) suffice to guarantee CPU share under CFS fairness, and that the standard limit-based control paradigm hurts performance and reliability. The authors propose a limit-free design with redesigned autoscalers and a performance-based billing model, exemplified by the Yet Another Autoscaler (YAAS) prototype, which achieves substantial resource savings and more predictable performance. They also provide a pragmatic view on when $c.lim$ might still be useful (e.g., background jobs) and offer a roadmap for practical deployment and further research into limit-free resource control.

Abstract

Research in compute resource management for cloud-native applications is dominated by the problem of setting optimal CPU-limits, a fundamental OS mechanism that strictly caps a container's CPU usage at its specified limit. Work on rightsizing and autoscaling has innovated on allocation and scaling policies while assuming that CPU-limits are both ubiquitous and necessary. We question this. Practical experiences of cloud users indicate that CPU-limits do more harm than good to application performance and cost. These observations contradict the conventional wisdom presented in both academic research and industry best practices. We argue that this indiscriminate adoption of CPU-limits is driven by the erroneous belief that CPU-limits are essential for operational and safety purposes. We provide empirical evidence that makes the case for eliminating CPU-limits entirely from latency-sensitive applications. This prompts a fundamental rethinking of autoscaling and billing paradigms and opens new research avenues. Finally, we highlight specific scenarios where CPU-limits can be beneficial if used in a well-reasoned way (e.g., background jobs).
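The contrast between the two mechanisms can be illustrated with a small, idealized model. The sketch below is not taken from the paper and does not reproduce the kernel scheduler: `fair_share` is a hypothetical weighted max-min allocator that stands in for request-weighted proportional sharing, and `completion_time_us` is a hypothetical model of quota-based throttling for a single-threaded burst. It assumes the default 100 ms CFS enforcement period, positive requests, and a burst that begins at the start of a period.

```python
def fair_share(requests_mcpu, demands_mcpu, capacity_mcpu):
    """Idealized weighted max-min allocation (millicores).

    Each container is weighted by its CPU request; capacity left idle by
    one container is redistributed to the others in proportion to weight.
    """
    alloc = {name: 0.0 for name in requests_mcpu}
    active = set(requests_mcpu)
    remaining = float(capacity_mcpu)
    while active and remaining > 1e-9:
        total_weight = sum(requests_mcpu[n] for n in active)
        # Containers whose demand fits inside their weighted slice are satisfied.
        satisfied = {
            n for n in active
            if remaining * requests_mcpu[n] / total_weight >= demands_mcpu[n]
        }
        if not satisfied:
            # Everyone is CPU-hungry: split what is left by weight and stop.
            for n in active:
                alloc[n] = remaining * requests_mcpu[n] / total_weight
            break
        for n in satisfied:
            alloc[n] = demands_mcpu[n]
            remaining -= demands_mcpu[n]
        active -= satisfied
    return alloc


def completion_time_us(work_us, limit_mcpu=None, period_us=100_000):
    """Wall-clock time for a single-threaded burst needing `work_us` of CPU.

    With a limit, the thread may run for at most `quota_us` in each period
    and is throttled until the next period begins (assumed model).
    """
    if work_us <= 0:
        return 0
    if limit_mcpu is None:
        return work_us  # no limit: the burst runs uninterrupted
    quota_us = min(limit_mcpu * period_us // 1000, period_us)
    full_periods = (work_us - 1) // quota_us  # periods whose quota is exhausted
    return full_periods * period_us + (work_us - full_periods * quota_us)


# Idle neighbour: "a" bursts past its 1000 mCPU request on a 4-core node.
print(fair_share({"a": 1000, "b": 1000}, {"a": 3000, "b": 500}, 4000))
# Full contention: each container still receives its requested share.
print(fair_share({"a": 1000, "b": 3000}, {"a": 4000, "b": 4000}, 4000))
# A 100 ms burst with no limit versus a 400 mCPU limit.
print(completion_time_us(100_000), completion_time_us(100_000, limit_mcpu=400))
```

Under these assumptions, the request-weighted allocator lets a container use idle capacity while still protecting every container's requested share under contention, whereas the 400 mCPU limit stretches a 100 ms burst to 220 ms even if the node is otherwise idle.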
