Priority Matters: Optimising Kubernetes Clusters Usage with Constraint-Based Pod Packing
Henrik Daniel Christensen, Saverio Giallorenzo, Jacopo Mauro
TL;DR
This work addresses suboptimal pod placement in Kubernetes caused by heuristic schedulers that can waste resources and increase costs. It proposes a constraint-programming-based fallback using CP-SAT from OR-Tools, integrated as a Kubernetes plugin that optimizes allocations by priority within a fixed time budget, while preserving the default scheduler for feasible cases. The approach iterates over priorities to maximize deployed pods and minimize evictions, demonstrating improved high-priority pod placements in small-to-mid-sized clusters within 1–10 seconds, with diminishing returns as problem size grows. The results indicate practical potential to reduce resource fragmentation and operating costs in real-world deployments, while outlining avenues for extending solver diversity and cross-node preemption support.
Abstract
Distributed applications employ Kubernetes for scalable, fault-tolerant deployments over computer clusters, where application components run in groups of containers called pods. The scheduler, at the heart of Kubernetes' architecture, determines the placement of pods on cluster nodes given their priority and resource requirements. To allocate pods quickly, the scheduler uses lightweight heuristics that can lead to suboptimal placements and resource fragmentation, preventing the allocation of otherwise deployable pods on the available nodes. We propose using constraint programming to find the optimal allocation of pods with respect to their priorities and resource requests. Implementation-wise, our solution comes as a plug-in to the default scheduler that operates as a fallback mechanism when some pods cannot be allocated. Using the OR-Tools constraint solver, our experiments on small-to-mid-sized clusters indicate that, within a 1-second scheduling window, our approach places more higher-priority pods than the default scheduler (possibly demonstrating allocation optimality) in over 44% of realisable allocation scenarios where the default scheduler fails, while certifying that the default scheduler's placement is already optimal in over 19% of scenarios. With a 10-second window, our approach improves placements in over 73% of scenarios and still certifies that the default scheduler's placement is already optimal in over 19% of scenarios.
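The fragmentation problem described in the abstract can be illustrated with a toy example. The sketch below is not the authors' plug-in and does not use CP-SAT: it contrasts a simplified greedy heuristic (a hypothetical "most free capacity first" rule, not the actual default scheduler logic) with a brute-force search that maximises the number of placed pods per priority level, highest priority first. The pod names, the single CPU dimension, and the node capacities are all assumptions chosen for illustration; evictions and the time budget are left out.

```python
from itertools import product

# Hypothetical data: (pod name, priority, CPU request); higher priority wins.
pods = [("a", 2, 2), ("b", 2, 2), ("c", 1, 4)]
nodes = [4, 4]  # CPU capacity of each node


def greedy_place(pods, nodes):
    """Place pods one at a time, highest priority first, each on the node
    with the most free capacity (a simplified spreading heuristic)."""
    free = list(nodes)
    placement = {}
    for name, _prio, req in sorted(pods, key=lambda p: -p[1]):
        best = max(range(len(free)), key=lambda i: free[i])
        if free[best] >= req:
            free[best] -= req
            placement[name] = best
    return placement


def optimal_place(pods, nodes):
    """Enumerate every assignment (None = not placed) and keep the feasible
    one that places the most pods per priority level, compared
    lexicographically from the highest priority down."""
    priorities = sorted({p[1] for p in pods}, reverse=True)
    best_score, best = None, {}
    for choice in product([None, *range(len(nodes))], repeat=len(pods)):
        used = [0] * len(nodes)
        for (_name, _prio, req), node in zip(pods, choice):
            if node is not None:
                used[node] += req
        if any(u > cap for u, cap in zip(used, nodes)):
            continue  # some node is over capacity
        score = tuple(
            sum(1 for (_n, p, _r), node in zip(pods, choice)
                if node is not None and p == level)
            for level in priorities
        )
        if best_score is None or score > best_score:
            best_score = score
            best = {pods[i][0]: node
                    for i, node in enumerate(choice) if node is not None}
    return best


greedy = greedy_place(pods, nodes)
optimal = optimal_place(pods, nodes)
print("greedy: ", greedy)
print("optimal:", optimal)
```

In this instance the greedy rule spreads the two small pods across both nodes, leaving no node with room for the large pod, whereas the exhaustive search packs the small pods together and places all three. Exhaustive enumeration grows exponentially with the number of pods, which is why a constraint solver with a time limit is the practical choice at realistic sizes.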
