Cloud Resource Allocation with Convex Optimization
Shayan Boghani, Emin Kirimlioglu, Amrita Moturi, Hao-Ting Tso
TL;DR
The paper addresses a limitation of the Kubernetes Cluster Autoscaler (CA), which can only scale existing node pools homogeneously, by formulating cloud resource allocation as a convex optimization over $x \in \mathbb{R}^{n}_{+}$ with a matrix-based representation of resource demands and costs. It introduces a five-term objective $f(x)$, including a novel logarithmic approximation to the indicator function that discourages fragmentation across providers, and derives the dual from the Lagrangian to obtain the KKT conditions for global optimality. After relaxing integrality, the problem is solved with interior-point methods. Key contributions are the logarithmic indicator approximation, a complete KKT framework, an Infrastructure Optimization Controller, and an incremental adoption protocol for production rollouts. Empirical results on real Azure/Linode data show substantial cost savings and improved resource utilization over the standard CA, especially for memory-intensive workloads or tight node-pool constraints, supporting the framework as a practical path to dynamic heterogeneous provisioning in Kubernetes.
Abstract
We present a convex optimization framework that overcomes limitations of the Kubernetes Cluster Autoscaler by allocating diverse cloud resources while minimizing cost and fragmentation. Current Kubernetes scaling mechanisms are restricted to homogeneous scaling of existing node types, which limits opportunities for cost-performance optimization. Our matrix-based model captures resource demands, costs, and capacity constraints in a unified mathematical framework. A key contribution is our logarithmic approximation to the indicator function, which enables dynamic node type selection while maintaining problem convexity. We solve the resulting problem with interior-point methods, balancing cost optimization against operational complexity. Experiments with real-world Kubernetes workloads demonstrate reduced costs and improved resource utilization compared to conventional Cluster Autoscaler strategies, which can only scale existing node pools up or down.
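To make the matrix-based view concrete, the following is a minimal sketch, not the authors' formulation: it only checks demand coverage and linear cost for a candidate allocation, and it omits the five-term objective, the logarithmic fragmentation term, and the interior-point solver. The names `R`, `c`, `d`, `supplied`, `is_feasible`, and `cost`, and all numbers, are hypothetical and chosen purely for illustration.

```python
# Hypothetical toy instance; every name and number here is an assumption.
# Rows of R are resources (vCPU, memory in GiB); columns are node types.
R = [
    [2.0, 4.0, 8.0],    # vCPUs provided by one node of each type
    [4.0, 16.0, 16.0],  # memory (GiB) provided by one node of each type
]
c = [0.05, 0.12, 0.20]  # made-up hourly cost per node of each type
d = [16.0, 48.0]        # aggregate demand: 16 vCPUs, 48 GiB of memory


def supplied(R, x):
    """Return the resource vector R @ x supplied by allocation x."""
    return [sum(r_ij * x_j for r_ij, x_j in zip(row, x)) for row in R]


def is_feasible(R, x, d, tol=1e-9):
    """Check x >= 0 and R @ x >= d (demand coverage) up to a tolerance."""
    nonnegative = all(x_j >= 0 for x_j in x)
    covers = all(s + tol >= d_i for s, d_i in zip(supplied(R, x), d))
    return nonnegative and covers


def cost(c, x):
    """Linear cost c^T x of allocation x."""
    return sum(c_j * x_j for c_j, x_j in zip(c, x))


# Homogeneous scaling: only the first node type may be added.
homogeneous = [12.0, 0.0, 0.0]
# Heterogeneous allocation: mix two node types to match the memory-heavy demand.
mixed = [2.0, 3.0, 0.0]

for name, x in [("homogeneous", homogeneous), ("mixed", mixed)]:
    print(name, "feasible:", is_feasible(R, x, d), "cost:", round(cost(c, x), 4))
```

In this toy instance the workload is memory-heavy, so scaling only the first node type over-provisions vCPUs, while the mixed allocation covers the same demand at lower cost. A real controller would search over `x` with a solver rather than compare hand-picked candidates.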
