NeuroScaler: Towards Energy-Optimal Autoscaling for Container-Based Services
Alisson O. Chaves, Rodrigo Moreira, Larissa F. Rodrigues Moreira, Joao Correia, David Santos, Rui Silva, Tiago Barros, Daniel Corujo, Miguel Rocha, Flavio de Oliveira Silva
TL;DR
This paper addresses the energy and carbon footprint of autoscaling in multi-domain networks by introducing NeuroScaler, an AI-native orchestrator built on the NEURONET platform. It integrates multi-tier telemetry, from PDUs down to containers, and uses a model-predictive control (MPC) autoscaler to proactively optimize energy consumption while meeting SLOs. The key contributions are an architecture for cross-domain green orchestration, an MPC-based autoscaler that internalizes energy and carbon objectives, and real-world validation showing a 34.68% energy reduction relative to the Horizontal Pod Autoscaler (HPA) without sacrificing latency. This work demonstrates that energy can be treated as a first-class objective in telco cloud and edge operations, enabling greener data centers and networks.
Abstract
Future networks must meet stringent requirements while operating within tight energy and carbon constraints. Current autoscaling mechanisms remain workload-centric and infrastructure-siloed, and are largely unaware of their environmental impact. We present NeuroScaler, an AI-native, energy-efficient, and carbon-aware orchestrator for green cloud and edge networks. NeuroScaler aggregates multi-tier telemetry, from Power Distribution Units (PDUs) through bare-metal servers to virtualized infrastructure with containers managed by Kubernetes, using distinct energy and computing metrics at each tier. It supports several machine learning pipelines that link load, performance, and power. Within this unified observability layer, a model-predictive control policy optimizes energy use while meeting service-level objectives. In a real testbed with production-grade servers supporting real services, NeuroScaler reduces energy consumption by 34.68% compared to the Horizontal Pod Autoscaler (HPA) while maintaining target latency.
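To make the idea of a model-predictive control policy that trades energy against a latency objective concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear per-replica power model, an M/M/1 latency model with load split evenly across replicas, a short load forecast, and exhaustive search over replica plans. All constants, function names (`latency`, `step_energy`, `mpc_step`), and model choices are illustrative assumptions that do not come from the paper.

```python
from itertools import product

# All values below are illustrative assumptions, not measurements from the paper.
IDLE_W = 50.0     # assumed idle power per replica (watts)
DYN_W = 100.0     # assumed extra power per replica at full utilization (watts)
MU = 100.0        # assumed service rate per replica (requests/s)
SLO_S = 0.05      # assumed latency target (seconds)
SWITCH_J = 20.0   # assumed energy penalty per replica started or stopped (joules)
STEP_S = 60.0     # assumed control interval (seconds)


def latency(load, replicas):
    """Mean M/M/1 response time when load is split evenly across replicas."""
    per_replica = load / replicas
    if per_replica >= MU:
        return float("inf")  # queue is unstable, treat as an SLO violation
    return 1.0 / (MU - per_replica)


def step_energy(load, replicas):
    """Energy (joules) over one control interval under a linear power model."""
    utilization = load / (replicas * MU)
    return replicas * (IDLE_W + DYN_W * utilization) * STEP_S


def mpc_step(current, forecast, max_replicas=6):
    """Return the first action of the cheapest plan that meets the SLO.

    Enumerates every replica plan over the forecast horizon, discards plans
    that violate the latency target at any step, and scores the rest by
    energy plus a switching penalty. Only the first action is applied; the
    controller is expected to re-plan at the next interval.
    """
    best_cost, best_plan = float("inf"), None
    for plan in product(range(1, max_replicas + 1), repeat=len(forecast)):
        if any(latency(l, n) > SLO_S for l, n in zip(forecast, plan)):
            continue
        cost, prev = 0.0, current
        for l, n in zip(forecast, plan):
            cost += step_energy(l, n) + SWITCH_J * abs(n - prev)
            prev = n
        if cost < best_cost:
            best_cost, best_plan = cost, plan
    # If no plan is feasible, fall back to maximum capacity.
    return max_replicas if best_plan is None else best_plan[0]


if __name__ == "__main__":
    # Hypothetical forecast of requests/s for the next three intervals.
    print(mpc_step(current=1, forecast=[100.0, 250.0, 390.0]))
```

Exhaustive enumeration only works for short horizons and small replica ranges; a practical controller would likely use dynamic programming or a numerical solver, and would fit the power and latency models from telemetry instead of fixing them by hand.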
