Zenix: Efficient Execution of Bulky Serverless Applications
Zhiyuan Guo, Zachary Blanco, Junda Chen, Jinmou Li, Zerui Wei, Bili Dong, Ishaan Pota, Mohammad Shahrad, Harry Xu, Yiying Zhang
TL;DR
Zenix addresses the inefficiency of running bulky workloads on traditional function-centric serverless platforms by introducing a resource-centric model that adaptively allocates CPU and memory at the invocation level. It relies on offline profiling and a resource graph to capture fine-grained resource features, enabling proactive and reactive scheduling, memory placement, and remote-access optimizations. The system combines a two-level scheduler (global and rack-level), a memory-aware executor, RDMA/TCP data paths, and history-based resource adjustment to reduce resource consumption by up to 90% and improve performance by up to 64% across data analytics, video processing, and ML tasks. This work demonstrates that bulky applications can run efficiently in serverless environments, reducing waste and improving responsiveness, with public availability planned.
Abstract
Serverless computing, commonly offered as Function-as-a-Service, was initially designed for small, lean applications. However, there has been an increasing desire to run larger, more complex applications (what we call bulky applications) in a serverless manner. Existing strategies for enabling such applications either increase function sizes or rewrite applications as DAGs of functions. These approaches cause significant resource wastage, manual effort, and/or performance overhead. We argue that the root cause of these issues is today's function-centric serverless model, where a function is the unit of resource allocation and scaling. We propose a new, resource-centric serverless-computing model for executing bulky applications in a resource- and performance-efficient way, and we build the Zenix serverless platform following this model. Our results show that Zenix reduces resource consumption by up to 90% compared to today's function-centric serverless systems, while improving performance by up to 64%.
