sAirflow: Adopting Serverless in a Legacy Workflow Scheduler
Filip Mikina, Pawel Zuk, Krzysztof Rzadca
TL;DR
This work demonstrates how to migrate a legacy workflow system (Airflow) to a serverless architecture by using change data capture (CDC) to drive an event-based control plane and by deploying executors on Function-as-a-Service and Container-as-a-Service platforms. The sAirflow design preserves Airflow's interfaces while scaling horizontally within seconds and costing less than MWAA, a managed Airflow service. Experiments on synthetic and Alibaba-trace DAGs show that sAirflow scales well on parallel workloads and is cheaper to run, although CDC latency and container startup time add measurable overhead for some workflows. The work highlights practical challenges in serverless migrations and identifies CDC and serverless database integration as key areas for future improvement.
Abstract
Serverless clouds promise efficient scaling, less operational toil, and lower monetary costs. Yet making a complex legacy application serverless may require major refactoring and is therefore risky. As a case study, we use Airflow, an industry-standard workflow system. To reduce migration risk, we limit code modifications by relying on change data capture (CDC) and message queues for internal communication. To achieve serverless efficiency, we rely on Function-as-a-Service (FaaS). Our system, sAirflow, is the first adaptation of Airflow's control plane and workers to the serverless cloud, and it maintains the same interface and most of the code. Experimentally, we show that sAirflow delivers the key serverless benefits: scaling and cost reduction. We compare sAirflow to MWAA, a managed (SaaS) Airflow. On Alibaba benchmarks with warm systems, sAirflow performs similarly while halving the monetary cost. On highly parallel workflows with cold systems, sAirflow scales out to 125 workers in seconds, reducing makespan by 2x to 7x.
