Scalable Cosmic AI Inference using Cloud Serverless Computing
Mills Staylor, Amirreza Dolatpour Fathkouhi, Md Khairul Islam, Kaleigh O'Hara, Ryan Ghiles Goudjil, Geoffrey Fox, Judy Fox
TL;DR
This paper tackles the bottleneck of scalable, cost-effective inference on massive astronomical image datasets by introducing CAI, a Cloud-based Astronomy Inference framework that uses AWS Lambda serverless computing to run a large foundation model (AstroMAE) for redshift prediction. CAI partitions the data and runs inference on the partitions in parallel, achieving near-linear scaling with dataset size, substantial speedups (e.g., 28 s for a 12.6 GB dataset), and high throughput (up to 18.04 billion bps) at a cost under $5 per experiment. The authors run experiments across personal laptops, HPC clusters, and the cloud, and extend them to 1 TB of data, demonstrating robust scalability and accessibility for the astronomy community. They also discuss limitations (e.g., Lambda memory and inter-function communication) and outline future work to integrate FMI and enhance high-performance communication between functions.
Abstract
Large-scale astronomical image data processing and prediction are essential for astronomers, providing crucial insights into celestial objects, the universe's history, and its evolution. While modern deep learning models offer high predictive accuracy, they often demand substantial computational resources, which limits their accessibility. We introduce the Cloud-based Astronomy Inference (CAI) framework to address these challenges. This scalable solution integrates pre-trained foundation models with serverless cloud infrastructure through a Function-as-a-Service (FaaS) model. CAI enables efficient and scalable inference on astronomical images without extensive hardware. Using a foundation model for redshift prediction as a case study, we conduct extensive experiments covering user devices, HPC (High-Performance Computing) servers, and the cloud. Redshift prediction with the AstroMAE model demonstrates CAI's scalability and efficiency: inference on a 12.6 GB dataset completes in only 28 seconds, compared to 140.8 seconds on HPC GPUs and 1793 seconds on HPC CPUs. CAI also achieves significantly higher throughput, reaching 18.04 billion bits per second (bps), and maintains near-constant inference times as data sizes increase, all at minimal computational cost (under $5 per experiment). We also process large-scale data up to 1 TB to show CAI's effectiveness at scale. CAI thus provides a highly scalable, accessible, and cost-effective inference solution for the astronomy community. The code is accessible at https://github.com/UVA-MLSys/AI-for-Astronomy.
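
The partition-and-parallelize pattern that CAI relies on can be illustrated with a minimal local sketch. This is not the authors' implementation: the names `partition`, `run_inference`, and `fan_out` are hypothetical, and a thread pool stands in for concurrent serverless function invocations. Each worker receives one contiguous slice of the dataset and returns only a small result, so the wall-clock time is governed by the largest slice and not by the total dataset size.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous, near-equal (start, stop) slices."""
    base, extra = divmod(n_items, n_workers)
    bounds, start = [], 0
    for i in range(n_workers):
        # The first `extra` workers take one additional item each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def run_inference(chunk):
    """Hypothetical stand-in for one function invocation.

    A real worker would load its slice of images and run the model;
    here it only reports how many items it was assigned.
    """
    start, stop = chunk
    return stop - start


def fan_out(n_items, n_workers):
    """Dispatch every slice concurrently and collect per-worker results in order."""
    chunks = partition(n_items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(run_inference, chunks))


if __name__ == "__main__":
    print(partition(10, 3))
    print(fan_out(10, 3))
```

In this sketch, growing the dataset while growing the number of workers proportionally keeps each slice the same size, which is the intuition behind inference time staying nearly constant as data size increases.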
