Wilkins: HPC In Situ Workflows Made Easy
Orcun Yildiz, Dmitriy Morozov, Arnur Nigmetov, Bogdan Nicolae, Tom Peterka
TL;DR
Wilkins is an in situ HPC workflow system designed for ease of use and scalability. It combines a data-centric YAML workflow description with a LowFive-based, HDF5-backed data transport to couple heterogeneous tasks without modifying user codes, and it adds a flow-control mechanism to accommodate tasks with disparate data rates. The system supports ensembles, various workflow topologies, and custom actions via external Python callbacks, and it is demonstrated with synthetic benchmarks and use cases in materials science and cosmology. Results show negligible overhead relative to standalone data transport, significant speedups from flow-control strategies, and scalable ensemble execution, which makes Wilkins practical for complex, data-intensive in situ workflows.
Abstract
In situ approaches can accelerate the pace of scientific discoveries by allowing scientists to perform data analysis at simulation time. Current in situ workflow systems, however, face challenges in handling the growing complexity and diverse computational requirements of scientific tasks. In this work, we present Wilkins, an in situ workflow system that is designed for ease of use while providing scalable and efficient execution of workflow tasks. Wilkins provides a flexible workflow description interface, employs a high-performance data transport layer based on HDF5, and supports tasks with disparate data rates by providing a flow control mechanism. Wilkins seamlessly couples scientific tasks that already use HDF5, without requiring task code modifications. We demonstrate these features using both synthetic benchmarks and two science use cases in materials science and cosmology.
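To make the flow-control idea concrete, the following is a minimal Python sketch and not Wilkins' actual API. It models a fast producer coupled to a slower consumer, where a hypothetical policy forwards only every N-th step so that the producer does not wait on the consumer at every output. The names `forward_every_nth` and `simulate`, and the unit-cost timing model in which the producer blocks for the full consumer cost on each forwarded step, are assumptions introduced purely for illustration.

```python
def forward_every_nth(step: int, n: int) -> bool:
    """Hypothetical policy: forward a step only if its index is a multiple of n."""
    return step % n == 0


def simulate(num_steps: int, produce_cost: int, consume_cost: int, n: int) -> int:
    """Return the producer's total elapsed time in a toy blocking model.

    Assumption: each produced step costs `produce_cost` time units, and the
    producer waits `consume_cost` units whenever a step is forwarded.
    """
    elapsed = 0
    for step in range(num_steps):
        elapsed += produce_cost          # time to compute this step
        if forward_every_nth(step, n):
            elapsed += consume_cost      # producer waits on the slow consumer
    return elapsed


if __name__ == "__main__":
    # Forwarding every step versus every fifth step, with a consumer 5x slower.
    print(simulate(num_steps=10, produce_cost=1, consume_cost=5, n=1))
    print(simulate(num_steps=10, produce_cost=1, consume_cost=5, n=5))
```

In this toy model, forwarding fewer steps shortens the time the producer spends waiting, which is the kind of trade-off between analysis frequency and simulation throughput that a flow-control mechanism exposes.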
