Connecting Large Language Model Agent to High Performance Computing Resource
Heng Ma, Alexander Brace, Carlo Siebenschuh, Greg Pauloski, Ian Foster, Arvind Ramanathan
TL;DR
The paper addresses the challenge of enabling Large Language Model (LLM) agent workflows to harness high-performance computing (HPC) resources for computationally intensive scientific tasks. It introduces Parsl as a bridge between LangGraph/LangChain-based LLM tool calls and HPC execution, and evaluates two integration schemes: a Parsl tool node that parallelizes individual tool calls and a Parsl ensemble function that runs a reservoir of simulations. Through molecular dynamics (MD) workflows using OpenMM and data retrieval via PDB, the study demonstrates concurrent execution on both a local GPU workstation and Polaris/ALCF, highlighting performance, scalability, and queue-time considerations on HPC systems. The findings show that while Parsl enables efficient parallelism and reduces developer burden, HPC constraints and LLM tool-call limits require careful architectural choices, such as running expensive tasks as ensemble functions. This work provides a practical pathway for AI-driven scientific workflows to leverage HPC resources effectively, informing design decisions for task orchestration, tool function design, and future extensions like AI-assisted sampling and broader HPC platform support.
Abstract
The Large Language Model (LLM) agent workflow enables the LLM to invoke tool functions, improving its performance on domain-specific scientific questions. Tackling scientific research at scale requires access to computing resources and a parallel computing setup. In this work, we integrated Parsl into the LangChain/LangGraph tool-call setup to bridge the gap between the LLM agent and computing resources. Two tool-call implementations were set up and tested on both a local workstation and the HPC environment on Polaris/ALCF. The first implementation, a Parsl-enabled LangChain tool node, submits tool functions concurrently to Parsl workers for parallel execution. The second converts the tool functions into Parsl ensemble functions and is better suited to large tasks in a supercomputer environment. The LLM agent workflow was prompted to run molecular dynamics simulations with different protein structures and simulation conditions. The results showed that the LLM agent tools were managed and executed concurrently by Parsl on the available computing resources.
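The concurrent dispatch pattern behind the first implementation can be illustrated with a minimal sketch. This is not the authors' code: it uses Python's standard-library `ThreadPoolExecutor` as a stand-in for Parsl workers (Parsl apps likewise return future objects that are resolved with `.result()`), and the tool function `run_md_simulation`, its parameters, the helper `dispatch_tool_calls`, and the tool-call dictionary layout are all hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_md_simulation(pdb_id: str, temperature_k: float) -> dict:
    """Hypothetical tool function; a real one would launch an OpenMM run."""
    # Placeholder result so the sketch stays self-contained and fast.
    return {"pdb_id": pdb_id, "temperature_k": temperature_k, "status": "done"}


# Registry mapping tool names (as an LLM would emit them) to callables.
TOOLS = {"run_md_simulation": run_md_simulation}


def dispatch_tool_calls(tool_calls, executor):
    """Submit every requested tool call at once, then gather the results.

    Submitting all calls before waiting on any of them is what lets the
    executor (standing in for Parsl workers here) run them concurrently.
    """
    futures = [
        executor.submit(TOOLS[call["name"]], **call["args"])
        for call in tool_calls
    ]
    # Results are collected in the same order the calls were requested.
    return [future.result() for future in futures]


# Assumed shape of the tool calls produced by the LLM in one turn.
tool_calls = [
    {"name": "run_md_simulation", "args": {"pdb_id": "1UBQ", "temperature_k": 300.0}},
    {"name": "run_md_simulation", "args": {"pdb_id": "1UBQ", "temperature_k": 310.0}},
]

with ThreadPoolExecutor(max_workers=2) as executor:
    results = dispatch_tool_calls(tool_calls, executor)

for result in results:
    print(result)
```

In a Parsl-based setup, the executor would presumably be replaced by a loaded Parsl configuration and the tool functions by Parsl apps, so that the same submit-then-gather structure targets workstation GPUs or HPC nodes instead of local threads.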
