The Landscape and Challenges of HPC Research and LLMs
Le Chen, Nesreen K. Ahmed, Akash Dutta, Arijit Bhattacharjee, Sixing Yu, Quazi Ishtiaque Mahmud, Waqwoya Abebe, Hung Phan, Aishwarya Sarkar, Branden Butler, Niranjan Hasabnis, Gal Oren, Vy A. Vo, Juan Pablo Munoz, Theodore L. Willke, Tim Mattson, Ali Jannesari
TL;DR
The paper investigates the potential of applying large language models (LLMs) to high-performance computing (HPC) tasks, framing a landscape of opportunities and challenges. It surveys pathways including code representations (notably IR-based), multimodal fusion with runtime data, parallel code generation, and natural language programming tailored to HPC, supported by a review of current code LLMs. The authors identify critical gaps in data, representations, and evaluation while presenting a case study of mutual benefits between LLMs and HPC and discussing how HPC can accelerate LLM training and inference. The work underscores practical implications for HPC performance optimization, development efficiency, and industry adoption, highlighting the need for collaboration across LLM and HPC communities to realize scalable, reliable HPC-LLM systems.
Abstract
Recently, language models (LMs), especially large language models (LLMs), have revolutionized the field of deep learning. Both encoder-decoder models and prompt-based techniques have shown immense potential for natural language processing and code-based tasks. Over the past several years, many research labs and institutions have invested heavily in high-performance computing (HPC), approaching or breaching exascale performance levels. In this paper, we posit that adapting and utilizing such language model-based techniques for HPC tasks would be very beneficial. This study presents the reasoning behind this position and highlights how existing ideas can be improved and adapted for HPC tasks.
