Final Report for CHESS: Cloud, High-Performance Computing, and Edge for Science and Security

Nathan Tallent, Jan Strube, Luanzheng Guo, Hyungro Lee, Jesun Firoz, Sayan Ghosh, Bo Fang, Oceane Bel, Steven Spurgeon, Sarah Akers, Christina Doty, Erol Cromwell

TL;DR

The results and successes of CHESS are described from the perspective of open science.

Abstract

Automating the theory-experiment cycle requires effective distributed workflows that utilize a computing continuum spanning lab instruments, edge sensors, computing resources at multiple facilities, datasets distributed across multiple information sources, and potentially the cloud. Unfortunately, the obvious methods for constructing continuum platforms, orchestrating workflow tasks, and curating datasets over time fail to meet scientific requirements for performance, energy, security, and reliability. Furthermore, making the best use of continuum resources depends on the efficient composition and execution of workflow tasks, i.e., combinations of numerical solvers, data analytics, and machine learning. Pacific Northwest National Laboratory's LDRD "Cloud, High-Performance Computing (HPC), and Edge for Science and Security" (CHESS) has developed a set of interrelated capabilities for enabling distributed scientific workflows and curating datasets. This report describes the results and successes of CHESS from the perspective of open science.

Paper Structure

This paper contains 27 sections, 7 figures, 4 tables.

Figures (7)

  • Figure 1.1: Advancing collaborative computational science by automating distributed scientific systems on continuum platforms.
  • Figure 2.1: PNNL is building workflows that span the spectrum of compute environments, along with strategies for moving data and AI workloads between them. CHESS tests these strategies against use cases that require moving models between such environments.
  • Figure 2.2: CHESS's research targeted the co-design of distributed scientific systems for AI-enabled computational science.
  • Figure 2.3: CHESS's research challenge.
  • Figure 2.4: Overview of CHESS's science results, centered around (a) measurement, modeling, and prediction that enable (b) co-design of continuum science consisting of data-driven methods, models and simulations, and continuum platforms. Additionally, we explore (c) data management techniques with a view towards AI-aware data-driven science.
  • ...and 2 more figures