Everywhere & Nowhere: Envisioning a Computing Continuum for Science

Manish Parashar

TL;DR

This paper proposes a computing continuum that unifies edge, network, and HPC/cloud resources to enable end-to-end science workflows, supported by data-driven programming abstractions and autonomic middleware. It introduces approaches such as the Associative Rendezvous-based R-Pulsar system for data-driven computation, constrained autonomic federation for online resource discovery, CKAT-based data recommendations, and autonomic runtimes to manage execution with QoS across heterogeneous resources, while highlighting translational research and policy frameworks. Urgent computing is presented as a primary use case, illustrated by the COVID-19 HPC Consortium, wildfire and air-quality applications, and Early Earthquake Warning, with calls for scalable, resilient infrastructure and governance via initiatives like the NSCR. The work emphasizes the need to translate foundational advances into practice to improve time-to-science, resilience, and energy efficiency in real-world emergencies, leveraging flexible data precision (e.g., 8-, 16-, and 64-bit formats) and seamless service-oriented deployment.

Abstract

Emerging data-driven scientific workflows seek to leverage distributed data sources to understand end-to-end phenomena, drive experimentation, and facilitate important decision-making. Despite the exponential growth of available digital data sources at the edge, and the ubiquity of nontrivial computational power for processing this data, realizing such science workflows remains challenging. This paper explores a computing continuum that is everywhere and nowhere -- one spanning resources at the edges, in the core, and in between, and providing abstractions that can be harnessed to support science. It also introduces recent research in programming abstractions that can express what data should be processed and when and where it should be processed, and autonomic middleware services that automate the discovery of resources and the orchestration of computations across these resources.

Paper Structure (10 sections, 3 figures)

This paper contains 10 sections, 3 figures.

Figures (3)

  • Figure 1: A computing continuum across the evolving science ecosystem spanning large-scale instruments, experimental facilities, observatories, and sensor networks, all streaming data; high-speed networks and advanced network services; and a range of computing capabilities and capacities along the continuum, from edge, to in-network, to large-scale data centers.
  • Figure 2: The urgent computing workflow uses the computing continuum to process data from a range of data sources, along with other resources and services along the continuum, to detect events, develop a response, and trigger actions.
  • Figure 3: A call to action: We are witnessing urgent events with increasing frequency and increasing impacts, and the computing continuum, along with urgent computing, has the potential to help us understand, manage, and mitigate the impacts of these events. The HPC community has an opportunity to collectively leverage its expertise and the computing continuum to make a difference.