The GA4GH Task Execution API: Enabling Easy Multi Cloud Task Execution

Alexander Kanitz, Matthew H. McLoughlin, Liam Beckman, Venkat S. Malladi, Kyle P. Ellrott

TL;DR

The paper addresses the fragmentation of batch-task execution interfaces in genomics workflows. It proposes the GA4GH TES API, an OpenAPI-based standard that encodes each task as an atomic message specifying its environment, resources, inputs and outputs, and a sequence of executors, so that the same task can run on HPC/HTC, cloud, and hybrid systems. It describes the ecosystem around the standard: service providers, middleware, workflow engines, and the conformance tests that ensure interoperability. The standard enables data-local compute and multi-cloud federation, reducing integration effort and supporting scalable, portable genomic analyses.

Abstract

The Global Alliance for Genomics and Health (GA4GH) Task Execution Service (TES) API is a standardized schema and API for describing and executing batch tasks. It provides a common way to submit and manage tasks across a variety of compute environments, including on-premises High Performance Computing and High Throughput Computing (HPC/HTC) systems, cloud computing platforms, and hybrid environments. The TES API is designed to be flexible and extensible, allowing it to be adapted to a wide range of use cases, such as "bringing compute to the data" solutions for federated and distributed data analysis or load balancing across multi-cloud infrastructures. The API has been adopted by a number of service providers and is used by several workflow engines. Using its capabilities, genome research institutes are building hybrid compute systems to study life science.
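
To make the idea of a task as a single self-contained message more concrete, the following is a minimal sketch of how such a message might be assembled in Python. The field names (`inputs`, `outputs`, `resources`, `executors`, `cpu_cores`, `ram_gb`, `disk_gb`) follow the TES schema as we understand it and should be checked against the specification version in use. The helper `build_tes_task`, the container image, the bucket URLs, and the file paths are hypothetical and chosen only for illustration. Nothing is sent over the network.

```python
import json


def build_tes_task(name, image, commands, input_url, output_url,
                   cpu_cores=1, ram_gb=2.0, disk_gb=10.0):
    """Assemble a TES-style task message as a plain dict.

    `commands` is a list of argv lists; each becomes one executor,
    and executors are listed in the order they should run.
    NOTE: this helper is hypothetical, not part of any TES library.
    """
    executors = [
        {"image": image, "command": list(argv), "workdir": "/data"}
        for argv in commands
    ]
    return {
        "name": name,
        # Files staged into the task's environment before execution.
        "inputs": [
            {"url": input_url, "path": "/data/input.txt", "type": "FILE"}
        ],
        # Files copied out of the environment after execution.
        "outputs": [
            {"url": output_url, "path": "/data/output.txt", "type": "FILE"}
        ],
        # Requested compute resources.
        "resources": {
            "cpu_cores": cpu_cores,
            "ram_gb": ram_gb,
            "disk_gb": disk_gb,
        },
        "executors": executors,
    }


# Hypothetical example: sort a file, then checksum the sorted result.
task = build_tes_task(
    name="sort-and-checksum",
    image="alpine:3.19",
    commands=[
        ["sh", "-c", "sort /data/input.txt > /data/sorted.txt"],
        ["sh", "-c", "md5sum /data/sorted.txt > /data/output.txt"],
    ],
    input_url="s3://example-bucket/input.txt",    # assumed location
    output_url="s3://example-bucket/output.txt",  # assumed location
)

# The JSON body a client would send when creating the task.
payload = json.dumps(task, indent=2)
print(payload)
```

A client would typically send this JSON in an HTTP POST to the task-creation endpoint of a TES server and then poll the returned task identifier for its state. The exact endpoint path depends on the TES version and the deployment. Because the message carries everything needed to run the task, the same payload could in principle be sent unchanged to an HPC-backed, cloud-backed, or hybrid TES endpoint.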

Paper Structure (6 sections, 3 figures, 1 table)

This paper contains 6 sections, 3 figures, 1 table.
