Infrastructure Engineering: A Still Missing, Undervalued Role in the Research Ecosystem
Vanessa Sochat
TL;DR
Modern science depends on software, yet core research infrastructure (compilers, orchestration, developer environments, containers, and workflow managers) lacks a dedicated workforce. The paper analyzes the incentives of academia and computing providers, reviews the history of container and workflow technologies, and argues that proactive infrastructure engineers are needed to enable converged computing across cloud and HPC. It outlines challenges (paradigms, transparency, community participation, developer environments) and presents possible futures centered on workload reproducibility, performance, and portability, with explicit actions for readers. The work highlights the practical impact of investing in infrastructure engineering to improve interoperability, reproducibility, and efficiency in scientific computing.
Abstract
Research has become increasingly reliant on software, which serves as the driving force behind bioinformatics, high performance computing, physics, machine learning, and artificial intelligence, to name a few. While substantial progress has been made in advocating for the research software engineer, a kind of software engineer who typically works directly on the software and associated assets that go into research, little attention has been paid to the workforce behind research infrastructure and innovation, namely compiler and compatibility tool development, orchestration and scheduling infrastructure, developer environments, container technologies, and workflow managers. As economic incentives move toward different models of cloud computing, and innovation is required to develop new paradigms that represent the best of both the cloud and high performance computing worlds, an effort called "converged computing," the need for such a role is not just ideal but essential for the continued success of science. While scattered staff in non-traditional roles have found time to work on some facets of this space, the lack of a larger workforce, and of incentives to support one, has led to the scientific community falling behind. In this article we highlight the importance of this missing layer, providing examples of how the missing role of infrastructure engineer has led to inefficiencies in the interoperability, portability, and reproducibility of science. We suggest that an inability to allocate, provide resources for, and sustain individuals to work explicitly on these technologies could lead to possible futures that are sub-optimal for the continued success of our scientific communities.
