SWE-Hub: A Unified Production System for Scalable, Executable Software Engineering Tasks

Yucheng Zeng, Shupeng Li, Daxiang Dong, Ruijie Xu, Zimo Chen, Liwei Zheng, Yuxuan Li, Zhe Zhou, Haotian Zhao, Lun Tian, Heng Xiao, Tianshu Zhu, Longkun Hao, Jianmin Wu

TL;DR

SWE-Hub is an end-to-end system that operationalizes the data factory abstraction by unifying environment automation, scalable synthesis, and diverse task generation into a single production stack. The resulting pipeline continuously delivers executable tasks across the entire software engineering lifecycle.

Abstract

Progress in software-engineering agents is increasingly constrained by the scarcity of executable, scalable, and realistic data for training and evaluation. This scarcity stems from three fundamental challenges in existing pipelines: environments are brittle and difficult to reproduce across languages; synthesizing realistic, system-level bugs at scale is computationally expensive; and existing data predominantly consists of short-horizon repairs, failing to capture long-horizon competencies like architectural consistency. We introduce SWE-Hub, an end-to-end system that operationalizes the data factory abstraction by unifying environment automation, scalable synthesis, and diverse task generation into a coherent production stack. At its foundation, the Env Agent establishes a shared execution substrate by automatically converting raw repository snapshots into reproducible, multi-language container environments with standardized interfaces. Built upon this substrate, the SWE-Scale engine addresses the need for high-throughput generation, combining cross-language code analysis with cluster-scale validation to synthesize massive volumes of localized bug-fix instances. The Bug Agent generates high-fidelity repair tasks by synthesizing system-level regressions involving cross-module dependencies, paired with user-like issue reports that describe observable symptoms rather than root causes. Finally, SWE-Architect expands the task scope from repair to creation by translating natural-language requirements into repository-scale build-a-repo tasks. By integrating these components, SWE-Hub establishes a unified production pipeline capable of continuously delivering executable tasks across the entire software engineering lifecycle.

Paper Structure (85 sections, 4 equations, 5 figures, 1 table)

Figures (5)

  • Figure 1: SWE-Hub architecture. Starting from raw source repositories, the Env Agent provisions a deterministic execution substrate—pinned container images, a unified verification entrypoint, and deterministic artifacts—to make code runnable and testable. On top of this substrate, three task generators produce complementary SWE workloads: SWE-Scale for high-throughput local repair edits, Bug Agent for realism-oriented repo-level regressions with user-like symptoms and no root-cause hints, and SWE-Architect for building multi-file implementations from structured specifications.
  • Figure 2: SWE-Hub Environment Layer pipeline. Phase 1 (Env Agent) performs environment provisioning and build readiness, identifying toolchains and installing dependencies to achieve environment readiness without requiring tests to pass. Phase 2 (Test Agent) establishes a unified verification interface via entrypoint discovery and result standardization, employing dynamic adapters to handle heterogeneous ecosystems where native reporting is non-standard. Phase 3 (Solidification and Verification Gate) encapsulates the environment state and verification logic into a pinned container image, applying a strict execution check to ensure the substrate is deterministic and reusable for downstream task generation.
  • Figure 3: SWE-Scale architecture. The system transforms an Environment-Ready Substrate into executable bug-fix instances. The pipeline follows a structured flow: the Shared Parsing Backbone performs polyglot entity analysis using Tree-sitter ASTs to identify targets; the Candidate Generator synthesizes fault specifications via procedural or LLM-assisted strategies; and the Kubernetes-Scale Verification Cluster materializes patches in isolated sandbox pods, executing tests and performing behavioral validity checks to produce verified task records.
  • Figure 4: Workflow of the Realism Layer in SWE-Hub. The pipeline initiates with Repo Analysis to extract call graphs and test baselines. The Bug Agent (center) then iteratively synthesizes system-level regressions that trigger non-local failures, validated by a Pass-to-Fail (P2F) signal. Finally, the Issue Agent (right) generates a user-like report from the verified regression, enforcing strict symptom-cause separation to prevent information leakage while adopting diverse personas.
  • Figure 5: SWE-Architect pipeline. (1) Mine a cohesive task scope from a real repository using coverage structure. (2) Create an initial repository state via code hollowing, and retain a golden reference patch. (3) Generate a structured requirement document via a two-agent reverse-engineering process (DOC Agent + API Agent). (4) Package the task with the pinned execution substrate and hidden tests in the unified task schema.
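
The Pass-to-Fail (P2F) signal mentioned in the Figure 4 caption can be illustrated with a minimal sketch. The page does not specify how test results are represented, so the dictionary format, the status labels, and the function names `pass_to_fail` and `is_valid_regression` are all assumptions made for illustration, not the authors' implementation. The idea shown is only that a candidate regression counts as validated when at least one test that passed on the clean repository fails after the change is injected.

```python
from typing import Dict, List

# Hypothetical status labels; the actual result format is not given in the source.
PASS, FAIL = "pass", "fail"


def pass_to_fail(baseline: Dict[str, str], mutated: Dict[str, str]) -> List[str]:
    """Return names of tests that pass on the clean repo but fail after the change.

    Tests missing from the mutated run are ignored here rather than counted
    as failures (an assumption of this sketch).
    """
    return sorted(
        name
        for name, status in baseline.items()
        if status == PASS and mutated.get(name) == FAIL
    )


def is_valid_regression(baseline: Dict[str, str], mutated: Dict[str, str]) -> bool:
    """A candidate is accepted only if it flips at least one passing test to failing."""
    return len(pass_to_fail(baseline, mutated)) > 0


# Example with made-up test names.
baseline = {"test_login": PASS, "test_export": PASS, "test_legacy": FAIL}
mutated = {"test_login": PASS, "test_export": FAIL, "test_legacy": FAIL}
flipped = pass_to_fail(baseline, mutated)
```

In this example only `test_export` flips from passing to failing; `test_legacy` was already failing at baseline, so it carries no signal about the injected change.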