Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation

Humza Sami, Mubashir ul Islam, Samy Charas, Asav Gandhi, Pierre-Emmanuel Gaillardon, Valerio Tenace

TL;DR

Nexus is a lightweight, open-source framework for building LLM-based multi-agent systems with a flexible multi-supervisor hierarchy and YAML-based workflows. It reports state-of-the-art performance across coding, math reasoning, and EDA timing-closure tasks, using self-verifying workflows and external tools to improve accuracy and efficiency. The work highlights practical impact for rapid, domain-agnostic automation and reproducible AI-assisted engineering pipelines, supported by open-source tooling and structured agent collaboration. Overall, Nexus shows that low-code, modular MAS architectures can scale across diverse domains while delivering measurable gains in accuracy, efficiency, and power or resource usage.

Abstract

Recent advancements in Large Language Models (LLMs) have substantially evolved Multi-Agent Systems (MASs) capabilities, enabling systems that not only automate tasks but also leverage near-human reasoning capabilities. To achieve this, LLM-based MASs need to be built around two critical principles: (i) a robust architecture that fully exploits LLM potential for specific tasks -- or related task sets -- and (ii) an effective methodology for equipping LLMs with the necessary capabilities to perform tasks and manage information efficiently. It goes without saying that a priori architectural designs can limit the scalability and domain adaptability of a given MAS. To address these challenges, in this paper we introduce Nexus: a lightweight Python framework designed to easily build and manage LLM-based MASs. Nexus introduces the following innovations: (i) a flexible multi-supervisor hierarchy, (ii) a simplified workflow design, and (iii) easy installation and open-source flexibility: Nexus can be installed via pip and is distributed under a permissive open-source license, allowing users to freely modify and extend its capabilities. Experimental results demonstrate that architectures built with Nexus exhibit state-of-the-art performance across diverse domains. In coding tasks, Nexus-driven MASs achieve a 99% pass rate on HumanEval and a flawless 100% on VerilogEval-Human, outperforming cutting-edge reasoning language models such as o3-mini and DeepSeek-R1. Moreover, these architectures display robust proficiency in complex reasoning and mathematical problem solving, achieving correct solutions for all randomly selected problems from the MATH dataset. In the realm of multi-objective optimization, Nexus-based architectures successfully address challenging timing closure tasks on designs from the VTR benchmark suite, while guaranteeing, on average, a power saving of nearly 30%.

Paper Structure

This paper contains 58 sections, 7 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Evolution of Multi-Agent System Architectures: a) Traditional MAS Architecture, where agents interact with their environment through observations and actions; b) ReAct Architecture, an innovative agent design that incorporates advanced reasoning capabilities; and c) LLM-Based MAS Architecture, a cutting-edge approach leveraging LLMs for reasoning and decision-making.
  • Figure 2: Overview of the Nexus architecture. A root Supervisor receives user prompts and decides whether to finalize the solution or delegate its execution. Tasks of moderate complexity can be handled by specialized Worker agents, while particularly intricate tasks can be coordinated by intermediate Task Supervisors. Memory maintains a synchronized record of partial outputs and relevant context. Circled markers denote the three main loops that are entailed in the proposed workflow.
  • Figure 3: Unified Nexus-based MAS architecture for solving code-related tasks.
  • Figure 4: Comparison of reasoning models and the Nexus self-verifying workflow for code-related tasks.
  • Figure 5: Proposed Nexus-based MAS architecture for solving problems from the MATH dataset.
  • ...and 1 more figure
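
The hierarchy described in the Figure 2 caption (a root Supervisor, intermediate Task Supervisors, Worker agents, and a shared Memory) can be pictured with a small Python sketch. This is not the Nexus API: the class names `Memory`, `Worker`, and `Supervisor`, the `route` callback, and the lambda handlers standing in for LLM calls are all assumptions made for illustration. The sketch also covers only the delegation path, not the finalize decision or the three loops mentioned in the caption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class Memory:
    """Shared record of partial outputs, visible to every agent (hypothetical)."""
    entries: List[str] = field(default_factory=list)

    def write(self, author: str, text: str) -> None:
        self.entries.append(f"{author}: {text}")


@dataclass
class Worker:
    """Leaf agent; `handler` stands in for an LLM or tool call (assumption)."""
    name: str
    handler: Callable[[str], str]

    def run(self, task: str, memory: Memory) -> str:
        result = self.handler(task)
        memory.write(self.name, result)
        return result


@dataclass
class Supervisor:
    """Delegates a task to one child, which may be a Worker or another Supervisor."""
    name: str
    children: Dict[str, Union[Worker, "Supervisor"]]
    route: Callable[[str], str]  # hypothetical: maps a task to a child's key

    def run(self, task: str, memory: Memory) -> str:
        key = self.route(task)
        memory.write(self.name, f"delegating to {key}")
        return self.children[key].run(task, memory)


# Two-level hierarchy: root -> task supervisor -> workers.
coder = Worker("coder", lambda t: f"code for: {t}")
tester = Worker("tester", lambda t: f"tests for: {t}")
task_supervisor = Supervisor(
    "task_supervisor",
    {"coder": coder, "tester": tester},
    route=lambda t: "tester" if "test" in t else "coder",
)
writer = Worker("writer", lambda t: f"answer: {t}")
root = Supervisor(
    "root",
    {"coding": task_supervisor, "writer": writer},
    route=lambda t: "coding" if "function" in t else "writer",
)

memory = Memory()
result = root.run("write a function that adds two numbers", memory)
for line in memory.entries:
    print(line)
```

Because a `Supervisor` accepts other supervisors as children, the same structure nests to any depth, which is one way to read the "multi-supervisor hierarchy" in the abstract. In a real system the routing rule would be an LLM decision rather than a keyword check.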