The Society of HiveMind: Multi-Agent Optimization of Foundation Model Swarms to Unlock the Potential of Collective Intelligence

Noah Mamie, Susie Xi Rao

TL;DR

The paper introduces the Society of HiveMind (SOHM), a graph-based framework that coordinates multiple foundation models as a swarm to enhance collective intelligence. It investigates two optimization paradigms, Darwinian (gradient-based and evolutionary) and Lamarckian (task-conditioned, experience-driven) adaptation, to learn effective inter-agent communication topologies. On the MMLU and MMLU-Pro benchmarks, SOHM yields significant gains on tasks requiring logical reasoning, particularly when employing specialist roles and Lamarckian adaptation, while showing limited improvements on knowledge-heavy tasks. The work demonstrates that intelligent orchestration of diverse foundation models can rival larger backbones, offering a path toward scalable, self-improving AI swarms with open-source reproducibility.

Abstract

Multi-agent systems address issues of accessibility and scalability of artificial intelligence (AI) foundation models, which are often represented by large language models. We develop a framework - the "Society of HiveMind" (SOHM) - that orchestrates the interaction between multiple AI foundation models, imitating the observed behavior of animal swarms in nature by following modern evolutionary theories. On the one hand, we find that the SOHM provides a negligible benefit on tasks that mainly require real-world knowledge. On the other hand, we observe a significant improvement on tasks that require intensive logical reasoning, indicating that multi-agent systems are capable of increasing the reasoning capabilities of the collective compared to the individual agents. Our findings demonstrate the potential of combining a multitude of diverse AI foundation models to form an artificial swarm intelligence capable of self-improvement through interactions with a given environment.

Paper Structure

This paper contains 22 sections, 3 equations, 5 figures, 5 tables, 2 algorithms.

Figures (5)

  • Figure 1: Two-point crossover generating two offspring from Parent 1 and Parent 2.
  • Figure 2: The HiveMind framework consists of setup, swarm and optimization phases. The main difference between HiveMind-D and HiveMind-L is that the latter conditions the graph sampling step on the task-specific encoding $\tau$.
  • Figure 3: Left: The attention mechanism $a({\boldsymbol{\Theta}}\boldsymbol{h}_v, {\boldsymbol{\Theta}}\boldsymbol{h}_u)$ employed by the SOHM, parametrized by a weight vector ${\boldsymbol{a}} \in \mathbb{R}^{2F'}$, applying a LeakyReLU activation. Right: The multi-head attention (featuring 3 heads) of node 1 on its neighborhood. Different arrow styles and colors denote independent attention computations. The aggregated features from each head are concatenated or averaged to obtain $\boldsymbol{h}_1'$. The illustrations are based on Veličković et al. (2017), adapting the notation slightly.
  • Figure 4: Evolution of the probability distribution for communication links in the swarm using a Genetic Algorithm (from left to right for generations 1, 30, 50). Darker colors indicate lower probability values and lighter colors higher probability values. Adjacency matrix indices represent specific agent nodes, where self-loops are masked and node 6 (final decision) only features incoming edges.
  • Figure 5: Evolution of the probability distribution for communication links in the swarm using policy gradient (PG) optimization with a parametrized baseline (from left to right and top to bottom for epochs 1, 20, 40, 60, 80, 100). Darker colors indicate lower probability values and lighter colors higher probability values. Adjacency matrix indices represent specific agent nodes, where self-loops are masked and node 6 (final decision) only features incoming edges.
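
The two-point crossover named in the Figure 1 caption can be illustrated with a minimal sketch. The genome encoding is an assumption made only for this example (a flat list, such as the flattened entries of an adjacency matrix); the function name `two_point_crossover` and its parameters are hypothetical and are not taken from the paper's code.

```python
import random
from typing import List, Optional, Sequence, Tuple


def two_point_crossover(
    parent1: Sequence[float],
    parent2: Sequence[float],
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], List[float]]:
    """Swap the segment between two cut points to produce two offspring.

    Assumption: both parents are flat genomes of equal length (for example,
    flattened edge probabilities). This encoding is illustrative only.
    """
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    n = len(parent1)
    if n < 3:
        raise ValueError("need at least 3 genes to place two distinct cut points")
    rng = rng or random.Random()

    # Two distinct interior cut points with 1 <= i < j <= n - 1, so the
    # head, the swapped middle segment, and the tail are all non-empty.
    i, j = sorted(rng.sample(range(1, n), 2))

    p1, p2 = list(parent1), list(parent2)
    child1 = p1[:i] + p2[i:j] + p1[j:]
    child2 = p2[:i] + p1[i:j] + p2[j:]
    return child1, child2


if __name__ == "__main__":
    a = [0.0] * 8
    b = [1.0] * 8
    c1, c2 = two_point_crossover(a, b, random.Random(0))
    print(c1)
    print(c2)
```

Each offspring keeps the head and tail of one parent and inherits the middle segment from the other, so every gene position is covered by exactly one parent per child.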
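
The attention mechanism in the Figure 3 caption appears to follow the standard graph attention formulation of the cited work. As a sketch in the caption's notation, and assuming symbols that do not appear in the caption (the neighborhood $\mathcal{N}(v)$, the concatenation operator $\Vert$, and a nonlinearity $\sigma$), one attention head could be written as:

```latex
\begin{align}
e_{vu} &= \mathrm{LeakyReLU}\!\left(\boldsymbol{a}^{\top}\left[\boldsymbol{\Theta}\boldsymbol{h}_v \,\Vert\, \boldsymbol{\Theta}\boldsymbol{h}_u\right]\right), \\
\alpha_{vu} &= \frac{\exp(e_{vu})}{\sum_{w \in \mathcal{N}(v)} \exp(e_{vw})}, \\
\boldsymbol{h}_v' &= \sigma\!\left(\sum_{u \in \mathcal{N}(v)} \alpha_{vu}\, \boldsymbol{\Theta}\boldsymbol{h}_u\right).
\end{align}
```

Here $\boldsymbol{\Theta}\boldsymbol{h}_v$ and $\boldsymbol{\Theta}\boldsymbol{h}_u$ each have $F'$ entries, which is why the weight vector $\boldsymbol{a}$ lives in $\mathbb{R}^{2F'}$. With several heads, each head computes its own coefficients $\alpha_{vu}$, and the per-head outputs are concatenated or averaged, as the caption states for $\boldsymbol{h}_1'$.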