TRINITY: An Evolved LLM Coordinator

Jinglue Xu; Qi Sun; Peter Schwendeman; Stefan Nielsen; Edoardo Cetin; Yujin Tang

TL;DR

This work introduces Trinity, a lightweight coordinator that orchestrates multiple diverse LLMs without weight merging, using hidden-state signals from a compact 0.6B-parameter small language model (SLM) and a 10K-parameter head. At each turn of a multi-turn interaction, the coordinator assigns one of three roles (Thinker, Worker, or Verifier) to a selected LLM. The coordinator is trained with separable CMA-ES, which can exploit block-epsilon-separability under tight evaluation budgets. Trinity achieves state-of-the-art results on LiveCodeBench and shows strong zero-shot generalization to unseen tasks; analyses of representation separability and objective separability support these results. The approach suggests a scalable path for collaborative AI systems: engineering effective, budget-conscious model coordination rather than pursuing further monolithic scaling.

Abstract

Combining diverse foundation models is promising, but weight-merging is limited by mismatched architectures and closed APIs. Trinity addresses this with a lightweight coordinator that orchestrates collaboration among large language models (LLMs). The coordinator, comprising a compact language model (approximately $0.6$B parameters) and a lightweight head (approximately $10$K parameters), is optimized with an evolutionary strategy for efficient and adaptive delegation. Trinity processes queries over multiple turns, where at each turn the coordinator assigns one of three roles (Thinker, Worker, or Verifier) to a selected LLM, effectively offloading complex skill acquisition from the coordinator itself. Experiments show that Trinity consistently outperforms individual models and existing methods across coding, math, reasoning, and domain knowledge tasks, and generalizes robustly to out-of-distribution tasks. On standard benchmarks, Trinity achieves state-of-the-art results, including a score of 86.2% on LiveCodeBench. Theoretical and empirical analyses identify two main factors behind this performance: (1) the coordinator's hidden-state representations provide rich contextualization of inputs, and (2) under high dimensionality and strict budget constraints, the separable Covariance Matrix Adaptation Evolution Strategy offers advantages over reinforcement learning, imitation learning, and random search by exploiting potential block-epsilon-separability.
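To make the delegation mechanism concrete, the following is a minimal sketch of the idea, not the authors' implementation. It assumes a linear head that maps a hidden-state vector to logits over candidate models and the three roles, and it tunes that head with a simplified diagonal-Gaussian evolutionary search (a cross-entropy-style stand-in, not full separable CMA-ES). The toy sizes, the synthetic queries, the fitness function, and all function names are hypothetical; the multi-turn loop and real LLM calls are omitted.

```python
import math
import random

ROLES = ("Thinker", "Worker", "Verifier")
N_MODELS, HIDDEN_DIM = 3, 4            # toy sizes, chosen only for illustration
N_OUT = N_MODELS + len(ROLES)          # one logit per model plus one per role
N_PARAMS = N_OUT * HIDDEN_DIM          # flat parameter vector of the linear head


def choose(params, hidden):
    """Apply the linear head to a hidden state; argmax picks a model and a role."""
    logits = [
        sum(params[k * HIDDEN_DIM + j] * hidden[j] for j in range(HIDDEN_DIM))
        for k in range(N_OUT)
    ]
    model = max(range(N_MODELS), key=lambda k: logits[k])
    role_idx = max(range(len(ROLES)), key=lambda k: logits[N_MODELS + k])
    return model, ROLES[role_idx]


def make_toy_queries(n, seed=0):
    """Synthetic (hidden_state, best_model) pairs standing in for real queries.

    Assumption: the "best" model is the index of the largest of the first
    N_MODELS hidden coordinates, so a linear head can in principle learn it.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        h = [rng.gauss(0.0, 1.0) for _ in range(HIDDEN_DIM)]
        data.append((h, max(range(N_MODELS), key=lambda k: h[k])))
    return data


def fitness(params, data):
    """Fraction of toy queries routed to the labeled model (roles are not scored)."""
    return sum(choose(params, h)[0] == y for h, y in data) / len(data)


def diagonal_search(objective, dim, generations=30, popsize=16, seed=1):
    """Simplified evolutionary search with a per-coordinate (diagonal) Gaussian."""
    rng = random.Random(seed)
    mean, sigma = [0.0] * dim, [1.0] * dim
    best, best_fit = list(mean), objective(mean)   # start from the all-zero head
    n_elite = popsize // 4
    for _ in range(generations):
        pop = [[m + s * rng.gauss(0.0, 1.0) for m, s in zip(mean, sigma)]
               for _ in range(popsize)]
        scored = sorted(((objective(x), x) for x in pop),
                        key=lambda t: t[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best = scored[0][0], list(scored[0][1])
        elite = [x for _, x in scored[:n_elite]]
        # Refit the mean and per-coordinate standard deviation to the elite samples.
        mean = [sum(x[i] for x in elite) / n_elite for i in range(dim)]
        sigma = [
            max(0.05, math.sqrt(sum((x[i] - mean[i]) ** 2 for x in elite) / n_elite))
            for i in range(dim)
        ]
    return best, best_fit


DATA = make_toy_queries(100)
BASELINE = fitness([0.0] * N_PARAMS, DATA)   # an all-zero head always picks model 0
BEST_PARAMS, BEST_FIT = diagonal_search(lambda p: fitness(p, DATA), N_PARAMS)
```

The search only needs a scalar score per candidate parameter vector, which is why a gradient-free method fits a setting where the underlying LLMs are reachable only through closed APIs. Keeping a per-coordinate variance instead of a full covariance matrix keeps the search state linear in the number of head parameters, which is the property the separable variant relies on.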
