Disentangling Causal Importance from Emergent Structure in Multi-Expert Orchestration

Sudipto Ghosh, Sujoy Nath, Sunny Manchanda, Tanmoy Chakraborty

TL;DR

This work introduces INFORM, an interpretability framework that treats multi-expert LLM orchestration as explicit computation, enabling decoupled analysis of interaction structure, execution order, and causal attribution. By contrasting relational routing mass with gradient-based intrinsic attribution, INFORM reveals that frequently routed experts are not always causally essential and that interaction hubs can drive structural dependencies. Across GSM8K, HumanEval, and MMLU, experiments show asynchronous emergence of routing confidence and centralization, task-dependent ordering, and robust responses to perturbations, with ablations confirming causal roles beyond accuracy. The framework demonstrates practical value for diagnosing brittle coordination, enabling principled pruning and efficiency gains without altering the underlying orchestration protocol.

Abstract

Multi-expert systems, where multiple Large Language Models (LLMs) collaborate to solve complex tasks, are increasingly adopted for high-performance reasoning and generation. However, the orchestration policies governing expert interaction and sequencing remain largely opaque. We introduce INFORM, an interpretability analysis that treats orchestration as an explicit, analyzable computation, enabling the decoupling of expert interaction structure, execution order, and causal attribution. We use INFORM to evaluate an orchestrator on GSM8K, HumanEval, and MMLU using a homogeneous consortium of ten instruction-tuned experts drawn from LLaMA-3.1 8B, Qwen-3 8B, and DeepSeek-R1 8B, with controlled decoding-temperature variation, and a secondary heterogeneous consortium spanning 1B-7B parameter models. Across tasks, routing dominance is a poor proxy for functional necessity. We reveal a divergence between relational importance, captured by routing mass and interaction topology, and intrinsic importance, measured via gradient-based causal attribution: frequently selected experts often act as interaction hubs with limited causal influence, while sparsely routed experts can be structurally critical. Orchestration behaviors emerge asynchronously, with expert centralization preceding stable routing confidence and expert ordering remaining non-deterministic. Targeted ablations show that masking intrinsically important experts induces disproportionate collapse in interaction structure compared to masking frequent peers, confirming that INFORM exposes causal and structural dependencies beyond accuracy metrics alone.
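
The contrast between relational and intrinsic importance can be illustrated with a small sketch. This is not the paper's implementation: the matrix layout (entry `collab[i][j]` as the weight with which expert `i` hands off to expert `j`), the function names, and every number below are assumptions made purely for illustration. The sketch ranks experts once by incoming routing mass and once by a made-up attribution score, then reports the per-expert rank difference.

```python
# Illustrative sketch only: the matrix layout and all scores are assumptions,
# not definitions or values taken from the paper.

def incoming_routing_mass(collab):
    """Column sums of a collaboration matrix, normalized to sum to 1.

    Assumes collab[i][j] is the weight with which expert i hands off to expert j.
    """
    n = len(collab)
    mass = [sum(collab[i][j] for i in range(n)) for j in range(n)]
    total = sum(mass)
    return [m / total for m in mass]


def ranks(scores):
    """Rank 0 is the highest score; ties are broken by expert index."""
    order = sorted(range(len(scores)), key=lambda k: (-scores[k], k))
    rank = [0] * len(scores)
    for position, expert in enumerate(order):
        rank[expert] = position
    return rank


def alignment_gap(routing_mass, grad_norms):
    """Rank by attribution minus rank by routing mass, per expert.

    A positive value marks an expert that is more frequently routed to
    than its attribution score would suggest.
    """
    r_route = ranks(routing_mass)
    r_grad = ranks(grad_norms)
    return [r_grad[k] - r_route[k] for k in range(len(routing_mass))]


# Toy example with three hypothetical experts.
collab = [
    [0.0, 0.8, 0.2],
    [0.1, 0.0, 0.9],
    [0.2, 0.8, 0.0],
]
grad_norms = [0.9, 0.1, 0.4]  # made-up attribution magnitudes

mass = incoming_routing_mass(collab)
gap = alignment_gap(mass, grad_norms)
```

In this toy setup the second expert receives the most routing mass but has the lowest attribution score, so its gap is positive; the first expert shows the opposite pattern. Real attribution scores would come from gradients through the orchestrator, which this sketch does not compute.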

Paper Structure (89 sections, 6 equations, 15 figures, 11 tables)

Figures (15)

  • Figure 1: Probing multi-expert orchestration with INFORM. The figure illustrates where insights are extracted during the inference process: (1) Probing the interaction module reveals the collaboration topology; (2) Analyzing the selection mechanism exposes ordering heuristics; and (3) Backpropagating decisions to expert representations isolates causal attribution distinct from observed routing frequency. The symbol // denotes a collection of outputs produced by multiple experts.
  • Figure 2: Evolution of the collaboration matrix $\mathbf{C}(x)$ during training, averaged over the test set. (a) At the start of training, the orchestrator explores random connections, resulting in a diffuse matrix. (b) By Epoch 5, distinct vertical bands appear. This structure indicates that the model has identified universal successors.
  • Figure 3: Testing what the orchestrator actually cares about. The plots show how routing entropy shifts when we damage the input. (a) On GSM8K, removing numbers (blue) causes the biggest reaction, confirming the model relies on numerical tokens. (b, c) On HumanEval and MMLU, shuffling sentences disrupts the model more, showing it depends on structural and semantic cues.
  • Figure 4: Intrinsic Expert Importance (Gradient Attribution). These heatmaps visualize which experts actually drive the orchestrator's decisions. Darker cells indicate experts with higher gradient norms, meaning that the orchestrator relies heavily on their internal representations. The sparse patterns reveal that only a small subset of experts is actually necessary.
  • Figure 5: Relational Importance (Incoming Routing Mass). These heatmaps show which experts are selected most often by others. Comparing them with Figure 4 reveals alignment gaps, i.e., experts that appear popular here (high routing mass) but were less important in Figure 4 (low gradient influence).
  • ...and 10 more figures
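
The routing-entropy shift described in the Figure 3 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's measurement code: it assumes the orchestrator exposes a probability distribution over experts for each input, uses Shannon entropy in nats, and the function names and both distributions are hypothetical.

```python
import math


def routing_entropy(probs):
    """Shannon entropy (in nats) of a routing distribution over experts."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_shift(clean_probs, perturbed_probs):
    """Entropy on the perturbed input minus entropy on the clean input.

    A positive shift means routing became less confident after the
    perturbation.
    """
    return routing_entropy(perturbed_probs) - routing_entropy(clean_probs)


# Hypothetical routing distributions over three experts.
clean = [0.7, 0.2, 0.1]        # confident routing on the original input
perturbed = [1 / 3, 1 / 3, 1 / 3]  # near-uniform routing after damaging the input

shift = entropy_shift(clean, perturbed)
```

Under these assumptions, a larger shift for one perturbation type than another (for example, removing numbers versus shuffling sentences) would indicate which input features the routing decision is most sensitive to.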