SMAGDi: Socratic Multi Agent Interaction Graph Distillation for Efficient High Accuracy Reasoning

Aayush Aluru, Myra Malik, Samarth Patankar, Spencer Kim, Kevin Zhu, Sean O'Brien, Vasu Sharma

TL;DR

SMAGDi tackles the cost of multi-agent reasoning by distilling a 40B five-agent MAS into a 6B decomposer-solver using Socratic Chain-of-Thought and interaction-graph distillation. It represents MAS debates as directed graphs with nodes as reasoning steps and edges capturing continuity and cross-agent influence, and trains the student with a composite loss combining language modeling, graph supervision, contrastive reasoning, and embedding alignment. On StrategyQA and MMLU, SMAGDi preserves about 88% of the teacher’s accuracy while reducing parameters by roughly 7x, outperforming MAGDi, standard KD, and a fine-tuned baseline. This approach demonstrates that structured, Socratic reasoning transferred via interaction graphs enables compact models to emulate MAS accuracy with deployment-ready efficiency.

Abstract

Multi-agent systems (MAS) often achieve higher reasoning accuracy than single models, but their reliance on repeated debates across agents makes them computationally expensive. We introduce SMAGDi, a distillation framework that transfers the debate dynamics of a five-agent Llama-based MAS into a compact Socratic decomposer-solver student. SMAGDi represents debate traces as directed interaction graphs, where nodes encode intermediate reasoning steps with correctness labels and edges capture continuity and cross-agent influence. The student is trained with a composite objective combining language modeling, graph-based supervision, contrastive reasoning, and embedding alignment to preserve both fluency and structured reasoning. On StrategyQA and MMLU, SMAGDi compresses a 40B multi-agent system into a 6B student while retaining 88% of its accuracy, substantially outperforming prior distillation methods such as MAGDi, standard KD, and fine-tuned baselines. These results highlight that explicitly modeling interaction graphs and Socratic decomposition enable small models to inherit the accuracy benefits of multi-agent debate while remaining efficient enough for real-world deployment.
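The composite objective described above can be pictured as a weighted sum of four terms. The sketch below is a minimal illustration only: the abstract names the four components but does not give their formulas or weights, so the specific forms used here (negative log-likelihood, binary cross-entropy on node correctness, a margin-based contrastive term, mean squared error for alignment) and every function name and weight are assumptions, not the authors' implementation.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], target: int) -> float:
    """Language-modeling term: negative log-likelihood of the target token."""
    return -math.log(probs[target])


def binary_cross_entropy(p_correct: float, label: int) -> float:
    """Graph-supervision term (assumed form): predict a node's correctness label."""
    return -(label * math.log(p_correct) + (1 - label) * math.log(1.0 - p_correct))


def margin_contrastive(score_correct: float, score_incorrect: float,
                       margin: float = 1.0) -> float:
    """Contrastive term (assumed form): correct reasoning should outscore
    incorrect reasoning by at least `margin`."""
    return max(0.0, margin - (score_correct - score_incorrect))


def mse(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Embedding-alignment term (assumed form): mean squared error."""
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def composite_loss(l_lm: float, l_graph: float, l_con: float, l_align: float,
                   weights: Sequence[float] = (1.0, 1.0, 1.0, 1.0)) -> float:
    """Weighted sum of the four terms; the weights are hypothetical."""
    return sum(w * l for w, l in zip(weights, (l_lm, l_graph, l_con, l_align)))


# Toy values, chosen only to exercise the functions.
total = composite_loss(
    cross_entropy([0.5, 0.5], target=0),
    binary_cross_entropy(0.5, label=1),
    margin_contrastive(score_correct=2.0, score_incorrect=0.5),
    mse([1.0, 2.0], [1.0, 4.0]),
)
```

In a real training loop each term would be computed from model outputs by a tensor library rather than from Python floats; the point here is only how four separately motivated signals combine into one scalar objective.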

Paper Structure

This paper contains 53 sections, 10 equations, 2 figures, 2 tables.

Figures (2)

  • Figure 1: The overarching training pipeline for the creation of SMAGDi's Multi-Agent Interaction Graphs (MAGs) with dynamic weighting, graph construction, and consensus mechanisms.
  • Figure 2: NetworkX graph representation with weighted influence edges and node correctness for GCN traversal during distillation.
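
To make the interaction-graph idea concrete, here is a minimal sketch of how a debate trace might be turned into a directed graph with correctness-labeled nodes and two edge types, continuity (same agent, next round) and cross-agent influence (different agent, next round). It uses plain Python containers so it runs without dependencies; the `Step` class, the `build_interaction_graph` function, and the edge weights are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

NodeId = Tuple[str, int]  # (agent name, debate round)


@dataclass(frozen=True)
class Step:
    agent: str
    round_idx: int
    text: str
    correct: bool  # node-level correctness label


def build_interaction_graph(
    steps: List[Step], influence_weight: float = 0.5
) -> Tuple[Dict[NodeId, Step], List[Tuple[NodeId, NodeId, dict]]]:
    """Return (nodes, edges) for a directed interaction graph.

    Assumed rule: every step in round r points to every step in round r + 1.
    Same-agent edges are "continuity"; cross-agent edges are "influence".
    """
    nodes = {(s.agent, s.round_idx): s for s in steps}
    edges = []
    for src in nodes:
        for dst in nodes:
            if dst[1] != src[1] + 1:
                continue  # only connect consecutive rounds
            if src[0] == dst[0]:
                attrs = {"kind": "continuity", "weight": 1.0}
            else:
                attrs = {"kind": "influence", "weight": influence_weight}
            edges.append((src, dst, attrs))
    return nodes, edges


# Toy two-agent, two-round debate (content is illustrative only).
trace = [
    Step("A", 0, "Initial answer: yes", correct=False),
    Step("B", 0, "Initial answer: no", correct=True),
    Step("A", 1, "Revised after reading B: no", correct=True),
    Step("B", 1, "Keeps answer: no", correct=True),
]
nodes, edges = build_interaction_graph(trace)
```

Each `(src, dst, attrs)` tuple maps directly onto a call such as `networkx.DiGraph.add_edge(src, dst, **attrs)` if a NetworkX representation like the one in Figure 2 is wanted.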