SMAGDi: Socratic Multi Agent Interaction Graph Distillation for Efficient High Accuracy Reasoning
Aayush Aluru, Myra Malik, Samarth Patankar, Spencer Kim, Kevin Zhu, Sean O'Brien, Vasu Sharma
TL;DR
SMAGDi tackles the cost of multi-agent reasoning by distilling a 40B-parameter, five-agent multi-agent system (MAS) into a 6B-parameter decomposer-solver student using Socratic Chain-of-Thought and interaction-graph distillation. It represents MAS debates as directed graphs whose nodes are reasoning steps and whose edges capture continuity and cross-agent influence, and it trains the student with a composite loss combining language modeling, graph supervision, contrastive reasoning, and embedding alignment. On StrategyQA and MMLU, SMAGDi preserves about 88% of the teacher’s accuracy while reducing parameters by roughly 7x, outperforming MAGDi, standard KD, and a fine-tuned baseline. These results suggest that structured Socratic reasoning transferred via interaction graphs lets compact models approach MAS accuracy at a much lower deployment cost.
Abstract
Multi-agent systems (MAS) often achieve higher reasoning accuracy than single models, but their reliance on repeated debates across agents makes them computationally expensive. We introduce SMAGDi, a distillation framework that transfers the debate dynamics of a five-agent Llama-based MAS into a compact Socratic decomposer-solver student. SMAGDi represents debate traces as directed interaction graphs, where nodes encode intermediate reasoning steps with correctness labels and edges capture continuity and cross-agent influence. The student is trained with a composite objective combining language modeling, graph-based supervision, contrastive reasoning, and embedding alignment to preserve both fluency and structured reasoning. On StrategyQA and MMLU, SMAGDi compresses a 40B multi-agent system into a 6B student while retaining 88% of its accuracy, substantially outperforming prior distillation methods such as MAGDi, standard knowledge distillation (KD), and fine-tuned baselines. These results highlight that explicitly modeling interaction graphs and Socratic decomposition enables small models to inherit the accuracy benefits of multi-agent debate while remaining efficient enough for real-world deployment.
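To make the interaction-graph and composite-objective description above more concrete, the following is a minimal Python sketch, not the authors' implementation. The names `Step`, `InteractionGraph`, and `composite_loss`, the `agent`/`round_idx` fields, the rule that same-agent edges count as continuity and cross-agent edges count as influence, and the loss weights are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One intermediate reasoning step from a teacher agent (hypothetical schema)."""
    step_id: int
    agent: int        # index of the agent that produced the step
    round_idx: int    # debate round in which the step appeared
    text: str
    correct: bool     # correctness label attached to the node


@dataclass
class InteractionGraph:
    """Directed graph over reasoning steps; edges are (src, dst, kind) triples."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_step(self, step: Step) -> None:
        self.nodes[step.step_id] = step

    def add_edge(self, src: int, dst: int) -> str:
        # Assumed labeling rule: an edge between steps of the same agent is
        # "continuity"; an edge between different agents is "influence".
        a, b = self.nodes[src], self.nodes[dst]
        kind = "continuity" if a.agent == b.agent else "influence"
        self.edges.append((src, dst, kind))
        return kind


def composite_loss(lm, graph, contrastive, align, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four loss terms named in the abstract.

    The weights are placeholders; the actual weighting is not given here.
    """
    terms = (lm, graph, contrastive, align)
    return sum(w * t for w, t in zip(weights, terms))


# Tiny debate trace: agent 0 refines its own step, then influences agent 1.
g = InteractionGraph()
g.add_step(Step(0, agent=0, round_idx=0, text="Decompose the question.", correct=True))
g.add_step(Step(1, agent=0, round_idx=1, text="Answer sub-question 1.", correct=True))
g.add_step(Step(2, agent=1, round_idx=1, text="Revise after reading agent 0.", correct=False))
g.add_edge(0, 1)
g.add_edge(1, 2)

total = composite_loss(1.0, 2.0, 3.0, 4.0, weights=(1.0, 0.5, 0.5, 0.25))
```

In an actual training loop the four inputs to `composite_loss` would be tensors from the student model's forward pass instead of plain floats; the sketch only shows how the terms combine and how node labels and edge types could be stored.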
