Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge
Charlie Masters, Advaith Vellanki, Jiangbo Shangguan, Bart Kultys, Jonathan Gilmore, Alastair Moore, Stefano V. Albrecht
TL;DR
Problem: End-to-end workflow management in dynamic human-AI teams is a critical open challenge. Approach: formalize the Manager Agent within a Partially Observable Stochastic Game (POSG) and provide MA-Gym to simulate and benchmark workflows. Contributions: a formal POSG framework, four foundational challenges, MA-Gym release, and GPT-5-based evaluations across 20 workflows. Significance: demonstrates that jointly optimizing for goal completion, constraint adherence, and runtime remains hard, underscoring the need for governance, fairness, and privacy safeguards in autonomous management systems.
Abstract
While agentic AI has advanced in automating individual tasks, managing complex multi-agent workflows remains a challenging problem. This paper presents a research vision for autonomous agentic systems that orchestrate collaboration within dynamic human-AI teams. We propose the Autonomous Manager Agent as a core challenge: an agent that decomposes complex goals into task graphs, allocates tasks to human and AI workers, monitors progress, adapts to changing conditions, and maintains transparent stakeholder communication. We formalize workflow management as a Partially Observable Stochastic Game and identify four foundational challenges: (1) compositional reasoning for hierarchical decomposition, (2) multi-objective optimization under shifting preferences, (3) coordination and planning in ad hoc teams, and (4) governance and compliance by design. To advance this agenda, we release MA-Gym, an open-source simulation and evaluation framework for multi-agent workflow orchestration. Evaluating GPT-5-based Manager Agents across 20 workflows, we find they struggle to jointly optimize for goal completion, constraint adherence, and workflow runtime, underscoring workflow management as a difficult open problem. We conclude with organizational and ethical implications of autonomous management systems.
