ChatCollab: Exploring Collaboration Between Humans and AI Agents in Software Teams
Benjamin Klieger, Charis Charitsis, Miroslav Suzara, Sierra Wang, Nick Haber, John C. Mitchell
TL;DR
ChatCollab introduces a configurable, peer-based framework for human-AI collaboration in software teams, in which autonomous AI agents take on distinct roles and communicate through a shared event timeline in Slack. The authors develop an automated collaboration-analysis pipeline based on IPA-inspired labeling and apply it in a tic-tac-toe software-development case study, where ChatCollab produces code of comparable or better quality than three prior multi-agent systems. Key contributions include the ChatCollab architecture, a scalable prompting approach for modulating collaboration dynamics, and empirical evidence that role-specific prompts shape interaction patterns. The results suggest that hybrid human-AI teams are practical for complex tasks, with implications for designing flexible, human-centered AI collaboration in real-world teams and educational settings. Code and data are released as open source to support further research on collaboration dynamics and software development.
Abstract
We explore the potential for productive team-based collaboration between humans and Artificial Intelligence (AI) by presenting and conducting initial tests with a general framework that enables multiple human and AI agents to work together as peers. ChatCollab's novel architecture allows agents, whether human or AI, to join collaborations in any role, autonomously engage in tasks and communication within Slack, and remain agnostic to whether their collaborators are human or AI. Using software engineering as a case study, we find that our AI agents successfully identify their roles and responsibilities, coordinate with other agents, and await requested inputs or deliverables before proceeding. Compared with three prior multi-agent AI systems for software development, ChatCollab AI agents produce comparable or better software in an interactive game development task. We also propose an automated method for analyzing collaboration dynamics that effectively identifies behavioral characteristics of agents with distinct roles, allowing us to quantitatively compare collaboration dynamics across a range of experimental conditions. For example, in comparing ChatCollab AI agents, we find that an AI CEO agent generally provides suggestions 2 to 4 times more often than an AI product manager or AI developer, suggesting that agents within ChatCollab can meaningfully adopt differentiated collaborative roles. Our code and data can be found at: https://github.com/ChatCollab.
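
To make the role comparison above concrete, the following is a minimal sketch of how per-role label frequencies could be computed once each message has been assigned a behavioral label. It is not the authors' pipeline: the label names, the function names `label_rates` and `rate_ratio`, and the event list are all hypothetical, and the values are made-up toy data rather than results from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical labeled timeline: (role, label) pairs.
# Toy values for illustration only; NOT data from the paper.
LABELED_EVENTS = [
    ("CEO", "gives_suggestion"),
    ("CEO", "gives_suggestion"),
    ("CEO", "asks_for_information"),
    ("CEO", "agrees"),
    ("Developer", "gives_suggestion"),
    ("Developer", "gives_information"),
    ("Developer", "gives_information"),
    ("Developer", "agrees"),
]


def label_rates(events):
    """Return {role: {label: fraction of that role's messages with that label}}."""
    counts = defaultdict(Counter)
    for role, label in events:
        counts[role][label] += 1
    return {
        role: {label: n / sum(c.values()) for label, n in c.items()}
        for role, c in counts.items()
    }


def rate_ratio(rates, label, role_a, role_b):
    """Return how many times more often role_a uses `label` than role_b.

    Returns None when role_b never uses the label, since the ratio is undefined.
    """
    base = rates.get(role_b, {}).get(label, 0.0)
    if base == 0.0:
        return None
    return rates.get(role_a, {}).get(label, 0.0) / base


if __name__ == "__main__":
    rates = label_rates(LABELED_EVENTS)
    print(rate_ratio(rates, "gives_suggestion", "CEO", "Developer"))
```

Normalizing by each role's own message count, rather than comparing raw counts, keeps the comparison meaningful when some agents post far more messages than others.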
