BioAgents: Democratizing Bioinformatics Analysis with Multi-Agent Systems
Nikita Mehandru, Amanda K. Hall, Olesya Melnichenko, Yulia Dubinina, Daniel Tsirulnikov, David Bamman, Ahmed Alaa, Scott Saponas, Venkat S. Malladi
TL;DR
BioAgents aims to democratize bioinformatics analysis with a multi-agent system built on small language models that are fine-tuned for bioinformatics tasks and enhanced with retrieval-augmented generation (RAG). It demonstrates near-human performance on conceptual genomics tasks and competitive results on easy code-generation tasks, while exposing gaps in handling complex, end-to-end pipelines. The work emphasizes reliability, transparency, and metacognitive awareness through self-reflection and collaborative reasoning, and discusses implications for reproducibility and broader applicability in science and medicine. Targeted improvements (broader workflow indexing, richer retrieval, and enhanced reasoning) promise greater robustness and local applicability with reduced computational requirements.
Abstract
Creating end-to-end bioinformatics workflows requires diverse domain expertise: junior and senior researchers alike need a deep understanding of both genomics concepts and computational techniques. While large language models (LLMs) provide some assistance, they often fall short in providing the nuanced guidance needed to execute complex bioinformatics tasks, and require expensive computing resources to achieve high performance. We thus propose a multi-agent system built on small language models, fine-tuned on bioinformatics data, and enhanced with retrieval-augmented generation (RAG). Our system, BioAgents, enables local operation and personalization using proprietary data. We observe performance comparable to human experts on conceptual genomics tasks, and suggest next steps to enhance code generation capabilities.
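
To make the general pattern concrete, the following is a minimal sketch of retrieval-augmented specialist agents whose drafts are combined into one response. It is an illustration under stated assumptions, not the BioAgents implementation: the `Agent` class, the `run_pipeline` function, the word-overlap retrieval, the two agent roles, and the toy documents are all hypothetical, and the model call is replaced by a placeholder that simply echoes the retrieved context.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical specialist agent: a stubbed model plus its own document store."""

    name: str
    documents: list[str]

    def retrieve(self, query: str, k: int = 1) -> list[str]:
        # Toy retrieval (assumption): rank documents by the number of
        # whitespace-separated words they share with the query.
        query_words = set(query.lower().split())
        ranked = sorted(
            self.documents,
            key=lambda doc: len(query_words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:k]

    def answer(self, query: str) -> str:
        # Placeholder for a call to a fine-tuned small language model;
        # here the "answer" is just the retrieved context, tagged by agent.
        context = " ".join(self.retrieve(query))
        return f"[{self.name}] {context}"


def run_pipeline(query: str, agents: list[Agent]) -> str:
    # Hypothetical aggregation step: collect one draft per specialist and
    # join them. A real system would hand the drafts to a reasoning model.
    drafts = [agent.answer(query) for agent in agents]
    return "\n".join(drafts)


# Toy document stores, invented for illustration only.
concept_agent = Agent("concepts", [
    "Variant calling identifies differences between a sample and a reference genome.",
    "RNA-seq quantifies gene expression from sequenced transcripts.",
])
workflow_agent = Agent("workflows", [
    "An alignment step maps reads to a reference genome before variant calling.",
    "A quantification step counts reads per gene after alignment.",
])

result = run_pipeline("how do I run variant calling", [concept_agent, workflow_agent])
print(result)
```

Keeping a separate document store per agent means each specialist can be indexed and updated independently, which fits the local, personalized operation described above; the choice of two agents here is arbitrary.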
