Language Model Powered Digital Biology with BRAD
Joshua Pickard, Ram Prakash, Marc Andrew Choi, Natalie Oliven, Cooper Stansbury, Jillian Cwycyshyn, Alex Gorodetsky, Alvaro Velasquez, Indika Rajapakse
TL;DR
BRAD is a retrieval-augmented digital assistant that integrates LLMs with diverse bioinformatics tools, databases, and software pipelines through an agent-based architecture. The system modularly connects a configurable Agent to document repositories, online literature, and external software, enabling end-to-end biomarker workflows with grounded, verifiable outputs. Benchmarking and RAG assessment indicate improved faithfulness and relevance for BRAD’s responses, while also exposing the costs and limitations of LLM-driven biomarker discovery. Overall, BRAD demonstrates a flexible, extensible framework for deploying LLM-powered, tool-rich bioinformatics assistants in research settings.
Abstract
Recent advancements in Large Language Models (LLMs) are transforming biology, computer science, engineering, and everyday life. However, integrating the wide array of computational tools, databases, and scientific literature continues to pose a challenge to biological research. LLMs are well suited to unstructured integration, efficient information retrieval, and automating standard workflows and actions across these diverse resources. To harness these capabilities in bioinformatics, we present a prototype Bioinformatics Retrieval Augmented Digital assistant (BRAD). BRAD is a chatbot and agentic system that integrates a variety of bioinformatics tools. The Python package implements an AI \texttt{Agent} that is powered by LLMs and connects to a local file system, online databases, and a user's software. The \texttt{Agent} is highly configurable, enabling tasks such as Retrieval-Augmented Generation, searches across bioinformatics databases, and the execution of software pipelines. BRAD's coordinated integration of bioinformatics tools delivers a context-aware and semi-autonomous system that extends beyond the capabilities of conventional LLM-based chatbots. A graphical user interface (GUI) provides intuitive access to the system.
