CellMaster: Collaborative Cell Type Annotation in Single-Cell Analysis

Zhen Wang; Yiming Gao; Jieyuan Liu; Enze Ma; Jefferson Chen; Mark Antkowiak; Mengzhou Hu; JungHo Kong; Dexter Pratt; Zhiting Hu; Wei Wang; Trey Ideker; Eric P. Xing

TL;DR

CellMaster addresses the bottleneck of cell-type annotation in large-scale scRNA-seq by introducing a zero-shot, LLM-driven agent that mimics expert reasoning and provides interpretable rationales. Its iterative four-stage pipeline—hypothesis generation, marker selection, expression analysis, and result evaluation—supports both automatic and human-in-the-loop modes, yielding substantial accuracy gains across diverse tissues. In benchmarking across nine datasets, CellMaster achieves an average improvement of about 7.1% over strong baselines in automatic mode and up to 18.6% with human feedback, while handling rare and novel cell states more robustly. The work highlights the value of transparent AI-assisted workflows with provenance-tracked human collaboration for scalable, biologically faithful annotation in evolving single-cell atlases, and provides open-source tools for broad adoption.
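The iterative four-stage loop described above can be illustrated with a minimal sketch. Only the stage names (hypothesis generation, marker selection, expression analysis, result evaluation) and the automatic versus human-in-the-loop distinction come from the text. Everything else is an assumption made for illustration: the function names, the fixed `TOY_MARKERS` lookup table that stands in for LLM calls, the mean-expression scoring rule, and the confidence threshold are all hypothetical and are not CellMaster's actual implementation.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical marker table standing in for LLM-proposed markers (assumption).
TOY_MARKERS: Dict[str, List[str]] = {
    "T cell": ["CD3D", "CD3E"],
    "B cell": ["MS4A1", "CD79A"],
    "Monocyte": ["CD14", "LYZ"],
    "NK cell": ["NKG7", "GNLY"],
}

Expression = Dict[str, Dict[str, float]]  # cluster id -> gene -> mean expression
Feedback = Callable[[List[str], List[str]], List[str]]


def generate_hypotheses(context: str) -> List[str]:
    """Stage (i): propose candidate cell types (a real agent would query an LLM)."""
    return ["T cell", "B cell", "Monocyte"]


def select_markers(hypotheses: List[str]) -> Dict[str, List[str]]:
    """Stage (ii): pick marker genes for each hypothesized cell type."""
    return {ct: TOY_MARKERS[ct] for ct in hypotheses}


def analyze_expression(
    cluster_expr: Expression, markers: Dict[str, List[str]]
) -> Dict[str, Tuple[str, float]]:
    """Stage (iii): score each cluster by mean marker expression (toy rule)."""
    result = {}
    for cluster, expr in cluster_expr.items():
        scores = {
            ct: sum(expr.get(g, 0.0) for g in genes) / len(genes)
            for ct, genes in markers.items()
        }
        best = max(scores, key=scores.get)
        result[cluster] = (best, scores[best])
    return result


def evaluate(scored: Dict[str, Tuple[str, float]], threshold: float = 1.0) -> List[str]:
    """Stage (iv): return clusters whose best score falls below the threshold."""
    return [c for c, (_, score) in scored.items() if score < threshold]


def annotate(
    cluster_expr: Expression,
    context: str,
    max_iters: int = 3,
    feedback: Optional[Feedback] = None,
) -> Dict[str, str]:
    """Run the loop; `feedback` models the human-in-the-loop mode."""
    hypotheses = generate_hypotheses(context)
    labels: Dict[str, str] = {}
    for _ in range(max_iters):
        markers = select_markers(hypotheses)
        scored = analyze_expression(cluster_expr, markers)
        unresolved = evaluate(scored)
        labels = {
            c: ("Unknown" if c in unresolved else ct) for c, (ct, _) in scored.items()
        }
        if not unresolved or feedback is None:
            break  # automatic mode stops after one pass in this sketch
        hypotheses = feedback(hypotheses, unresolved)
    return labels


clusters = {
    "0": {"CD3D": 2.0, "CD3E": 1.5},
    "1": {"MS4A1": 2.5, "CD79A": 2.0},
    "2": {"NKG7": 3.0, "GNLY": 2.5},
}
auto_labels = annotate(clusters, "PBMC")
hitl_labels = annotate(clusters, "PBMC", feedback=lambda hyps, _: hyps + ["NK cell"])
```

In this toy run, the automatic pass leaves cluster "2" as "Unknown" because no initial hypothesis covers its markers, while the feedback callback adds an "NK cell" hypothesis that resolves it on the next iteration.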

Abstract

Single-cell RNA-seq (scRNA-seq) enables atlas-scale profiling of complex tissues, revealing rare lineages and transient states. Yet, assigning biologically valid cell identities remains a bottleneck because markers are tissue- and state-dependent, and novel states lack references. We present CellMaster, an AI agent that mimics expert practice for zero-shot cell-type annotation. Unlike existing automated tools, CellMaster leverages LLM-encoded knowledge (e.g., GPT-4o) to perform on-the-fly annotation with interpretable rationales, without pre-training or fixed marker databases. Across 9 datasets spanning 8 tissues, CellMaster improved accuracy by 7.1% over best-performing baselines (including CellTypist and scTab) in automatic mode. With human-in-the-loop refinement, this advantage increased to 18.6%, with a 22.1% gain on subtype populations. The system demonstrates particular strength in rare and novel cell states where baselines often fail. Source code and the web application are available at https://github.com/AnonymousGym/CellMaster.

Paper Structure (55 sections, 5 equations, 7 figures, 2 tables)

Introduction
Related Works
Methodology
Data Sources and Pre-processing
Iterative Annotation Pipeline
(i) Hypothesis Generation.
(ii) Marker Selection.
(iii) Expression Analysis.
(iv) Result Evaluation.
Benchmark Design and Baselines
Baseline Methods Included
Web Application and UI Design
Results
CellMaster architecture and workflow
Benchmarking against state-of-the-art tools
...and 40 more sections

Figures (7)

Figure 1: Overview of CellMaster architecture and workflow. (a) System pipeline: three specialized agents (Hypothesis, Proposal, Annotation) coordinate to iteratively refine cell type annotations from user-provided data and context. (b) Example human-AI dialogue showing iterative annotation refinement on a PBMC dataset.
Figure 2: CellMaster web interface. (a) Main annotation view with input, dotplot, chatbot, and analysis panels. (b) Zoom-in/out functionality for sub-clustering selected cell populations. (c) Analysis panel showing hypothesis and marker gene lists. (d) Annotation panel with interactive UMAP and cluster controls.
Figure 3: Benchmarking results. (a) Performance heatmap (CL score) comparing CellMaster against four baselines across 9 datasets; white indicates method failure. (b) Head-to-head comparison with Biomni on three representative datasets. (c) Iterative improvement: automatic vs. human-in-the-loop mode on Liver dataset across iterations. (d) Dot plot showing per-dataset performance for automatic (red) and human-in-the-loop (green) modes. (e) Stratified bar chart analysis by cell type category, annotation granularity, dataset size, and cluster size.
Figure 4: Case study: Neutrophil subtype resolution in developmental liver. CellMaster recommends sub-clustering Neutrophils, proposes developmental stage markers (LCN2, LTF, MMP9), and provides rationales for immature, intermediate, and mature neutrophil assignments. Final annotations are merged back into the full dataset.
Figure 5: Complete CellMaster web interface example. Screenshot showing an annotation session on the developmental Liver dataset. The interface includes: (top) pipeline progress indicator and controls; (upper left) input panel for data upload and project description; (upper right) analysis results with hypothesis, marker genes, and iteration summary tabs; (lower left) gene expression dot plot; (lower right) interactive UMAP with cluster selection and sub-clustering controls; (bottom) chat interface for human-in-the-loop feedback.
...and 2 more figures
