GRIT: Graph-Regularized Logit Refinement for Zero-shot Cell Type Annotation
Tianxiang Hu, Chenyi Zhou, Jiaxiang Liu, Jiongxin Wang, Ruizhe Chen, Haoxiang Xia, Gaoang Wang, Jian Wu, Zuozhu Liu
TL;DR
GRIT is a training-free, inference-time refinement for zero-shot cell type annotation that enforces local consistency of CLIP-style logits on a PCA-based $k$-NN graph. The method solves a convex graph-regularized objective whose closed-form solution is $\hat{P}_{\lambda} = (I + \lambda L)^{-1} P_0$, combining the scalability of foundation-model predictions with the robustness of graph structure. Across 14 scRNA-seq datasets (11 Tabula Sapiens organs plus PBMCs and Peripheral Cortex), GRIT yields consistent accuracy gains of up to $\approx 10\%$ along with macro F1 improvements, while remaining robust to hyperparameters and graph choices. This lightweight, model-agnostic post-processing step requires no additional training, making it a practical plug-in for scalable cell type inference in single-cell analyses.
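As a brief sketch of where this closed form can come from: assuming the objective takes the standard Laplacian-regularized least-squares form (this exact form is an assumption made here for illustration; only the closed-form solution is stated above), with $P_0$ the matrix of zero-shot logits and $L$ the Laplacian of the $k$-NN graph,

$$
\hat{P}_{\lambda} = \arg\min_{P} \; \|P - P_0\|_F^2 + \lambda \, \mathrm{tr}\!\left(P^{\top} L P\right), \qquad \lambda \ge 0 .
$$

Setting the gradient to zero, and using the symmetry of $L$, gives

$$
2\,(P - P_0) + 2\lambda L P = 0 \;\;\Longrightarrow\;\; (I + \lambda L)\, P = P_0 \;\;\Longrightarrow\;\; \hat{P}_{\lambda} = (I + \lambda L)^{-1} P_0 .
$$

Because $L$ is positive semidefinite, $I + \lambda L$ is positive definite for every $\lambda \ge 0$, so the inverse exists and the minimizer is unique. Setting $\lambda = 0$ recovers the unrefined logits $P_0$.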
Abstract
Cell type annotation is a fundamental step in the analysis of single-cell RNA sequencing (scRNA-seq) data. In practice, human experts often rely on the structure revealed by principal component analysis (PCA) followed by $k$-nearest neighbor ($k$-NN) graph construction to guide annotation. Although effective, this process is labor-intensive and does not scale to large datasets. Recent advances in CLIP-style models offer a promising path toward automating cell type annotation: by aligning scRNA-seq profiles with natural language descriptions, models such as LangCell enable zero-shot annotation. LangCell demonstrates decent zero-shot performance, but its predictions remain suboptimal. In this paper, we propose GRIT, a principled inference-time paradigm for zero-shot cell type annotation that bridges the scalability of pre-trained foundation models with the structural robustness relied upon in human expert annotation workflows. Specifically, we enforce local consistency of the zero-shot CLIP logits over the task-specific PCA-based $k$-NN graph. We evaluate our approach on 14 annotated human scRNA-seq datasets from 4 distinct studies, spanning 11 organs and over 200,000 single cells. Our method consistently improves zero-shot annotation accuracy, with gains of up to 10\%. Further analysis shows the mechanism by which GRIT propagates correct signals through the graph, pulling mislabeled cells back toward more accurate predictions. The method is training-free, model-agnostic, and serves as a simple yet effective plug-in for enhancing zero-shot cell type annotation.
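The pipeline described above (PCA embedding, $k$-NN graph, graph-regularized refinement of zero-shot logits) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `pca_embed`, `knn_adjacency`, and `grit_refine` are hypothetical, the graph is assumed to be an unweighted, symmetrized $k$-NN graph with the combinatorial Laplacian $L = D - A$, and the data are synthetic. The paper's actual graph weighting, Laplacian normalization, and hyperparameters may differ.

```python
import numpy as np

def pca_embed(X, n_components=10):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0, keepdims=True)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, S.shape[0])
    return U[:, :k] * S[:k]

def knn_adjacency(Z, k=15):
    """Symmetric, unweighted k-NN adjacency (brute-force Euclidean distances)."""
    n = Z.shape[0]
    sq = (Z ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    np.fill_diagonal(d2, np.inf)           # a cell is not its own neighbor
    k = min(k, n - 1)
    nbrs = np.argpartition(d2, k - 1, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)              # symmetrize: keep an edge if either end chose it

def grit_refine(P0, A, lam=1.0):
    """Return (I + lam * L)^{-1} P0, with L = D - A (assumed Laplacian choice)."""
    L = np.diag(A.sum(axis=1)) - A
    n = A.shape[0]
    # Solve the linear system rather than forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) + lam * L, P0)

# Synthetic stand-in for real data: two well-separated "cell types" and
# noisy zero-shot logits (assumption: real logits would come from a
# CLIP-style model such as LangCell).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 50)) + 4.0 * labels[:, None]
P0 = np.eye(2)[labels] + rng.normal(scale=0.8, size=(200, 2))

A = knn_adjacency(pca_embed(X, n_components=10), k=15)
P_hat = grit_refine(P0, A, lam=1.0)

acc_before = (P0.argmax(axis=1) == labels).mean()
acc_after = (P_hat.argmax(axis=1) == labels).mean()
print(f"accuracy before: {acc_before:.3f}, after: {acc_after:.3f}")
```

The dense solve here is only suitable for small examples. For datasets at the scale reported above, a sparse Laplacian with a sparse direct or iterative solver would be the natural substitute.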
