GRIT: Graph-Regularized Logit Refinement for Zero-shot Cell Type Annotation

Tianxiang Hu; Chenyi Zhou; Jiaxiang Liu; Jiongxin Wang; Ruizhe Chen; Haoxiang Xia; Gaoang Wang; Jian Wu; Zuozhu Liu

TL;DR

GRIT provides a training-free, inference-time refinement for zero-shot cell type annotation by enforcing local consistency of CLIP-style logits on a PCA-based $k$-NN graph. The method solves a convex graph-regularized objective with closed-form $\hat{P}_{\lambda} = (I + \lambda L)^{-1} P_0$, blending scalable foundation-model predictions with graph-structured robustness. Across 14 scRNA-seq datasets (11 Tabula Sapiens organs plus PBMCs and Peripheral Cortex), GRIT yields consistent accuracy gains up to $\approx 10\%$ and macro F1 improvements, while remaining robust to hyperparameters and graph choices. This lightweight, model-agnostic post-processing step enhances zero-shot annotation without additional training, offering a practical plug-in for scalable cell type inference in single-cell analyses.
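The summary above quotes the closed form without the objective it solves. One standard graph-regularized least-squares objective that is consistent with that closed form is given here as an assumption, not necessarily the paper's exact formulation. Here $P_0$ holds the zero-shot logits (one row per cell) and $L$ is the Laplacian of the $k$-NN graph:

$$
\hat{P}_{\lambda} = \arg\min_{P} \; \|P - P_0\|_F^2 + \lambda \, \mathrm{tr}\!\left(P^{\top} L P\right)
$$

Setting the gradient to zero gives

$$
2\,(P - P_0) + 2\lambda L P = 0 \;\;\Longrightarrow\;\; (I + \lambda L)\,P = P_0 \;\;\Longrightarrow\;\; \hat{P}_{\lambda} = (I + \lambda L)^{-1} P_0 .
$$

Because $L$ is positive semidefinite, $I + \lambda L$ is positive definite for any $\lambda \ge 0$, so the objective is strictly convex and the inverse always exists. At $\lambda = 0$ the refinement returns $P_0$ unchanged.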

Abstract

Cell type annotation is a fundamental step in the analysis of single-cell RNA sequencing (scRNA-seq) data. In practice, human experts often rely on the structure revealed by principal component analysis (PCA) followed by $k$-nearest neighbor ($k$-NN) graph construction to guide annotation. While effective, this process is labor-intensive and does not scale to large datasets. Recent advances in CLIP-style models offer a promising path toward automating cell type annotation. By aligning scRNA-seq profiles with natural language descriptions, models like LangCell enable zero-shot annotation. Although LangCell achieves reasonable zero-shot performance, its predictions remain suboptimal. In this paper, we propose GRIT, a principled inference-time paradigm for zero-shot cell type annotation that bridges the scalability of pre-trained foundation models with the structural robustness relied upon in human expert annotation workflows. Specifically, we enforce local consistency of the zero-shot CLIP logits over the task-specific PCA-based $k$-NN graph. We evaluate our approach on 14 annotated human scRNA-seq datasets from 4 distinct studies, spanning 11 organs and over 200,000 single cells. Our method consistently improves zero-shot annotation accuracy, with gains of up to 10%. Further analysis shows the mechanism by which GRIT propagates correct signals through the graph, pulling mislabeled cells back toward more accurate predictions. The method is training-free, model-agnostic, and serves as a simple yet effective plug-in for enhancing zero-shot cell type annotation.
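As a rough illustration of the refinement step described above, the following NumPy sketch builds a PCA-based $k$-NN graph and applies the closed form $\hat{P}_{\lambda} = (I + \lambda L)^{-1} P_0$. Several details are assumptions made for the example rather than statements about the authors' implementation: the function names (`knn_laplacian`, `grit_refine`, `objective`), the unnormalized Laplacian, the union-based symmetrization of the graph, the default values of `k`, `n_pcs`, and `lam`, and the least-squares form of the objective. The dense brute-force neighbor search is only suitable for small toy inputs.

```python
import numpy as np


def knn_laplacian(X, k=15, n_pcs=50):
    """Unnormalized Laplacian of a symmetrized k-NN graph built in PCA space.

    X is a (cells x features) matrix; k must be smaller than the number of cells.
    """
    Xc = X - X.mean(axis=0)
    # PCA via SVD; keep at most n_pcs components.
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    n_pcs = min(n_pcs, S.shape[0])
    Z = U[:, :n_pcs] * S[:n_pcs]
    # Brute-force squared Euclidean distances (dense; toy sizes only).
    sq = (Z ** 2).sum(axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    np.fill_diagonal(D2, np.inf)  # a cell is never its own neighbor
    nbrs = np.argsort(D2, axis=1)[:, :k]
    n = X.shape[0]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    A = np.maximum(A, A.T)  # assumption: keep an edge if either end selects it
    return np.diag(A.sum(axis=1)) - A


def grit_refine(P0, L, lam=1.0):
    """Solve (I + lam * L) P = P0 for the refined logits P."""
    n = L.shape[0]
    return np.linalg.solve(np.eye(n) + lam * L, P0)


def objective(P, P0, L, lam):
    """Assumed objective: ||P - P0||_F^2 + lam * tr(P^T L P)."""
    return np.sum((P - P0) ** 2) + lam * np.trace(P.T @ L @ P)


# Toy usage: two well-separated groups of synthetic "cells" and noisy logits.
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(0, 1, (30, 10)), rng.normal(8, 1, (30, 10))])
labels = np.repeat([0, 1], 30)
P0_demo = np.eye(2)[labels] + rng.normal(0, 0.8, (60, 2))  # noisy zero-shot logits
L_demo = knn_laplacian(X_demo, k=5, n_pcs=5)
P_demo = grit_refine(P0_demo, L_demo, lam=1.0)
print("raw accuracy:    ", (P0_demo.argmax(axis=1) == labels).mean())
print("refined accuracy:", (P_demo.argmax(axis=1) == labels).mean())
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly. For datasets with many cells, a sparse Laplacian and an iterative solver would be the more natural choice, since $I + \lambda L$ is symmetric positive definite.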
