CHIMERA-Bench: A Benchmark Dataset for Epitope-Specific Antibody Design

Mansoor Ahmed, Nadeem Taj, Imdad Ullah Khan, Hemanth Venkateswara, Murray Patterson

Abstract

Computational antibody design has seen rapid methodological progress, with dozens of deep generative methods proposed in the past three years, yet the field lacks a standardized benchmark for fair comparison and model development. These methods are evaluated on different SAbDab snapshots, non-overlapping test sets, and incompatible metrics, and the literature fragments the design problem into numerous sub-tasks with no common definition. We introduce Chimera-Bench (CDR Modeling with Epitope-guided Redesign), a unified benchmark built around a single canonical task: epitope-conditioned CDR sequence-structure co-design. Chimera-Bench provides (1) a curated, deduplicated dataset of 2,922 antibody-antigen complexes with epitope and paratope annotations; (2) three biologically motivated splits testing generalization to unseen epitopes, unseen antigen folds, and prospective temporal targets; and (3) a comprehensive evaluation protocol with five metric groups, including novel epitope-specificity measures. We benchmark representative methods spanning different generative paradigms and report results across all splits. Chimera-Bench is the largest dataset of its kind for the antibody design problem, allowing the community to develop and test novel methods and evaluate their generalizability. The source code and data are available at: https://github.com/mansoor181/chimera-bench.git

Paper Structure (51 sections, 6 equations, 7 figures, 11 tables)

Figures (7)

  • Figure 1: The Chimera-Bench data curation pipeline that collects antigen-antibody complexes from SAbDab and produces filtered and annotated complexes for the antibody design task.
  • Figure 2: Radar plots comparing methods across metrics for CDR-H1, H2, and H3 on the epitope-group split. Six metrics (AAR, CAAR, TM, Fnat, DockQ, EpiF1) are shown on their native $[0,1]$ scale; RMSD and iRMSD are transformed via $1/(1{+}x)$ so that higher values indicate better performance on all axes.
  • Figure 3: Dataset size distributions. (a) Chain lengths for heavy, light, and antigen chains. (b) Interface statistics: epitope size, paratope size, and number of contact pairs per complex.
  • Figure 4: CDR length distributions under IMGT numbering. CDR-H3 shows the widest variability, consistent with its role as the primary determinant of antigen specificity.
  • Figure 5: Antigen species and origin distributions. (a) Top 15 species by complex count. (b) Breakdown by biological origin category.
  • ...and 2 more figures
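
The axis rescaling described in the Figure 2 caption can be sketched as follows. This is only an illustration of the stated transform, not the benchmark's own plotting code: the function name `to_radar_scale` and the dictionary layout are assumptions, and only the metric names and the $1/(1{+}x)$ mapping come from the caption.

```python
# Hypothetical helper: puts all eight Figure 2 metrics on a "higher is better" scale.
# Metric names follow the caption; everything else is an illustrative assumption.

NATIVE_METRICS = ("AAR", "CAAR", "TM", "Fnat", "DockQ", "EpiF1")  # already in [0, 1]
DISTANCE_METRICS = ("RMSD", "iRMSD")  # lower is better, so they are inverted


def to_radar_scale(metrics):
    """Return a copy of `metrics` with RMSD and iRMSD mapped through 1 / (1 + x)."""
    scaled = {}
    for name, value in metrics.items():
        if name in DISTANCE_METRICS:
            if value < 0:
                raise ValueError(f"{name} must be non-negative, got {value}")
            # A distance of 0 maps to 1.0; large distances approach 0.
            scaled[name] = 1.0 / (1.0 + value)
        elif name in NATIVE_METRICS:
            # Kept on the native [0, 1] scale, as in the caption.
            scaled[name] = value
        else:
            raise KeyError(f"unknown metric: {name}")
    return scaled
```

For example, an RMSD of 1.0 maps to 0.5 and an iRMSD of 3.0 maps to 0.25, while AAR and the other native metrics pass through unchanged.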