Validating Interpretability in siRNA Efficacy Prediction: A Perturbation-Based, Dataset-Aware Protocol

Zahra Khodagholi; Niloofar Yousefi

TL;DR

This work introduces a pre-synthesis gate that validates saliency-based design guidance for siRNA efficacy predictors. The gate measures how strongly the model output responds to perturbing the top-attribution positions, compares that response against composition-matched baseline positions, and yields a pass/fail criterion. Cross-dataset transfer reveals two failure modes, faithful-but-wrong and inverted saliency, showing that explanations may not generalize across protocols and that dataset shift can silently undermine deployment. To strengthen interpretability, the authors propose BioPrior, a biology-informed regularizer that aligns model gradients with canonical siRNA design principles and improves saliency faithfulness with modest, dataset-dependent predictive trade-offs. Across four benchmarks, 19/20 fold–dataset combinations pass the faithfulness test; the single failure shows inverted saliency. When faithfulness holds, the approach yields actionable guidance for sequence edits. The work argues that dataset-specific interpretability validation is necessary for explanation-guided therapeutic design and provides an open-source protocol to operationalize this practice.

Abstract

Saliency maps are increasingly used as \emph{design guidance} in siRNA efficacy prediction, yet attribution methods are rarely validated before motivating sequence edits. We introduce a \textbf{pre-synthesis gate}: a protocol for \emph{counterfactual sensitivity faithfulness} that tests whether mutating high-saliency positions changes model output more than composition-matched controls. Cross-dataset transfer reveals two failure modes that would otherwise go undetected: \emph{faithful-but-wrong} (saliency valid, predictions fail) and \emph{inverted saliency} (top-saliency edits less impactful than random). Strikingly, models trained on mRNA-level assays collapse on a luciferase reporter dataset, demonstrating that protocol shifts can silently invalidate deployment. Across four benchmarks, 19/20 fold instances pass; the single failure shows inverted saliency. A biology-informed regularizer (BioPrior) strengthens saliency faithfulness with modest, dataset-dependent predictive trade-offs. Our results establish saliency validation as essential pre-deployment practice for explanation-guided therapeutic design. Code is available at https://github.com/shadi97kh/BioPrior.
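The perturbation test described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' implementation: `predict` stands for any callable that maps an siRNA string to a scalar efficacy score, saliency is a per-position list of floats, effects are averaged over single-site substitutions, and the control set is drawn from non-salient positions carrying the same nucleotides as the top-$k$ positions. The function names are hypothetical.

```python
import random
from typing import Callable, List, Sequence

NUCLEOTIDES = "AUGC"


def top_k_positions(saliency: Sequence[float], k: int) -> List[int]:
    """Indices of the k largest saliency values, highest first."""
    order = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)
    return order[:k]


def matched_random_positions(seq: str, top: Sequence[int],
                             rng: random.Random) -> List[int]:
    """Draw one control position per top position, preferring the same nucleotide.

    Falls back to any unused position if no same-nucleotide candidate is left.
    Assumes the sequence is long enough to supply len(top) control positions.
    """
    excluded = set(top)
    chosen = []
    for p in top:
        free = [q for q in range(len(seq)) if q not in excluded]
        same_nt = [q for q in free if seq[q] == seq[p]]
        q = rng.choice(same_nt or free)
        chosen.append(q)
        excluded.add(q)
    return chosen


def expected_effect(predict: Callable[[str], float], seq: str,
                    positions: Sequence[int]) -> float:
    """Mean absolute prediction change over all single-site substitutions."""
    base = predict(seq)
    deltas = []
    for p in positions:
        for nt in NUCLEOTIDES:
            if nt != seq[p]:
                mutated = seq[:p] + nt + seq[p + 1:]
                deltas.append(abs(predict(mutated) - base))
    return sum(deltas) / len(deltas) if deltas else 0.0


if __name__ == "__main__":
    # Toy model (hypothetical): score depends only on A/U content at positions 0-2.
    def toy_predict(s: str) -> float:
        return sum(s[p] in "AU" for p in (0, 1, 2)) / 3

    seq = "AUAGCGCGCAUGCGCAUGC"
    saliency = [1.0, 0.9, 0.8] + [0.0] * (len(seq) - 3)

    top = top_k_positions(saliency, k=3)
    ctrl = matched_random_positions(seq, top, random.Random(0))
    print("top-k effect:  ", expected_effect(toy_predict, seq, top))
    print("control effect:", expected_effect(toy_predict, seq, ctrl))
```

In this toy setting the saliency map is faithful by construction, so the top-$k$ effect exceeds the control effect. With a real predictor the same comparison would be repeated per sequence and summarized across the test set.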

Paper Structure (128 sections, 1 theorem, 38 equations, 9 figures, 14 tables)

Introduction
Contributions.
Related Work
siRNA efficacy prediction: from rules to deep learning.
Biology-informed regularization for sequence models.
Saliency methods and faithfulness validation.
Interpretability validation in sequence biology.
Background
Task and datasets.
Gradient-based saliency.
Faithfulness desideratum.
Faithfulness taxonomy.
Method
Overview.
Model Architecture
...and 113 more sections

Key Result

Corollary D.1

Given Claims 1 and 2, the combination of BioPrior module and perturbation testing provides:

Figures (9)

Figure 1: Positioning saliency validation in the lab-in-the-loop decision pipeline. In both therapeutic lead selection and functional genomics knockdown screens, researchers rely on predicted efficacy and position-level saliency ("important positions") to decide which siRNA sequences to synthesize or prioritize for experimental validation. However, explanation methods can appear plausible while failing basic perturbation tests, a risk that compounds under assay or protocol shift across laboratories, cell lines, and readout technologies. This paper introduces a standardized faithfulness check (expected-effect perturbations with a nucleotide-matched baseline) that practitioners can apply as a pre-synthesis gate before acting on saliency maps in a new dataset or experimental setting. When validation passes, saliency-guided decisions (sequence edits, candidate ranking) can be trusted; when it fails, predictions may still be useful but position-importance reasoning should be avoided. Downstream wet-lab validation and clinical development (dashed region) are outside the scope of this work.
Figure 2: Overview of training and saliency faithfulness. (a) A hybrid Conv--BiLSTM--Transformer with dual siRNA$\leftrightarrow$mRNA cross-attention predicts efficacy from sequence encodings plus RNA-FM and thermodynamic features, regularized by BioPrior constraints weighted by a schedule $\lambda(t)$. (b) Saliency is validated by perturbing top-$k$ salient siRNA positions (A/U/G/C channels only) and comparing the expected prediction change to a nucleotide-matched random baseline.
Figure 3: Perturbation validation across datasets ($k=3$). Each panel shows results from a single representative fold; 5-fold statistics are in Table \ref{tab:faithfulness}. Panel D shows position importance: Hu/Mix/Shabalina show 5$'$ terminus (positions 1--4) dominance, while Taka peaks at positions 9--11, consistent with cross-dataset transfer failures.
Figure 4: Inter-dataset transfer faithfulness (6 representative pairs). Top row (a--c): Successful transfers among Hu/Mix/Shabalina show consistent 5$'$ terminus importance (positions 1--4, visible in Panel D of each subplot). All achieve high win rates ($>$70%) and positive effect sizes. Bottom row (d--f): Transfer failures involving Taka reveal two distinct modes. (d) Mix$\to$Taka: Prediction fails (AUC = 0.497) but saliency remains faithful ($d_z = 1.47$), indicating the model attends to 5$'$ positions that do not determine efficacy in Taka. (e--f) Taka$\to$Hu and Taka$\to$Mix: Inverted saliency with importance shifted to positions 9--11; high-saliency positions are less predictive than random, with negative effect sizes.
Figure 5: The Taka transfer anomaly. (a) When Taka is used as the training source, test AUC on all other datasets decreases during training; the untrained model (epoch 0) transfers better than the trained model (best epoch = 0 for all three targets). (b) When Taka is the target, all source models fail to exceed chance-level AUC regardless of training duration. (c) Best epoch across all 11 mechanistic transfer pairs: successful transfers (Shab$\to$Mix, Mix$\to$Hu, etc.) converge at intermediate epochs, while all Taka-involving pairs either never learn (epoch 0) or train to exhaustion (epoch 299) without meaningful improvement. This asymmetry confirms that the Taka incompatibility is systematic and bidirectional.
...and 4 more figures
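
The win rates and paired effect sizes $d_z$ quoted in the Figure 4 caption can be computed from per-sequence effects along the following lines. This is a sketch under stated assumptions: $d_z$ is taken to be the standard paired Cohen's $d$ (mean of the paired differences divided by their sample standard deviation), win rate is the fraction of sequences where the top-$k$ effect exceeds the matched-random effect, and the function name and example numbers are hypothetical. No pass threshold is encoded, since none is stated in this summary.

```python
import statistics
from typing import Sequence, Tuple


def paired_summary(top_effects: Sequence[float],
                   random_effects: Sequence[float]) -> Tuple[float, float]:
    """Return (win_rate, d_z) for paired top-k vs. matched-random effects.

    win_rate: fraction of pairs where the top-k effect is strictly larger.
    d_z:      mean(diff) / stdev(diff), the paired Cohen's d (NaN if stdev is 0).
    """
    if len(top_effects) != len(random_effects):
        raise ValueError("effect lists must be paired one-to-one")
    if len(top_effects) < 2:
        raise ValueError("need at least two pairs to estimate d_z")

    diffs = [t - r for t, r in zip(top_effects, random_effects)]
    win_rate = sum(d > 0 for d in diffs) / len(diffs)
    sd = statistics.stdev(diffs)
    d_z = statistics.mean(diffs) / sd if sd > 0 else float("nan")
    return win_rate, d_z


if __name__ == "__main__":
    # Made-up effects for four sequences, for illustration only.
    top = [0.30, 0.25, 0.40, 0.10]
    rnd = [0.10, 0.20, 0.15, 0.12]
    win_rate, d_z = paired_summary(top, rnd)
    print(f"win rate = {win_rate:.2f}, d_z = {d_z:.2f}")
```

Under these definitions, a negative $d_z$ together with a win rate below one half corresponds to the inverted-saliency pattern described for the Taka source transfers, where edits at high-saliency positions move the prediction less than edits at matched random positions.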

Theorems & Definitions (5)

Claim B.1: BioPrior Differentiability and Guidance
Definition C.1: Expected-Effect
Definition C.2: Matched Random Set
Claim C.1: Perturbation Test Faithfulness
Corollary D.1: BioPrior-Perturbation Integration
