Insights Into the Inner Workings of Transformer Models for Protein Function Prediction

Markus Wenzel, Erik Grüner, Nils Strodthoff

TL;DR

The paper addresses the interpretability of large transformer models trained for protein function prediction by extending integrated gradients to inspect latent representations and head-level contributions. It finetunes ProtBert, ProtT5, and ESM-2 on GO and EC tasks and uses a head-specific IG framework to attribute residue-level relevance to amino acids, then statistically correlates these attributions with UniProt/PROSITE annotations. The authors demonstrate embedding- and head-level explanations that align with known biological features (e.g., membrane regions, active sites) and reveal specialized heads and collective dynamics across layers. This work provides quantitative evidence for the meaningfulness of attribution maps in proteomics, suggests pathways for model validation and pruning, and offers code to reproduce the XAI analyses on protein sequence data.

Abstract

Motivation: We explored how explainable artificial intelligence (XAI) can help to shed light into the inner workings of neural networks for protein function prediction, by extending the widely used XAI method of integrated gradients such that latent representations inside of transformer models, which were finetuned to Gene Ontology term and Enzyme Commission number prediction, can be inspected too. Results: The approach enabled us to identify amino acids in the sequences that the transformers pay particular attention to, and to show that these relevant sequence parts reflect expectations from biology and chemistry, both in the embedding layer and inside of the model, where we identified transformer heads with a statistically significant correspondence of attribution maps with ground truth sequence annotations (e.g. transmembrane regions, active sites) across many proteins. Availability and Implementation: Source code can be accessed at https://github.com/markuswenzel/xai-proteins .
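To make the idea of integrated gradients (IG) concrete, the following is a minimal sketch on a toy model, not the authors' implementation. It assumes a hypothetical "classifier" that applies a sigmoid to a weighted sum over token embeddings, an all-zeros baseline, and a midpoint Riemann approximation of the path integral; the names `model`, `model_grad`, and `integrated_gradients` are illustrative only. Summing the attributions over the embedding dimension gives one relevance value per token, analogous to a relevance value per amino acid.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w):
    # Toy stand-in for a finetuned classifier output (assumption):
    # sigmoid of the weighted sum over all tokens and embedding dims.
    return sigmoid(np.sum(x * w))

def model_grad(x, w):
    # Analytical gradient of the toy model with respect to the input x.
    s = model(x, w)
    return s * (1.0 - s) * w

def integrated_gradients(x, baseline, w, steps=256):
    # Midpoint Riemann approximation of the IG path integral along the
    # straight line from the baseline to the input.
    alphas = (np.arange(steps) + 0.5) / steps
    grad_sum = np.zeros_like(x)
    for alpha in alphas:
        grad_sum += model_grad(baseline + alpha * (x - baseline), w)
    return (x - baseline) * grad_sum / steps

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))      # 6 tokens with 4-dimensional embeddings
w = rng.normal(size=(6, 4))      # fixed toy weights
baseline = np.zeros_like(x)      # all-zeros baseline (assumption)

attr = integrated_gradients(x, baseline, w)
per_token = attr.sum(axis=1)     # one relevance value per token

# Completeness check: attributions sum to f(x) - f(baseline), up to
# the numerical error of the Riemann approximation.
gap = attr.sum() - (model(x, w) - model(baseline, w))
print(per_token, gap)
```

The completeness check at the end is a useful sanity test for any IG implementation: if the summed attributions deviate noticeably from the difference in model output, the number of integration steps is too small.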

Paper Structure (22 sections, 1 equation, 12 figures, 5 tables)

Figures (12)

  • Figure 1: Illustration of the experimental design. Top: From the amino acid sequence, the finetuned transformer model infers the applicable Gene Ontology (GO) terms (represented as a multi-label class membership vector). (The depicted exemplary "catalase-3" should be labeled with the GO terms "catalase activity" as "molecular function", "response to hydrogen peroxide" as "biological process", "cytoplasm" as "cellular component", etc.; about 5K of about 45K GO terms were considered.) Center: Relevance indicative of a selected GO term was attributed to the amino acids of each protein and correlated with the corresponding per-amino-acid annotations. This correlation between relevance attributions and annotations was then statistically assessed across the proteins of the test data set. The analysis was conducted for the embedding layer and "inside" the model, for each head in each layer, and was repeated for different GO terms (see \ref{subsec:revealing}). Bottom: Specific amino acids of a protein are annotated in sequence databases like UniProt because they serve as binding or active sites, are located in the cell membrane, etc. Active sites can, e.g., be found at the histidine ("H" at position 65) and asparagine ("N" at position 138) of the depicted protein (https://alphafold.ebi.ac.uk/entry/O48560; protein structure prediction created by https://alphafold.com/, "AlphaFold Data Copyright (2022) DeepMind Technologies Limited", under the license https://creativecommons.org/licenses/by/4.0/; jumper2021highly, varadi2021alphafold).
  • Figure 2: Visualization of the explainability method based on IG, which can attribute relevance to sequence tokens (here: amino acids) separately for each head and layer of the transformer (vaswani2017attention).
  • Figure 3: Attribution maps for the embedding layer of ProtBert finetuned to GO term classification were correlated with sequence annotations. Relevance attributions indicative of the GO label "membrane" correlate significantly with UniProt annotations as "transmembrane regions" (p < 0.05, i.e. above the blue line). No attribution-annotation correlation was observed for GO "catalytic activity" and "binding". The numbers of test split samples that are both labeled with the GO term and annotated per amino acid are listed below the x-axis.
  • Figure 4: Attribution maps calculated for the embedding layer of ProtBert finetuned to "EC50 level L1" classification were correlated with UniProt sequence annotations. Left: Relevance attributions correlated significantly (p < 0.05, i.e. above the blue line) with "active sites" and "binding sites" for five out of six EC classes, and with "transmembrane regions" and "short sequence motifs" for two and three EC classes, respectively. Right: Numbers of annotated samples in the test split per annotation type and EC class.
  • Figure 5: Inside ProtBert; GO "membrane" (GO:0016020). Left: The relevance attribution (along the sequences) indicative of the GO term "membrane" was correlated with UniProt annotations as "transmembrane regions", for each transformer head and layer. Biserial correlation coefficients (r), obtained for each attribution-annotation pair, were aggregated in population statistics with Wilcoxon signed-rank tests. The resulting p-values were adjusted with the Benjamini/Hochberg method to account for the multiple hypothesis tests conducted and to limit the false discovery rate. A significance threshold was applied (family-wise error rate of 0.05). The negative logarithm of the corrected and thresholded p-values is displayed. All colored pixels indicate statistically significant results. Center: ProtBert heads with a significantly positive relevance (sum along the sequence; indicative of the GO term "membrane") were singled out with the Wilcoxon signed-rank test. The matrix plots show the negative logarithm of the resulting p-values (adjusted with the Benjamini/Hochberg method and thresholded). Right: ProtBert heads with a significantly positive attribution-annotation correlation (p-values from Wilcoxon signed-rank tests plotted) that are also characterized by a significantly positive relevance (the latter overlaid as a mask). Only results for UniProt "transmembrane regions" are shown; the results for "active/binding sites", "motifs", and PROSITE patterns are omitted because they did not feature heads with both a significantly positive relevance and a significantly positive attribution-annotation correlation.
  • ...and 7 more figures
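
The statistical procedure described in the Figure 5 caption can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the data, the number of "heads", and the helper `benjamini_hochberg` are hypothetical, and SciPy's point-biserial correlation is used as a stand-in for the biserial correlation named in the caption. For each head, one correlation coefficient is computed per protein, the coefficients are tested across proteins with a one-sided Wilcoxon signed-rank test, and the resulting p-values are adjusted with the Benjamini/Hochberg method before thresholding.

```python
import numpy as np
from scipy.stats import pointbiserialr, wilcoxon

def benjamini_hochberg(pvals):
    # Benjamini/Hochberg adjusted p-values (step-up procedure).
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

rng = np.random.default_rng(0)
n_heads, n_proteins, seq_len = 4, 30, 50

# Synthetic binary annotation (e.g. "transmembrane region" yes/no).
annotation = np.zeros(seq_len, dtype=int)
annotation[20:30] = 1

pvals = np.empty(n_heads)
for head in range(n_heads):
    r_values = np.empty(n_proteins)
    for i in range(n_proteins):
        relevance = rng.normal(size=seq_len)
        if head == 0:
            # Only head 0 carries signal in this synthetic example.
            relevance = relevance + 2.0 * annotation
        r_values[i], _ = pointbiserialr(annotation, relevance)
    # One-sided test: are correlations across proteins shifted above 0?
    _, pvals[head] = wilcoxon(r_values, alternative="greater")

adjusted = benjamini_hochberg(pvals)
significant = adjusted < 0.05
# Negative log of the corrected p-values, masked where not significant.
neg_log_p = np.where(significant, -np.log10(adjusted), np.nan)
print(adjusted, neg_log_p)
```

In an actual analysis, the inner loop would run over test-set proteins with real attribution maps and UniProt annotations, and the outer loop over every head in every layer, so that the correction covers all tests at once.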