PepTriX: A Framework for Explainable Peptide Analysis through Protein Language Models

Vincent Schilling; Akshat Dubey; Georges Hattab

TL;DR

PepTriX tackles accuracy and interpretability in peptide classification by uniting 1D sequence context from ESM-2 with 3D structural context from ESMFold in a lightweight graph attention network (GAT) framework. The model uses contrastive learning and cross-modal co-attention to align the two modalities and provide motif-level explanations without fine-tuning large protein language models (PLMs). It achieves competitive performance across diverse datasets and delivers interpretable insights linking sequence features to structural determinants. The approach lowers computational barriers while offering domain experts actionable structural hypotheses for peptide design and discovery.

Abstract

Peptide classification tasks, such as predicting toxicity and HIV inhibition, are fundamental to bioinformatics and drug discovery. Traditional approaches rely heavily on handcrafted encodings of one-dimensional (1D) peptide sequences, which can limit generalizability across tasks and datasets. Recently, protein language models (PLMs), such as ESM-2 and ESMFold, have demonstrated strong predictive performance. However, they face two critical challenges. First, fine-tuning is computationally costly. Second, their complex latent representations hinder interpretability for domain experts. Additionally, many frameworks have been developed for one specific type of peptide classification and do not generalize to others. These limitations restrict the ability to connect model predictions to biologically relevant motifs and structural properties. To address them, we present PepTriX, a novel framework that integrates 1D sequence embeddings and three-dimensional (3D) structural features via a graph attention network enhanced with contrastive training and cross-modal co-attention. PepTriX automatically adapts to diverse datasets, producing task-specific peptide vectors while retaining biological plausibility. After evaluation by domain experts, we found that PepTriX performs remarkably well across multiple peptide classification tasks and provides interpretable insights into the structural and biophysical motifs that drive predictions. Thus, PepTriX offers both predictive robustness and interpretable validation, bridging the gap between performance-driven PLMs and domain-level understanding in peptide research.
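
To make the idea of cross-modal co-attention more concrete, the following is a minimal, standard-library-only sketch of how two per-residue feature matrices can attend to each other. It is not the PepTriX implementation: the function name `co_attention`, the scaled dot-product affinity, the absence of learned projection weights, and the random toy inputs standing in for ESM-2 embeddings and ESMFold-derived structural features are all assumptions made for illustration.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def co_attention(seq, struct):
    """Toy co-attention between two feature matrices (hypothetical helper).

    seq:    n_seq x d matrix, standing in for sequence embeddings
    struct: n_struct x d matrix, standing in for structural node features
    Returns (seq_context, struct_context, seq_to_struct_weights).
    """
    d = len(seq[0])
    scale = math.sqrt(d)
    # Affinity between every sequence position and every structure node.
    affinity = [[v / scale for v in row] for row in matmul(seq, transpose(struct))]
    # Each sequence position distributes attention over structure nodes.
    seq_to_struct = [softmax(row) for row in affinity]
    # Each structure node distributes attention over sequence positions.
    struct_to_seq = [softmax(row) for row in transpose(affinity)]
    seq_context = matmul(seq_to_struct, struct)   # n_seq x d
    struct_context = matmul(struct_to_seq, seq)   # n_struct x d
    return seq_context, struct_context, seq_to_struct


# Random toy inputs; real inputs would come from the two modalities.
random.seed(0)
n_residues, dim = 5, 4
seq_emb = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_residues)]
struct_emb = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_residues)]
seq_ctx, struct_ctx, weights = co_attention(seq_emb, struct_emb)
```

In this sketch, each row of `weights` is a probability distribution over structure nodes for one sequence position. Attention maps of this kind are one plausible route to the motif-level explanations described above, since large weights indicate which structural positions a given residue draws on.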
