DLBacktrace: A Model Agnostic Explainability for any Deep Learning Models
Vinay Kumar Sankarapu, Chintan Chitroda, Yashwardhan Rathore, Neeraj Kumar Singh, Pratinav Seth
TL;DR
This work tackles the opacity of deep learning by introducing DLBacktrace, a model-agnostic explainability method that traces relevance from output back to input across CNNs, transformers, and other architectures. It presents two modes, Default and Contrastive, and extends relevance propagation to attention layers, delivering deterministic, architecture-agnostic explanations with an accompanying PyTorch/TensorFlow library. Through benchmarking on tabular (Lending Club), image (CIFAR-10 with ResNet-34), and text (SST-2 with BERT) tasks, DLBacktrace demonstrates robust, informative explanations that improve interpretability and support trust and regulatory compliance, albeit with higher computational cost in some settings. The paper discusses practical implications for network analysis, feature importance, and fairness, and points to future work in scaling to complex transformers, multimodal models, and real-time deployment.
Abstract
The rapid growth of AI has led to more complex deep learning models, which often operate as opaque "black boxes" with limited transparency in their decision-making. This lack of interpretability poses challenges, especially in high-stakes applications where understanding model output is crucial. This work highlights the importance of interpretability in fostering trust, accountability, and responsible deployment. To address these challenges, we introduce DLBacktrace, a novel, model-agnostic technique designed to provide clear insights into deep learning model decisions across a wide range of domains and architectures, including MLPs, CNNs, and Transformer-based LLMs. We present a comprehensive overview of DLBacktrace and benchmark its performance against established interpretability methods such as SHAP, LIME, and GradCAM. Our results demonstrate that DLBacktrace effectively enhances understanding of model behavior across diverse tasks. DLBacktrace is compatible with models developed in both PyTorch and TensorFlow, supporting architectures such as BERT, ResNet, U-Net, and custom DNNs for tabular data. The library is open-sourced and available at https://github.com/AryaXAI/DLBacktrace.
