Advancing Interpretability in Text Classification through Prototype Learning

Bowen Wei; Ziwei Zhu

TL;DR

ProtoLens is a novel prototype-based model that provides fine-grained, sub-sentence level interpretability for text classification and outperforms both prototype-based and non-interpretable baselines on multiple text classification benchmarks.

Abstract

Deep neural networks have achieved remarkable performance in various text-based tasks but often lack interpretability, making them less suitable for applications where transparency is critical. To address this, we propose ProtoLens, a novel prototype-based model that provides fine-grained, sub-sentence level interpretability for text classification. ProtoLens uses a Prototype-aware Span Extraction module to identify relevant text spans associated with learned prototypes and a Prototype Alignment mechanism to ensure prototypes are semantically meaningful throughout training. By aligning the prototype embeddings with human-understandable examples, ProtoLens provides interpretable predictions while maintaining competitive accuracy. Extensive experiments demonstrate that ProtoLens outperforms both prototype-based and non-interpretable baselines on multiple text classification benchmarks. Code and data are available at https://anonymous.4open.science/r/ProtoLens-CE0B/.
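To make the idea of prototype-based prediction concrete, the following is a minimal sketch of the general pattern, not the ProtoLens implementation. It assumes that span embeddings and prototype vectors are already available as plain lists of floats, scores each prototype by its best-matching span using cosine similarity, and feeds those scores to a linear layer. The function names (`cosine`, `prototype_scores`, `classify`), the toy vectors, and the weight matrix are all hypothetical; the abstract does not specify how the Prototype-aware Span Extraction module or the Prototype Alignment mechanism are actually computed.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def prototype_scores(span_embeddings, prototypes):
    """For each prototype, return (best similarity, index of the best-matching span)."""
    results = []
    for proto in prototypes:
        sims = [cosine(span, proto) for span in span_embeddings]
        best = max(range(len(sims)), key=sims.__getitem__)
        results.append((sims[best], best))
    return results


def classify(span_embeddings, prototypes, class_weights):
    """Predict a class from prototype activations.

    class_weights[c][k] is the (assumed) weight of prototype k for class c.
    Returns the predicted class index and the per-prototype (score, span index)
    pairs, which serve as the explanation for the prediction.
    """
    scores = prototype_scores(span_embeddings, prototypes)
    activations = [score for score, _ in scores]
    logits = [sum(w * a for w, a in zip(row, activations)) for row in class_weights]
    predicted = max(range(len(logits)), key=logits.__getitem__)
    return predicted, scores


# Toy inputs (hypothetical values, chosen only for illustration).
spans = [[1.0, 0.0], [0.7, 0.7]]               # two candidate text spans
prototypes = [[1.0, 0.1], [0.1, 1.0]]          # two learned prototypes
class_weights = [[1.0, -1.0], [-1.0, 1.0]]     # two classes

predicted, evidence = classify(spans, prototypes, class_weights)
print("predicted class:", predicted)
for k, (score, span_idx) in enumerate(evidence):
    print(f"prototype {k}: best span {span_idx}, similarity {score:.3f}")
```

The span index returned alongside each score is what makes this style of model inspectable: every prototype activation can be traced back to a specific piece of the input text.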
