Advancing Interpretability in Text Classification through Prototype Learning

Bowen Wei, Ziwei Zhu

TL;DR

ProtoLens is a novel prototype-based model that provides fine-grained, sub-sentence level interpretability for text classification and outperforms both prototype-based and non-interpretable baselines on multiple text classification benchmarks.

Abstract

Deep neural networks have achieved remarkable performance in various text-based tasks but often lack interpretability, making them less suitable for applications where transparency is critical. To address this, we propose ProtoLens, a novel prototype-based model that provides fine-grained, sub-sentence level interpretability for text classification. ProtoLens uses a Prototype-aware Span Extraction module to identify relevant text spans associated with learned prototypes and a Prototype Alignment mechanism to ensure prototypes are semantically meaningful throughout training. By aligning the prototype embeddings with human-understandable examples, ProtoLens provides interpretable predictions while maintaining competitive accuracy. Extensive experiments demonstrate that ProtoLens outperforms both prototype-based and non-interpretable baselines on multiple text classification benchmarks. Code and data are available at https://anonymous.4open.science/r/ProtoLens-CE0B/.
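To make the general idea of prototype-based span matching concrete, here is a minimal sketch. It is not the ProtoLens implementation: the abstract does not specify how spans are extracted or scored, so this sketch assumes fixed-width token windows with mean pooling, cosine similarity, and a linear layer over the per-prototype best scores. All function names and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def span_embeddings(token_vecs, width):
    """Mean-pool every contiguous window of `width` tokens.

    Assumption: fixed-width windows stand in for a learned span extractor.
    """
    if not token_vecs:
        raise ValueError("token_vecs must be non-empty")
    dim = len(token_vecs[0])
    spans = []
    for start in range(max(1, len(token_vecs) - width + 1)):
        window = token_vecs[start:start + width]
        spans.append([sum(vec[d] for vec in window) / len(window) for d in range(dim)])
    return spans


def prototype_activations(token_vecs, prototypes, width=2):
    """For each prototype, return (best similarity, start index of the best span)."""
    spans = span_embeddings(token_vecs, width)
    activations = []
    for proto in prototypes:
        sims = [cosine(span, proto) for span in spans]
        best = max(range(len(sims)), key=sims.__getitem__)
        activations.append((sims[best], best))
    return activations


def classify(token_vecs, prototypes, weights, bias=0.0, width=2):
    """Binary probability from a linear layer over the prototype similarities."""
    activations = prototype_activations(token_vecs, prototypes, width)
    logit = bias + sum(w * sim for w, (sim, _) in zip(weights, activations))
    return 1.0 / (1.0 + math.exp(-logit)), activations


# Toy input: four 2-d token vectors and two prototypes (illustrative values only).
tokens = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
prototypes = [[1.0, 0.0], [0.0, 1.0]]
prob, acts = classify(tokens, prototypes, weights=[2.0, -2.0])
```

In this sketch, the start index returned alongside each similarity is what lets a prediction be traced back to a specific piece of the input, which is the kind of sub-sentence evidence the abstract describes.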

Paper Structure

This paper contains 32 sections, 15 equations, 16 figures, 2 tables.

Figures (16)

  • Figure 1: Interpretable classification by ProtoLens.
  • Figure 2: Model Structure.
  • Figure 3: Sampled Aligned interpretation of prototypes with corresponding text sentences.
  • Figure 4: Case study of a positive class text instance.
  • Figure 5: Case study of a negative class text instance.
  • ...and 11 more figures