Language Model Meets Prototypes: Towards Interpretable Text Classification Models through Prototypical Networks

Ximing Wen

TL;DR

This work addresses interpretability in transformer-based NLP by integrating prototype networks with LM encoders. It first develops a Transformer-prototype model for contextual sarcasm detection that uses semantic prototypes and an incongruity loss to capture explicit versus implicit sentiment. It then introduces a white-box Graph Attention prototype network for text classification, achieving competitive accuracy with interpretable prototype-based explanations and visualizations. Finally, it outlines extending the framework to document classification via graph neural networks and contrastive learning to strengthen interpretability without sacrificing performance.

Abstract

Pretrained transformer-based Language Models (LMs) are well known for achieving significant improvements on NLP tasks, but their black-box nature and the resulting lack of interpretability remain a major concern. My dissertation uses prototypical networks to develop intrinsically interpretable models that use LMs as encoders while maintaining their strong performance. I began by investigating how to improve the performance of interpretable models for sarcasm detection. My proposed approach captures sentiment incongruity to improve accuracy while offering instance-based explanations for classification decisions. I then developed a novel white-box multi-head graph attention-based prototype network designed to explain the decisions of text classification models without sacrificing the accuracy of the original black-box LMs. I am now extending the attention-based prototype network with contrastive learning to redesign an interpretable graph neural network, aiming to improve both interpretability and performance in document classification.
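
The general idea of a prototype-based classifier that yields instance-based explanations can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the hand-written 2-D vectors stand in for LM encoder outputs, the similarity function `exp(-squared distance)` is one common choice rather than the one used in this work, and the names `prototype_similarities`, `classify`, `prototype_texts`, and `class_weights` are hypothetical.

```python
import math

def squared_distance(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def prototype_similarities(embedding, prototypes):
    # Assumed similarity: exp(-squared distance), which lies in (0, 1].
    return [math.exp(-squared_distance(embedding, p)) for p in prototypes]

def classify(embedding, prototypes, class_weights):
    # Class logits are a linear combination of prototype similarities.
    sims = prototype_similarities(embedding, prototypes)
    logits = [sum(w * s for w, s in zip(row, sims)) for row in class_weights]
    predicted = max(range(len(logits)), key=logits.__getitem__)
    # The most similar prototype serves as the instance-based explanation.
    nearest = max(range(len(sims)), key=sims.__getitem__)
    return predicted, nearest, sims

# Toy setup (hypothetical): two prototypes, each tied to a training example.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
prototype_texts = ["oh great, another delay (sarcastic)",
                   "thanks for the quick reply (sincere)"]
class_weights = [[1.0, 0.0],   # class 0 relies on prototype 0
                 [0.0, 1.0]]   # class 1 relies on prototype 1

embedding = [0.9, 0.1]  # stand-in for an LM encoding of the input text
predicted, nearest, sims = classify(embedding, prototypes, class_weights)
print("predicted class:", predicted)
print("explained by prototype:", prototype_texts[nearest])
```

Because the logits depend only on similarities to prototypes, each prediction can be traced back to the training examples the prototypes represent, which is what makes this family of models interpretable by construction.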

Paper Structure

This paper contains 11 sections, 1 figure.

Figures (1)

  • Figure 1: Illustration of prototype architecture for text classification