Improving Prediction Performance and Model Interpretability through Attention Mechanisms from Basic and Applied Research Perspectives

Shunsuke Kitada

TL;DR

This work tackles the interpretability gap in deep learning by exploring attention mechanisms as a means both to boost prediction performance and to provide meaningful explanations. It introduces adversarial training for attention (AT) and its interpretable variant (iAT), along with semi-supervised extensions based on virtual adversarial training (VAT and iVAT), and demonstrates improved accuracy and closer alignment with human word-level explanations across NLP tasks on public datasets. In applied research, the dissertation develops a multi-task, conditional-attention framework for ad conversion prediction and a hazard-function approach for predicting ad discontinuation, enabling word-level interpretability and operational support at real-world scale. Overall, the study argues that attention-based explanations can be robustly integrated into real-world systems, enhancing trust and decision-making in domains ranging from language processing to computational advertising.

Abstract

With the dramatic advances in deep learning technology, machine learning research is focusing on improving the interpretability of model predictions as well as prediction performance in both basic and applied research. Although deep learning models achieve much higher prediction performance than traditional machine learning models, their prediction process remains difficult to interpret or explain. This is known as the black-box problem of machine learning models, and it is recognized as particularly important in a wide range of fields, including manufacturing, commerce, robotics, and other industries where such technology has become commonplace, as well as the medical field, where mistakes are not tolerated. This bulletin is based on a summary of the author's dissertation. The research summarized in the dissertation focuses on the attention mechanism, which has been widely studied in recent years, and discusses its potential both for basic research, in terms of improving prediction performance and interpretability, and for applied research, in terms of evaluating it in real-world applications using large datasets beyond the laboratory environment. The dissertation concludes with a summary of the implications of these findings for subsequent research and future prospects in the field.

Paper Structure

This paper contains 16 sections, 6 equations, and 4 figures.

Figures (4)

  • Figure 1: Example attention heatmaps on a sentence from the Stanford Sentiment Treebank (SST) (Socher et al., 2013), for a BiRNN model with attention mechanisms and for the model with attention mechanisms trained with adversarial training. The proposed adversarial training for attention mechanisms helps the model learn cleaner attention.
  • Figure 2: Intuitive illustration of the proposed VAT for attention mechanisms. Our technique can learn clearer attention by overcoming adversarial perturbations $\bm{r}_{\texttt{VAT}}$, thereby improving model interpretability.
  • Figure 3: Outline of the proposed framework. In the framework, we propose two strategies: multi-task learning, which simultaneously predicts conversions and clicks, and a conditional attention mechanism, which detects important representations in ad creative text according to the text's attributes.
  • Figure 4: Outline of our framework, which exploits a hazard function, drawing on the idea of survival prediction, to predict the discontinuation of ad creatives. The input comprises four types of features: text, categorical, image, and numerical. The output is the hazard probability, which indicates whether the target ad creative is discontinued in each time interval.
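
The per-interval hazard output described for Figure 4 can be read through the standard discrete-time survival relations. The sketch below is a minimal illustration of those textbook relations only, not the dissertation's implementation: the function names and the example hazard values are hypothetical, and it assumes each hazard is the probability of discontinuation in an interval given that the ad creative was still running at the start of that interval.

```python
from typing import List


def survival_curve(hazards: List[float]) -> List[float]:
    """Probability that the ad creative is still running after each interval.

    Standard discrete-time relation: S_t = prod_{k <= t} (1 - h_k).
    """
    survival = []
    alive = 1.0  # probability of still running before the first interval
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must be a probability in [0, 1]")
        alive *= 1.0 - h
        survival.append(alive)
    return survival


def discontinuation_probs(hazards: List[float]) -> List[float]:
    """Unconditional probability of discontinuation in each interval.

    P(T = t) = h_t * S_{t-1}, with S_0 = 1.
    """
    probs = []
    alive = 1.0
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must be a probability in [0, 1]")
        probs.append(alive * h)
        alive *= 1.0 - h
    return probs


# Hypothetical hazard probabilities for three consecutive time intervals.
example_hazards = [0.5, 0.5, 0.0]
example_survival = survival_curve(example_hazards)
example_discontinuation = discontinuation_probs(example_hazards)
```

Under these assumptions, the discontinuation probabilities over all intervals plus the final survival probability sum to one, which is a quick sanity check on any predicted hazard sequence.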