Improving Prediction Performance and Model Interpretability through Attention Mechanisms from Basic and Applied Research Perspectives
Shunsuke Kitada
TL;DR
This work tackles the interpretability gap in deep learning by exploring attention mechanisms as a means to both boost prediction performance and provide meaningful explanations. It introduces adversarial training for attention (AT) and its interpretable variant (iAT), along with semi-supervised extensions (VAT and iVAT), and reports improved accuracy and closer alignment with human word-level explanations across NLP tasks on public datasets. In applied research, the dissertation develops multi-task and conditional-attention frameworks for ad conversion prediction and a hazard-function approach for predicting ad discontinuation, enabling word-level interpretability and real-world operational support at scale. Overall, the study argues that attention-based explanations can be robustly integrated into real-world systems, enhancing trust and decision-making in domains ranging from language processing to computational advertising.
Abstract
With the dramatic advances in deep learning technology, machine learning research is focusing on improving the interpretability of model predictions as well as prediction performance, in both basic and applied research. While deep learning models achieve much higher prediction performance than traditional machine learning models, their specific prediction process remains difficult to interpret or explain. This is known as the black-box problem of machine learning models, and it is recognized as particularly important in a wide range of fields, including manufacturing, commerce, robotics, and other industries where the use of such technology has become commonplace, as well as the medical field, where mistakes are not tolerated. This bulletin is based on the summary of the author's dissertation. The research summarized in the dissertation focuses on the attention mechanism, which has been studied intensively in recent years, and discusses its potential from two perspectives: basic research, in terms of improving prediction performance and interpretability, and applied research, in terms of evaluating it in real-world applications using large datasets beyond the laboratory environment. The dissertation concludes with a summary of the implications of these findings for subsequent research and of future prospects in the field.
