Knowledge-Aware Neuron Interpretation for Scene Classification
Yong Guan, Freddy Lecue, Jiaoyan Chen, Ru Li, Jeff Z. Pan
TL;DR
This work presents a knowledge-aware framework for neuron-level interpretation in scene classification by leveraging ConceptNet-derived core concepts (CC). It introduces MinMax-based NetDissect to map neurons to complete CC, Concept Filtering to fuse semantically equivalent concepts via KG embeddings, and Model Manipulation to verify and enhance model behavior using CC. Across ADE20k and Opensurfaces, the approach improves explanation quality and yields substantial model gains, with concept fusion achieving over 23% IoU improvements and CC-based retraining delivering meaningful accuracy boosts. The study demonstrates how external knowledge graphs can be integrated into neuron interpretation to both better explain decisions and actively improve model performance.
Abstract
Although neural models have achieved remarkable performance, they are still met with doubt because of their lack of transparency. As a result, explaining model predictions is attracting increasing attention. However, current methods rarely incorporate external knowledge and still suffer from three limitations: (1) Neglecting concept completeness. Merely selecting concepts may not be sufficient for prediction. (2) Lacking concept fusion. Semantically equivalent concepts are not merged. (3) Difficulty in manipulating model behavior. Explanations are not verified against the original model. To address these issues, we propose a novel knowledge-aware neuron interpretation framework to explain model predictions for image scene classification. Specifically, for concept completeness, we present core concepts of a scene, derived from the knowledge graph ConceptNet, to gauge the completeness of concepts. By incorporating complete concepts, our method provides better prediction explanations than the baselines. Furthermore, for concept fusion, we introduce a knowledge graph-based method called Concept Filtering, which yields a gain of over 23 percentage points on neuron behaviors for neuron interpretation. Finally, we propose Model Manipulation, which studies whether the core concepts based on ConceptNet can be employed to manipulate model behavior. The results show that core concepts can effectively improve the performance of the original model by over 26%.
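To make the concept-fusion idea concrete, the following is a minimal sketch of how merging semantically equivalent concepts can raise the IoU between a neuron's activation mask and a concept mask. It is an illustration under stated assumptions, not the authors' implementation: the helper names `iou` and `fuse_masks`, the pixel-wise OR fusion rule, the toy masks, and the choice of "sofa" and "couch" as an equivalent pair are all hypothetical. In the paper, the decision about which concepts are equivalent comes from knowledge graph embeddings; here that grouping is simply taken as given.

```python
from typing import Iterable, List

Mask = List[bool]  # flattened binary mask, one entry per pixel


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two equally sized binary masks."""
    if len(a) != len(b):
        raise ValueError("masks must have the same length")
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0


def fuse_masks(masks: Iterable[Mask]) -> Mask:
    """Pixel-wise OR of several concept masks (assumed fusion rule)."""
    masks = list(masks)
    if not masks:
        raise ValueError("need at least one mask to fuse")
    return [any(pixels) for pixels in zip(*masks)]


# Toy 1x6 "image": True where the neuron fires above its threshold.
neuron = [True, True, True, True, False, False]

# Two concept annotations assumed to be semantically equivalent.
concepts = {
    "sofa": [True, True, False, False, False, False],
    "couch": [False, False, True, True, False, False],
}

iou_sofa = iou(neuron, concepts["sofa"])
iou_couch = iou(neuron, concepts["couch"])
iou_fused = iou(neuron, fuse_masks(concepts.values()))

print(f"sofa: {iou_sofa:.2f}, couch: {iou_couch:.2f}, fused: {iou_fused:.2f}")
```

In this toy case each concept alone covers only half of the neuron's active region, so scoring them separately understates how well the neuron tracks the underlying concept; scoring the fused mask removes that split.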
