ILLC: Iterative Layer-by-Layer Compression for Enhancing Structural Faithfulness in SpArX
Ungsik Kim
TL;DR
This work tackles the challenge of compressing deep neural networks without sacrificing structural fidelity or explanatory power. It introduces Iterative Layer-by-Layer Compression (ILLC), a layer-wise scheme that recalibrates each layer before compressing the next, achieving lower input-output and structural unfaithfulness while preserving the argumentative reasoning captured by the SpArX/QBAF framework. Empirical results on the Breast Cancer Diagnosis dataset show consistent improvements across global and local explanations and reveal phenomena such as Bi-Local-Maxima and Dead Neurons, which ILLC mitigates. The approach promises more compact, faithful models for high-stakes domains where transparent internal reasoning is required, with potential extensions to other architectures and skip-connected networks.
Abstract
In the field of Explainable Artificial Intelligence (XAI), argumentative XAI approaches have been proposed to represent the internal reasoning process of deep neural networks more transparently by interpreting hidden nodes as arguments. However, existing compression methods simplify all layers at once, which leads to high cumulative information loss as the number of layers increases. To address this, we propose an iterative layer-by-layer compression technique in which each layer is compressed separately and the error this introduces in the next layer is compensated for immediately, thereby improving the overall input-output and structural fidelity of the model. Experiments on the Breast Cancer Diagnosis dataset show that, compared to compressing all layers at once, the method reduces input-output and structural unfaithfulness and maintains a more consistent attack-support relationship in the Argumentative Explanation scheme. This is significant because it provides a new way to make complex MLP models more compact while still conveying their internal inference logic without distortion.
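
The abstract does not spell out the compression or compensation operators, so the following is only a minimal NumPy sketch of the iterative idea under stated assumptions. Hidden neurons are grouped by k-means on their activation profiles and merged by averaging their incoming weights and biases, in the spirit of SpArX. The compensation step is approximated here by a least-squares refit of the weights feeding the next layer, which is a stand-in chosen for illustration and may differ from the paper's procedure. The function names (`forward`, `cluster_neurons`, `compress_layer`, `illc`) are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward(X, weights, biases):
    """Return activations of every layer (input included) of a ReLU MLP.
    The last layer is kept linear."""
    acts = [X]
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = acts[-1] @ W + b
        acts.append(z if i == len(weights) - 1 else relu(z))
    return acts

def cluster_neurons(H, k, iters=20, seed=0):
    """Plain k-means over neuron activation profiles (the columns of H)."""
    rng = np.random.default_rng(seed)
    P = H.T                                   # one row per neuron
    centers = P[rng.choice(len(P), size=k, replace=False)]
    for _ in range(iters):
        d = ((P[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = P[labels == c].mean(0)
    return labels

def compress_layer(X, weights, biases, l, k):
    """Shrink hidden layer l to at most k neurons, then recalibrate the
    weights into layer l+1 (ASSUMPTION: least-squares refit)."""
    acts = forward(X, weights, biases)
    H = acts[l + 1]                           # activations of the layer to shrink
    target = H @ weights[l + 1]               # signal the next layer receives now
    labels = cluster_neurons(H, k)
    groups = [np.flatnonzero(labels == c) for c in np.unique(labels)]
    # Merge each cluster: average incoming weights and biases.
    W_in = np.stack([weights[l][:, g].mean(1) for g in groups], axis=1)
    b_in = np.array([biases[l][g].mean() for g in groups])
    H_new = relu(acts[l] @ W_in + b_in)
    # Compensate immediately: refit outgoing weights to reproduce `target`.
    W_out, *_ = np.linalg.lstsq(H_new, target, rcond=None)
    new_w, new_b = list(weights), list(biases)
    new_w[l], new_b[l], new_w[l + 1] = W_in, b_in, W_out
    return new_w, new_b

def illc(X, weights, biases, ks):
    """Compress hidden layers one at a time; each step sees the network
    as already compressed and recalibrated by the previous steps."""
    for l, k in enumerate(ks):
        weights, biases = compress_layer(X, weights, biases, l, k)
    return weights, biases
```

As a usage example, given weight matrices `W` and bias vectors `B` for a 5-8-6-1 MLP and a data matrix `X`, calling `illc(X, W, B, ks=[4, 3])` returns a network with at most 4 and 3 hidden neurons. The point of the loop in `illc` is that clustering for the second hidden layer uses activations of the already-compressed first layer, rather than those of the original network, which is what distinguishes the iterative scheme from compressing all layers at once.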
