PCEvE: Part Contribution Evaluation Based Model Explanation for Human Figure Drawing Assessment and Beyond

Jongseo Lee; Geo Ahn; Seong Tae Kim; Jinwoo Choi

TL;DR

PCEvE introduces a part-based explanation framework for human figure drawing assessment that quantifies each predefined body part's contribution to a model decision using Shapley Values. By evaluating all $2^K$ possible part combinations and aggregating the results at the sample, class, and task levels, it yields intuitive part contribution histograms that align with human perception and go beyond pixel-level attributions. The approach is validated on ASD Screening, SCAT, and Stanford Cars, with additional sanity checks showing robustness to annotation quality and applicability to fine-grained visual categorization. The method holds promise for more transparent, hierarchical explanations in clinically aided assessment and beyond, with potential integration with language models for richer descriptive explanations.

Abstract

For automatic human figure drawing (HFD) assessment tasks, such as diagnosing autism spectrum disorder (ASD) using HFD images, the clarity and explainability of a model decision are crucial. Existing pixel-level attribution-based explainable AI (XAI) approaches demand considerable effort from users to interpret the semantic information of a region in an image, which can often be time-consuming and impractical. To overcome this challenge, we propose a part contribution evaluation based model explanation (PCEvE) framework. On top of part detection, we measure the Shapley Value of each individual part to evaluate its contribution to a model decision. Unlike existing attribution-based XAI approaches, the PCEvE provides a straightforward explanation of a model decision, i.e., a part contribution histogram. Furthermore, the PCEvE expands the scope of explanations beyond the conventional sample level to include class-level and task-level insights, offering a richer, more comprehensive understanding of model behavior. We rigorously validate the PCEvE via extensive experiments on multiple HFD assessment datasets. We also sanity-check the proposed method with a set of controlled experiments. Additionally, we demonstrate the versatility and applicability of our method to other domains by applying it to a photo-realistic dataset, Stanford Cars.
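
To make the idea of measuring the Shapley Value of each part concrete, the following is a minimal sketch of the standard exact Shapley computation over $K$ parts. It is not the authors' implementation: the function name `shapley_values`, the `value` callback, and the toy scores are all hypothetical. In a PCEvE-style setting, `value` would presumably return a model score (for example, a class logit) for the image in which only the given subset of parts is kept visible.

```python
from itertools import combinations
from math import factorial


def shapley_values(num_parts, value):
    """Exact Shapley value of each part.

    `value` is a hypothetical callback mapping a frozenset of part
    indices (the parts kept visible) to a scalar score.
    """
    K = num_parts
    phi = [0.0] * K
    for i in range(K):
        others = [j for j in range(K) if j != i]
        for size in range(K):
            # Standard Shapley weight |S|! (K - |S| - 1)! / K!
            weight = factorial(size) * factorial(K - size - 1) / factorial(K)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal gain from adding part i to the subset s
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_value(parts):
    """Toy score (assumption): additive per-part scores plus a bonus
    when parts 0 and 1 appear together."""
    base = {0: 1.0, 1: 2.0, 2: 3.0}
    score = sum(base[p] for p in parts)
    if 0 in parts and 1 in parts:
        score += 1.0
    return score


phi = shapley_values(3, toy_value)
print(phi)
```

In this toy example the interaction bonus is split evenly between parts 0 and 1, and the values sum to the score of the full part set minus the score of the empty set, which is the efficiency property that makes a Shapley-based histogram a complete account of the prediction. The loop evaluates every subset, so the cost grows as $2^K$.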

Paper Structure (26 sections, 5 equations, 15 figures, 1 table)

This paper contains 26 sections, 5 equations, 15 figures, 1 table.

Introduction
Related Work
Attribution-based Model Explanation
Shapley Value
Human Figure Drawing Assessment
Concept-based and Part-based Model Explanation
Part Contribution Evaluation Based Model Explanation
Preliminaries: Shapley Value
PCEvE Framework Overview
Sample-Level PCEvE
Class-Level and Task-Level PCEvE
Results
Datasets
Implementation Details
Training
...and 11 more sections

Figures (15)

Figure 1: Why do we need part-based model explanations for HFD assessment? We show a motivating example from a drawer gender classification task to highlight the contrast between the existing XAI approaches and the proposed approach. (a) The existing attribution-based XAI approaches visualize pixel-level attribution maps. However, these pixel-level attributions require users to infer which particular part the model relies on to recognize the image as 'Female'. This step demands a level of interpretation that might not be immediately intuitive. (b) In contrast, our part contribution evaluation based model explanation (PCEvE) furnishes users with more direct and interpretable insights into model decisions. The PCEvE provides a part contribution histogram that eliminates the need to infer which parts are crucial. (c) Furthermore, the PCEvE extends to more abstract insights, including class-level and task-level explanations. With sample-level, class-level, and task-level model explanations, we can understand model decision processes across various dimensions.
Figure 2: Analogy between PCEvE and game theory. (a) Game theory: by forming a coalition, players obtain a certain overall gain. The Shapley Value ensures a fair distribution of the payoff among players. In this example, player D contributes the most and receives the highest reward. (b) PCEvE: each body part corresponds to an individual player in (a). Each body part contributes to a model prediction, and the Shapley Value allows us to quantify the contribution of each body part.
Figure 3: Overview of PCEvE. The PCEvE explains a model decision by providing part contribution statistics at the sample, class, and task levels. (a) Sample-level PCEvE (S-PCEvE): given an input image with $K$ parts, we generate $2^K$ images by masking every subset of the parts, obtaining the set of all possible part combinations, $\mathbb{X}$. To evaluate the contribution of each part, the S-PCEvE aggregates the logit vectors that a model predicts for every image in $\mathbb{X}$, resulting in a sample-level part contribution histogram. (b) Class-level PCEvE (C-PCEvE): given a target class, e.g., 'Female', the C-PCEvE counts the most significant part for every image belonging to the class, resulting in a class-level part contribution histogram. (c) Task-level PCEvE (T-PCEvE): the T-PCEvE accumulates the class-level part contribution histograms of all classes in the task, giving a task-level part contribution histogram that explains the model at the task level.
Figure 4: Part combination image set generation process. We illustrate the process for generating $\mathbb{X}$, a comprehensive image set consisting of all possible combinations of parts. For clarity, we assume only three parts are of interest ($K=3$): 'Hair', 'Eye', and 'Nose'. (a) The table shows all eight (i.e., $2^K=8$) potential combinations derived from either including ('1') or excluding ('0') each of the three parts. (b) We visualize all part combination images to showcase how each image varies based on the presence or absence of specific parts. For instance, an image annotated as '101' contains 'Hair' and 'Nose' but omits 'Eye'. For the omitted part region, we fill in the average pixel value of the input image. The collection of the eight generated images forms the set $\mathbb{X}$.
Figure 5: Sample-level part-based model explanation on the ASD Screening and SCAT datasets. We show model explanations using GradCAM [selvaraju2017gradcam] and the PCEvE on (a) an 'ASD' sample and (b) a 'TD' sample of the ASD Screening [jongmin2022autism] dataset, and (c) a 'Male' sample and (d) a 'Female' sample of the SCAT-Drawing dataset. We normalize each value in the histogram by the maximum value within the sample.
...and 10 more figures
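
The part combination set $\mathbb{X}$ described in the Figure 4 caption and the class-level and task-level counting described in the Figure 3 caption can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the boolean-mask representation of parts, and the choice of a per-channel mean as the fill value are all hypothetical (the caption only says that the average pixel value of the input image is used).

```python
from collections import Counter
from itertools import product

import numpy as np


def part_combination_images(image, part_masks):
    """Build all 2^K include/exclude images for K parts.

    image: float array of shape (H, W, C).
    part_masks: list of K boolean arrays of shape (H, W), one per part.
    Returns a dict mapping a bit tuple such as (1, 0, 1) to an image.
    """
    # Assumption: fill excluded regions with the per-channel image mean
    fill = image.mean(axis=(0, 1))
    images = {}
    for bits in product((0, 1), repeat=len(part_masks)):
        img = image.copy()
        for keep, mask in zip(bits, part_masks):
            if not keep:
                img[mask] = fill  # mask out the excluded part
        images[bits] = img
    return images


def class_level_histogram(sample_contributions):
    """Count, over the samples of one class, the index of the part
    with the largest sample-level contribution."""
    return Counter(int(np.argmax(c)) for c in sample_contributions)


def task_level_histogram(class_histograms):
    """Accumulate class-level histograms over all classes of a task."""
    total = Counter()
    for hist in class_histograms:
        total.update(hist)
    return total


# Toy usage with K = 3 hypothetical parts: 'Hair', 'Eye', 'Nose'
image = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
hair = np.zeros((4, 4), dtype=bool); hair[0, :] = True
eye = np.zeros((4, 4), dtype=bool); eye[1, 1:3] = True
nose = np.zeros((4, 4), dtype=bool); nose[2, 2] = True
images = part_combination_images(image, [hair, eye, nose])

# Made-up sample-level contributions for two classes
class_a = class_level_histogram([[0.1, 0.9, 0.3], [0.5, 0.2, 0.1], [0.0, 0.7, 0.2]])
class_b = class_level_histogram([[0.2, 0.1, 0.8]])
task = task_level_histogram([class_a, class_b])
```

Here the key `(1, 0, 1)` corresponds to the '101' image of Figure 4, which keeps 'Hair' and 'Nose' and masks 'Eye'. Feeding each generated image to a model and passing the resulting scores to a Shapley computation would give the sample-level contributions that `class_level_histogram` consumes.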
