Interpreting Inflammation Prediction Model via Tag-based Cohort Explanation

Fanyu Meng, Jules Larke, Xin Liu, Zhaodan Kong, Xin Chen, Danielle Lemay, Ilias Tagkopoulos

TL;DR

This work proposes a novel framework for identifying cohorts within a dataset based on local feature importance scores, aiming to generate concise descriptions of the clusters via tags, and demonstrates that the framework can generate reliable explanations that match domain knowledge.

Abstract

Machine learning is revolutionizing nutrition science by enabling systems to learn from data and make intelligent decisions. However, the complexity of these models makes their decision-making processes hard to understand, which calls for explainability techniques that foster trust and increase model transparency. An under-explored type of explanation is cohort explanation, which explains groups of instances with similar characteristics. Unlike traditional methods that focus on individual explanations or global model behavior, cohort explainability bridges the gap by providing insights at an intermediate granularity. We propose a novel framework for identifying cohorts within a dataset based on local feature importance scores, aiming to generate concise descriptions of the clusters via tags. We evaluate our framework on a food-based inflammation prediction model and demonstrate that it can generate reliable explanations that match domain knowledge.
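To make the idea of a tag-defined cohort concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a matrix of local feature importance scores (one row per instance, for example SHAP values) and a hypothetical tag expressed as a boolean condition on the input features. The toy data, the tag, and the helper names `cohort_importance` and `mean_pairwise_distance` are all illustrative assumptions.

```python
import numpy as np


def cohort_importance(local_importance, member_mask):
    """Mean local importance vector over the members of one cohort."""
    return local_importance[member_mask].mean(axis=0)


def mean_pairwise_distance(local_importance, member_mask):
    """Mean Euclidean distance between importance vectors of distinct members."""
    members = local_importance[member_mask]
    n = len(members)
    if n < 2:
        return 0.0
    # Pairwise differences via broadcasting: shape (n, n, d).
    diffs = members[:, None, :] - members[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # The diagonal is zero, so dividing by n * (n - 1) averages distinct pairs.
    return float(dists.sum() / (n * (n - 1)))


# Toy data (hypothetical): 4 instances, 2 features.
features = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0], [0.0, 1.0]])
local_importance = np.array([[0.8, 0.0], [0.6, 0.0], [0.0, 0.5], [0.0, 0.7]])

# A hypothetical tag: "feature 0 is present".
tag = features[:, 0] > 0

in_cohort = cohort_importance(local_importance, tag)    # instances matching the tag
out_cohort = cohort_importance(local_importance, ~tag)  # all remaining instances
spread = mean_pairwise_distance(local_importance, tag)  # smaller means more similar
```

In this sketch a tag splits the instances into two groups whose averaged importance vectors differ, and the pairwise distance gives a rough sense of how similar the explanations inside a cohort are. How the framework actually selects tags and balances its objectives is described in the paper itself and is not reproduced here.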

Paper Structure

This paper contains 32 sections, 5 equations, 10 figures, 1 algorithm.

Figures (10)

  • Figure 1: The data distribution of a hypothetical medical disorder classification problem.
  • Figure 2: Different types of feature importance explanations for a hypothetical model.
  • Figure 3: The global and local SHAP importance of the inflammation prediction model. (a) Global importance; (b) distribution of local importance. In (b), we highlight a few food features whose importance distribution appears to have groupings. This motivates the use of cohort explanation to identify the members of those groupings.
  • Figure 4: Evaluating the quality of cohort explanation across different values of $k$. The shaded region depicts the standard deviation. (a) Compactness objective during optimization, which represents the mean inter-cohort pairwise distance on importance, averaged over all cohorts; (b) descriptiveness objective during optimization, which is the number of tags used by the cohort with the fewest tags; (c) importance prediction error. Note that (b) has no shaded region because the descriptiveness objective yields the same result across all 5 folds at every $k$.
  • Figure 5: Cohort importance and their descriptions using TagHort with $k=4$. The title of each subplot also includes the tags that describe the cohort.
  • ...and 5 more figures

Theorems & Definitions (2)

  • Definition 1: Tag-based cohort explanation
  • Definition 2: Importance prediction error