Emergent Explainability: Adding a causal chain to neural network inference

Adam Perrett

TL;DR

The paper addresses the opacity of neural networks in high-stakes settings like healthcare and proposes emergent communication (EmCom) to achieve causal explainability. A contextualiser (trained via reinforcement learning) and an actor (trained via supervised learning) exchange task-informed messages that accompany outputs, enabling a causal grounding of decisions. In synthetic experiments with $n=3$ inputs and $256$ truth tables (and $1024$ training examples), the study shows that greater cross-agent information sharing improves generalization to unseen tasks, supporting the viability of causal, task-aware explanations and potential for distributed learning and transfer. The work suggests significant implications for healthcare xAI and beyond, while noting the need for further validation and the development of human-interpretable communications and broader real-world testing.

Abstract

This position paper presents a theoretical framework for enhancing explainable artificial intelligence (xAI) through emergent communication (EmCom), focusing on creating a causal understanding of AI model outputs. We explore the novel integration of EmCom into AI systems, offering a paradigm shift from conventional associative relationships between inputs and outputs to a more nuanced, causal interpretation. The framework aims to revolutionize how AI processes are understood, making them more transparent and interpretable. While the initial application of this model is demonstrated on synthetic data, the implications of this research extend beyond these simple applications. This general approach has the potential to redefine interactions with AI across multiple domains, fostering trust and informed decision-making in healthcare and in various sectors where AI's decision-making processes are critical. The paper discusses the theoretical underpinnings of this approach, its potential broad applications, and its alignment with the growing need for responsible and transparent AI systems in an increasingly digital world.

Paper Structure (14 sections, 3 figures)

Introduction and background
Opportunities and challenges of explainability
Emergent communication
Emergent explainability
Methodology
Experimental design
Agent setup
Communication protocol
Training
Evaluation
Work in Progress - Human interpretability
Experiments and Results
Examining the effect of actor overlap
Discussion and Limitations

Figures (3)

Figure 1: The contextualiser network receives the task ID (e.g. task = is there a dog) and passes a message to the actor network. The actor network processes the message and an input to produce an output (e.g. is there a dog + this image = yes/no). The actor is trained with supervised learning and the contextualiser is trained with reinforcement learning, as errors cannot propagate between networks.
Figure 2: An agent can act in three different capacities during training. It can contextualise, meaning it takes the task ID and produces a message. It can behave as the actor, meaning it uses the message and the input to produce the target output during these training instances. It can be an actor both for examples it contextualises and for ones other agents contextualise. Finally, a subset of the data is never seen by the agent; this subset is used to evaluate the generalisation of the generated language.
Figure 3: Classification accuracy on unseen data across all agents for truth table size $n=3$. The number of examples contextualised by an agent is fixed, and the parameter $r_a$, which determines the number of examples for which an agent is an actor but not a contextualiser, is varied. As the number of 'taught' examples increases, the generalisation of the communication also increases, tending towards 100% testing accuracy.
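
The task set and the three agent roles described in Figures 2 and 3 can be illustrated with a minimal sketch. It is not the author's implementation. It assumes that each task is one Boolean truth table over $n=3$ inputs, which gives $2^{2^3} = 256$ tables, matching the count in the TL;DR. The function names `enumerate_truth_tables` and `split_roles`, the split sizes, and the random seed are hypothetical, and the source does not state here whether the split is made over tasks or over individual examples, so the sketch simply partitions whatever list of items it is given.

```python
import itertools
import random

N_INPUTS = 3  # n = 3, as in the reported experiments


def enumerate_truth_tables(n):
    """Return every Boolean function of n inputs.

    Each function is a tuple of 2**n output bits, one per input row,
    so there are 2**(2**n) functions in total.
    """
    rows = 2 ** n
    return list(itertools.product((0, 1), repeat=rows))


def split_roles(items, n_contextualised, n_actor_only, seed=0):
    """Partition items into the three per-agent groups of Figure 2.

    Hypothetical helper: sizes and seed are illustrative assumptions.
    - contextualised: the agent produces the message (and may also act)
    - actor_only: the agent acts on messages from other agents
      (the quantity controlled by r_a in Figure 3)
    - unseen: held out to test generalisation of the language
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    contextualised = shuffled[:n_contextualised]
    actor_only = shuffled[n_contextualised:n_contextualised + n_actor_only]
    unseen = shuffled[n_contextualised + n_actor_only:]
    return contextualised, actor_only, unseen


tables = enumerate_truth_tables(N_INPUTS)
# 64 and 32 are arbitrary illustrative sizes, not values from the paper.
ctx, actor_only, unseen = split_roles(range(len(tables)), 64, 32)
print(len(tables), len(ctx), len(actor_only), len(unseen))  # 256 64 32 160
```

Increasing `n_actor_only` while holding `n_contextualised` fixed corresponds to the sweep over $r_a$ that Figure 3 reports, under the assumptions stated above.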
