Bridging the Human-AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero

Lisa Schut; Nenad Tomasev; Tom McGrath; Demis Hassabis; Ulrich Paquet; Been Kim

TL;DR

This work addresses the gap between human knowledge and super-human AI knowledge by extracting new concepts embedded in the latent space and search process of AlphaZero (AZ). It introduces a convex-optimization framework to uncover both static and dynamic chess concepts, then applies teachability and novelty filters to retain only concepts that are teachable and novel relative to human data. Human experts are then taught via concept prototypes to assess learnability and application, with four grandmasters showing improvements after exposure to AZ-derived concepts. The study demonstrates a feasible pathway for translating machine-encoded knowledge into human expertise, offering a blueprint for human-AI knowledge transfer across domains and highlighting differences in priors, objectives, and computational budgets between humans and AI systems.

Abstract

Artificial Intelligence (AI) systems have made remarkable progress, attaining super-human performance across various domains. This presents us with an opportunity to further human knowledge and improve human expert performance by leveraging the hidden knowledge encoded within these highly performant AI systems. Yet, this knowledge is often hard to extract, and may be hard to understand or learn from. Here, we show that this is possible by proposing a new method that allows us to extract new chess concepts in AlphaZero, an AI system that mastered the game of chess via self-play without human supervision. Our analysis indicates that AlphaZero may encode knowledge that extends beyond existing human knowledge but is ultimately not beyond human grasp, and can be successfully learned from. In a human study, we show that these concepts are learnable by top human experts, as four top chess grandmasters show improvements in solving the presented concept prototype positions. This marks an important first milestone in advancing the frontier of human knowledge by leveraging AI, a development that could bear profound implications and help us shape how we interact with AI systems across many AI applications.

Paper Structure (62 sections, 19 equations, 29 figures)

Introduction
Related work
Concept-based explanations
Generating explanations in Reinforcement Learning
Chess and Artificial Intelligence
What are concepts?
Discovering concepts
Excavating concept vectors
Concept constraints for static concepts
Concept constraints for dynamic concepts
Filtering concepts
Teachability
Selecting prototypes
Teaching and measuring learning
Novelty
...and 47 more sections

Figures (29)

Figure 1: Learning from machine-unique knowledge.
Figure 2: Example of a concept prototype. Most chess players would opt for Rxh5; however, AZ plays Qc1, with the idea of regrouping the pieces to the queenside. Further details can be found in §\ref{appx:proto_human}.
Figure 3: Contrasting the optimal rollout with subpar MCTS rollouts at different time steps. The green rollout is the optimal one, and the red rollouts depict subpar trajectories. At each time step, MCTS finds subpar trajectories. We include each of these pairs in the concept constraints.
Figure 4: Teachability: AZ Concepts. The y-axis shows how often the student and teacher select the same move (normalised version of Equation \ref{eq:teachability_train}), and the x-axis shows the training time step. The dark dotted lines mark the AZ training checkpoint whose performance on the concept set matches that of our student. Each plot is a different concept found in layer $19$ (top) and in layer $23$ (bottom).
Figure 5: Filtering concepts based on novelty scores. Concepts for which the reconstruction error using AZ's basis vectors is less than the reconstruction error using the human games' basis vectors for every $k$ are accepted (not filtered). The darker green and blue lines show the average over the accepted and rejected concepts.
...and 24 more figures
