Rethinking Teacher-Student Curriculum Learning through the Cooperative Mechanics of Experience

Manfred Diaz; Liam Paull; Andrea Tacchetti

TL;DR

This paper proposes a data-centric perspective for analyzing the mechanics of teacher-student interactions in TSCL, shedding light on how the framework operates and offering insights into its broader applicability in machine learning.

Abstract

Teacher-Student Curriculum Learning (TSCL) is a curriculum learning framework that draws inspiration from human cultural transmission and learning. It involves a teacher algorithm shaping the learning process of a learner algorithm by exposing it to controlled experiences. Despite its success, understanding the conditions under which TSCL is effective remains challenging. In this paper, we propose a data-centric perspective to analyze the underlying mechanics of the teacher-student interactions in TSCL. We leverage cooperative game theory to describe how the composition of the set of experiences presented by the teacher to the learner, as well as their order, influences the performance of the curriculum that is found by TSCL approaches. To do so, we demonstrate that for every TSCL problem, an equivalent cooperative game exists, and several key components of the TSCL framework can be reinterpreted using game-theoretic principles. Through experiments covering supervised learning, reinforcement learning, and classical games, we estimate the cooperative values of experiences and use value-proportional curriculum mechanisms to construct curricula, even in cases where TSCL struggles. The framework and experimental setup we present in this work represent a novel foundation for a deeper exploration of TSCL, shedding light on its underlying mechanisms and providing insights into its broader applicability in machine learning.
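As a rough illustration of the idea in the abstract, the sketch below computes exact Shapley values for a tiny cooperative game over "units of experience" and normalizes them into a value-proportional sampling distribution. It is not the paper's implementation: the unit names, the `toy_worth` function, and the choice to clip negative values to zero are all assumptions made for this example. In the paper's setting, the worth of a coalition would presumably come from a learner's measured performance after training on that coalition.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, worth):
    """Exact Shapley values: average each player's marginal
    contribution over every ordering of the players."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += worth(with_p) - worth(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


def value_proportional(values):
    """Normalize values into a sampling distribution.
    Clipping negatives to zero is an assumption of this sketch."""
    clipped = {p: max(v, 0.0) for p, v in values.items()}
    total = sum(clipped.values())
    if total == 0.0:
        # Fall back to uniform sampling when no unit has positive value.
        return {p: 1.0 / len(clipped) for p in clipped}
    return {p: v / total for p, v in clipped.items()}


def toy_worth(coalition):
    """Hypothetical worth function (made-up numbers): each unit adds
    a base score, and units "a" and "b" help each other when combined."""
    base = {"a": 0.2, "b": 0.3, "c": 0.1}
    score = sum(base[u] for u in coalition)
    if {"a", "b"} <= coalition:
        score += 0.2  # synergy bonus, split evenly by the Shapley value
    return score


units = ["a", "b", "c"]
phi = shapley_values(units, toy_worth)
probs = value_proportional(phi)
```

Exact enumeration costs a number of worth evaluations that grows factorially with the number of units, so this form is only practical for a handful of units; larger problems would need a sampling-based estimate.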

Paper Structure (31 sections, 9 equations, 7 figures, 5 tables, 1 algorithm)

Introduction
Preliminaries
Cooperative Game Theory
Bandit Algorithms
Experience to Control
The Cooperative Mechanics of Experience
The Mechanics of Coalition Formation
Marginal Contributions to Learning
A Fair Allocation Mechanism
An Experiment on The Prospect of Cooperation
A Simulation of Cooperation
Coalitional Mechanics and Worth
A Sanity Check Through Supervised Classification
Reinforcement Learning
Classical Games
...and 16 more sections

Figures (7)

Figure 1: We validated the prospect prior using the class-as-a-unit analogy on MNIST and Cifar10. In Figures (a) and (c), each column represents units' Shapley values $\phi(\textbf{u})$ in each cooperative game parameterized by a target-unit $\bar{\textbf{u}}$ and the target coalition of all units. In Figures (b) and (d), we present the vPoP decomposition matrix (\ref{eq:vpop}) measuring the pairwise interaction values $\phi(\textbf{u}_i, \textbf{u}_j)$ among units in the all-units target.
Figure 2: Nowak & Radzik values (a, c) conditional on each single-unit and the multiple-units (all) evaluations, and the learning curves (b, d) for our mechanisms and TSCL.
Figure 3: The vPoP decomposition of Shapley's and Nowak & Radzik values, conditioned on the all-units evaluation target, for the MiniGrid-Rooms (a, b) and A-SIPD (c, d) problem settings.
Figure 4: The class-as-a-unit analogy applied to MNIST (a) and Cifar10 (b) served as our ground truth. For each problem, we derived the Shapley values (\ref{eq:shapley}) from the precomputed priors (left) on each cooperative game (\ref{sec:expensive-prior}). Our results verify that unit values in the target-unit settings approximately order the most confused pairs of classes, for instance digits 2 & 7 in MNIST or dog & cat in Cifar10. When the target is all classes, the vPoP decomposition (right) (\ref{sec:background-games}) also identifies interfering pairs.
Figure 5: We also investigated the prior-proportional curriculum in the target-unit setting. For each target unit, we allocate to each training unit interactions proportional to their pre-computed values for that target. For Adversarial-SIPD and MiniGrid-Rooms, we controlled the learning dynamics by presenting the units according to the unordered and ordered mechanisms in \ref{sec:expensive-prior}. On each task, the value-proportional curriculum derived from the prospect prior outperforms TSCL (tscl-*-exp3s). We further investigate the reason for TSCL failures in this scenario.
...and 2 more figures

Theorems & Definitions (7)

Example 3.1
Example 3.2
Example 4.1
Definition 4.1
Example 4.2
Example 5.1
Example 5.2
