Neuromodulated Meta-Learning

Jingyao Wang, Huijie Guo, Wenwen Qiang, Jiangmeng Li, Changwen Zheng, Hui Xiong, Gang Hua

TL;DR

Neuromodulated Meta-Learning (NeuronML) models flexible network structure (FNS) in meta-learning, so that the model generates the optimal structure for each task and thereby maximizes its performance and learning efficiency.

Abstract

Humans excel at adapting perceptions and actions to diverse environments, enabling efficient interaction with the external world. This adaptive capability relies on the biological nervous system (BNS), which activates different brain regions for distinct tasks. Meta-learning similarly trains machines to handle multiple tasks, but it relies on a fixed network structure that is less flexible than the BNS. To investigate the role of flexible network structure (FNS) in meta-learning, we conduct extensive empirical and theoretical analyses, finding that model performance is tied to structure and that no single structure is optimal across all tasks. This reveals the crucial role of FNS in meta-learning: it allows the model to generate the optimal structure for each task, thereby maximizing performance and learning efficiency. Motivated by this insight, we propose to define, measure, and model FNS in meta-learning. First, we define an effective FNS as one that possesses frugality, plasticity, and sensitivity. Then, to quantify FNS in practice, we present three measurements for these properties, which together form the structure constraint and come with theoretical support. Building on this, we finally propose Neuromodulated Meta-Learning (NeuronML) to model FNS in meta-learning. It uses bi-level optimization to update both weights and structure under the structure constraint. Extensive theoretical and empirical evaluations demonstrate the effectiveness of NeuronML on various tasks. Code is publicly available at https://github.com/WangJingyao07/NeuronML.
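The bi-level update of weights and structure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions: a toy linear model, a soft per-feature mask `sigmoid(s)` standing in for "structure", a first-order (MAML-style) approximation of the outer gradient, and a simple sparsity penalty `lam * sum(mask)` standing in for the frugality term only. The plasticity and sensitivity measurements are omitted, and all names (`meta_train`, `sample_task`, `lam`) are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, s, x):
    # Linear model whose features are gated by a soft structure mask sigmoid(s).
    return sum(sigmoid(si) * wi * xi for wi, si, xi in zip(w, s, x))

def grads(w, s, data, lam=0.0):
    """Gradients of MSE + lam * sum(mask) with respect to weights w and mask logits s."""
    gw = [0.0] * len(w)
    gs = [0.0] * len(s)
    for x, y in data:
        err = 2.0 * (predict(w, s, x) - y) / len(data)
        for i in range(len(w)):
            m = sigmoid(s[i])
            gw[i] += err * m * x[i]
            gs[i] += err * w[i] * x[i] * m * (1.0 - m)
    for i in range(len(s)):
        m = sigmoid(s[i])
        gs[i] += lam * m * (1.0 - m)  # sparsity penalty (assumed frugality stand-in)
    return gw, gs

def sample_task(rng, dim, n):
    """Toy task (assumption): y = a * x[0], so only the first feature matters."""
    a = rng.uniform(1.0, 3.0)
    def draw():
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        return x, a * x[0]
    support = [draw() for _ in range(n)]
    query = [draw() for _ in range(n)]
    return support, query

def meta_train(steps=300, dim=4, inner_lr=0.1, outer_lr=0.05, lam=0.05, seed=0):
    rng = random.Random(seed)
    w = [0.5] * dim   # meta-learned weights
    s = [0.0] * dim   # meta-learned structure logits
    for _ in range(steps):
        support, query = sample_task(rng, dim, 10)
        # Inner level: adapt the weights to the task on its support set.
        gw, _ = grads(w, s, support)
        w_fast = [wi - inner_lr * g for wi, g in zip(w, gw)]
        # Outer level: update weights and structure on the query set,
        # using the first-order approximation (gradient taken at w_fast).
        gw_q, gs_q = grads(w_fast, s, query, lam)
        w = [wi - outer_lr * g for wi, g in zip(w, gw_q)]
        s = [si - outer_lr * g for si, g in zip(s, gs_q)]
    return w, s

if __name__ == "__main__":
    w, s = meta_train()
    print("mask:", [round(sigmoid(v), 3) for v in s])
```

In this toy setting the mask entry for the only informative feature ends up larger than the others, which loosely mirrors the idea of keeping only the structure a task needs. A full second-order outer gradient and the paper's actual structure constraint would replace the simplifications used here.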

Paper Structure

This paper contains 59 sections, 16 theorems, 70 equations, 9 figures, 8 tables, 4 algorithms.

Key Result

Theorem 3.1

Let $f$ be any meta-learning model for the task of binary classification with respect to the 0-1 loss over the data $\mathcal{X}$ of task $\tau$. Then, there exists a distribution $P_\mathcal{X}$ over $\mathcal{X} \times \{0, 1\}$ such that:

Figures (9)

  • Figure 1: Existing meta-learning models perform learning through fixed model structures, where all parameters need to be updated for each episode of learning. In contrast, human cognition of the world is achieved by adjusting the behavior of neurons, which only needs part of the neurons while adjusting the activation area of the brain according to the downstream task. We ask if neuromodulated meta-learning can understand the world with flexible network structures.
  • Figure 2: Examples of tasks sampled from different datasets. (a) shows data sampled from the miniImagenet dataset and (b) shows data sampled from the Omniglot dataset. The distribution of tasks sampled from different datasets may vary greatly: the former consists of information-rich RGB images, while the latter is composed of binary character images.
  • Figure 3: The scores of the task distribution in the four benchmark datasets, i.e., miniImagenet, Omniglot, tieredImagenet, and CIFAR-FS. A higher score means more knowledge is involved in the task.
  • Figure 4: Trade-off performance of MAML with different model structures on four benchmark datasets, i.e., miniImagenet (5-way 1-shot), Omniglot (20-way 1-shot), tieredImagenet (5-way 1-shot), and CIFAR-FS (5-way 1-shot). The horizontal axis represents the training time (hours), and the vertical axis represents the accuracy of each dataset. The area of the circle represents the model size of MAML with the corresponding model structure.
  • Figure 5: Adaptation for regression. (a) Quantitative results show the learning curves of different models at meta-test-time, where NeuronML achieves the best MSE score with fewer update steps. (b) Adaptation curves and trade-off performance. Left: adaptation curves of different models (steps = 3) on sinusoid regression. Right: the trade-off performance of different models on pose prediction. We randomly select tasks for learning, and NeuronML can better fit the waveform under limited update conditions with fewer parameters.
  • ...and 4 more figures

Theorems & Definitions (17)

  • Theorem 3.1
  • Theorem 3.2
  • Definition 4.1
  • Theorem 4.1: Measurement of Frugality
  • Theorem 4.2: Measurement of Plasticity
  • Theorem 4.3: Measurement of Sensitivity
  • Theorem 5.1
  • Corollary 5.1
  • Theorem 5.2
  • Theorem III.1
  • ...and 7 more