Brain-inspired continual pre-trained learner via silent synaptic consolidation

Xuming Ran, Juntao Yao, Yusong Wang, Mingkun Xu, Dianbo Liu

TL;DR

The Artsy framework, inspired by the activation of silent synapses via spike-timing-dependent plasticity observed in mature brains, is introduced to enhance the continual learning capabilities of pre-trained models. It also offers a promising avenue for simulating biological synaptic mechanisms, potentially advancing the understanding of neural plasticity in both artificial and biological systems.

Abstract

Pre-trained models have demonstrated impressive generalization capabilities, yet they remain vulnerable to catastrophic forgetting when incrementally trained on new tasks. Existing architecture-based strategies encounter two primary challenges: 1) Integrating a pre-trained network with a trainable sub-network complicates the delicate balance between learning plasticity and memory stability across evolving tasks during learning. 2) The absence of robust interconnections between pre-trained networks and various sub-networks limits the effective retrieval of pertinent information during inference. In this study, we introduce the Artsy, inspired by the activation mechanisms of silent synapses via spike-timing-dependent plasticity observed in mature brains, to enhance the continual learning capabilities of pre-trained models. The Artsy integrates two key components: During training, the Artsy mimics mature brain dynamics by maintaining memory stability for previously learned knowledge within the pre-trained network while simultaneously promoting learning plasticity in task-specific sub-networks. During inference, artificial silent and functional synapses are utilized to establish precise connections between the pre-synaptic neurons in the pre-trained network and the post-synaptic neurons in the sub-networks, facilitated through synaptic consolidation, thereby enabling effective extraction of relevant information from test samples. Comprehensive experimental evaluations reveal that our model significantly outperforms conventional methods on class-incremental learning tasks, while also providing enhanced biological interpretability for architecture-based approaches. Moreover, we propose that the Artsy offers a promising avenue for simulating biological synaptic mechanisms, potentially advancing our understanding of neural plasticity in both artificial and biological systems.
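The training-time idea described in the abstract (a pre-trained network kept stable while task-specific sub-networks remain plastic) can be illustrated with a minimal sketch. This is not the Artsy implementation: the one-layer `f`, the linear sub-networks, the toy data, and every name below (`W_f`, `train_subnetwork`, `subnets`) are assumptions made only for illustration, and the silent-synapse connection mechanism is not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pre-trained network f: one fixed tanh layer.
W_f = rng.normal(size=(8, 16))


def f(x):
    """Frozen feature extractor; W_f is never updated below."""
    return np.tanh(x @ W_f)


def train_subnetwork(x, y, num_classes, steps=200, lr=0.5):
    """Fit a task-specific linear sub-network on top of the frozen features."""
    feats = f(x)  # features come from the frozen network only
    phi = np.zeros((feats.shape[1], num_classes))  # sub-network parameters
    onehot = np.eye(num_classes)[y]
    for _ in range(steps):
        logits = feats @ phi
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = feats.T @ (probs - onehot) / len(x)  # softmax cross-entropy gradient
        phi -= lr * grad  # only phi changes; W_f stays fixed
    return phi


W_f_before = W_f.copy()
subnets = {}
for task_id in range(2):  # two toy tasks, each with its own sub-network
    x = rng.normal(size=(64, 8))
    y = (x[:, task_id] > 0).astype(int)  # toy binary labels
    subnets[task_id] = train_subnetwork(x, y, num_classes=2)
```

In this sketch, stability comes simply from never writing to `W_f`, and plasticity comes from giving each task its own parameters in `subnets`. How the sub-networks are selected or connected at inference is the part the paper addresses and is deliberately left out here.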

Paper Structure

This paper contains 15 sections, 5 equations, 4 figures, 1 table, and 2 algorithms.

Figures (4)

  • Figure 1: Illustration of artificial networks ($f$ and $g$) trained using various learning methods on a continual learning task $j$ where $i < j$. Initially, network $f$ is trained via back-propagation on task $i$. The goal of continual learning is to sequentially learn the subsequent task $j$. (A) With $f$ fixed, the parameters $\theta_i$ are optimized to $\theta_j$ on task $j$, enabling generalization to task $j$. (B) Keeping $f$ fixed, the parameters $\theta_i$ are optimized to $\theta_{ij}$ on tasks $i$ and $j$, allowing generalization to both tasks $i$ and $j$. (C) With $f$ fixed at parameters $\theta_i$, an additional sub-network $g$ with parameters $\varphi_j$ is introduced to generalize to both tasks $i$ and $j$.
  • Figure 2: Overview of the connections between a pre-trained network and the initialized sub-networks via artificial silent and functional synapses. (A) In the mature brain, dendritic segments comprise silent synapses located at filopodia and functional synapses at dendritic spines. An archetypal glutamatergic synapse consists of presynaptic and postsynaptic membranes. The presynaptic terminal contains glutamate-filled vesicles, while the postsynaptic membrane contains AMPA and NMDA receptors. (B) The process of converting silent synapses into functional synapses through AMPA receptor unsilencing. AMPA unsilencing involves the synaptic incorporation of clusters of AMPA receptors, converting silent synapses into functional ones, whereas AMPA silencing entails the loss of synaptic AMPA receptor clusters. (C) In the Artsy framework, artificial silent and functional synapses connect the pre-trained network to the initialized sub-network. New stimulus inputs can induce the conversion of artificial silent synapses into functional synapses.
  • Figure 3: Comparison of learning plasticity and memory stability between Artsy and EASE. We evaluate the performance of Artsy and EASE on a 10-step class-incremental learning task using the CIFAR-100 dataset. The accuracy on both new and previously learned tasks is assessed at each step. Additionally, the final and average accuracies are computed at each step.
  • Figure 4: Ablation study results on the impact of good and bad features in activating the artificial silent synapse in Artsy for the 10-step task on CIFAR-100. Panels (A) and (B) depict the Average and Last accuracies at each step, respectively, utilizing the artificial synapse with distinct features.
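The Last and Average accuracies mentioned in the captions for Figures 3 and 4 can be computed as in the short sketch below. It assumes the common class-incremental convention in which the accuracy at step $b$ is measured over all classes seen so far, Last is the accuracy after the most recent step, and Average is the mean over the steps completed so far. The function names and the sample values are hypothetical and are not results from the paper.

```python
def last_and_average_accuracy(step_accuracies):
    """Return (last, average) for a list of per-step accuracies."""
    if not step_accuracies:
        raise ValueError("need at least one step accuracy")
    last = step_accuracies[-1]
    average = sum(step_accuracies) / len(step_accuracies)
    return last, average


def running_metrics(step_accuracies):
    """Return (last, average) after each incremental step, as in a per-step plot."""
    return [
        last_and_average_accuracy(step_accuracies[: b + 1])
        for b in range(len(step_accuracies))
    ]


# Hypothetical accuracies after three incremental steps (not from the paper).
example = [0.5, 0.25, 0.75]
final_last, final_average = last_and_average_accuracy(example)
per_step = running_metrics(example)
```

Reporting both values per step shows whether a drop in Last accuracy is a one-off dip or part of a sustained decline, which is what the Average curve smooths over.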