Directed Structural Adaptation to Overcome Statistical Conflicts and Enable Continual Learning

Zeki Doruk Erden, Boi Faltings

TL;DR

This work introduces DIRAD, a directed, gradient-driven structural adaptation framework that grows neural topologies only as needed to solve tasks, addressing the limitations of fixed topologies and catastrophic forgetting. Building on DIRAD, the PREVAL framework autonomously detects novel data and allocates it to appropriate models without task labels, enabling continual learning with retention. DIRAD uses adaptive potentials and edge-node conversion to unleash previously exhausted adaptation pathways, while PREVAL leverages prediction validation across L0 and L1 networks and multiple models to manage tasks and data streams. The combined approach demonstrates low-complexity task solutions and sustained performance across sequential tasks, with meaningful but imperfect task discernability acknowledged as a key determinant of ultimate performance and scalability.

Abstract

Adaptive networks today rely on overparameterized fixed topologies that cannot break through the statistical conflicts they encounter in the data they are exposed to, and are prone to "catastrophic forgetting" as the network attempts to reuse existing structures to learn new tasks. We propose a structural adaptation method, DIRAD, that can complexify as needed and in a directed manner without being limited by statistical conflicts within a dataset. We then extend this method and present the PREVAL framework, designed to prevent "catastrophic forgetting" in continual learning by detecting new data and assigning encountered data to suitable models adapted to process them, without needing task labels anywhere in the workflow. We show the reliability of DIRAD in growing networks that achieve high performance while being orders of magnitude simpler than fixed-topology networks, and demonstrate the proof-of-concept operation of PREVAL, in which continual adaptation to new tasks is observed while previously encountered tasks can still be detected and discerned.

Paper Structure

This paper contains 35 sections, 22 equations, 3 figures, 7 tables, 1 algorithm.

Figures (3)

  • Figure 1: A simplified illustrative case of the path of adaptation for signed XOR ("False" represented by $-1$ instead of $0$). Inputs: $x_0$, $x_1$. Output: $y$. In the figures, $G_e$ represents $dC/dw_e$, $a_i$ represents the state of node $i$, and the four values in parentheses represent the signs that a variable takes for the four samples, respectively. We simplify by assuming no bias and that the adaptation processes of different components happen in sequence instead of simultaneously.
  • Figure 2: Simplified representation of PREVAL flow.
  • Figure 3: Sample progress of adaptation and complexity on a single task (2-class classification, 6 & 7). The average (mean squared) error is the average across all target nodes, which are the output nodes during L0 adaptation and, after the marked transition, the L1 target nodes during L1 adaptation.
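
The signed XOR setting in the Figure 1 caption can be made concrete with a small numerical sketch. The code below is not the paper's DIRAD procedure. It assumes a single bias-free linear output $y = w_0 x_0 + w_1 x_1$, a squared-error cost $C = \frac{1}{2}(y - t)^2$, and zero initial weights, all of which are illustrative choices rather than details taken from the source. Under those assumptions it prints the per-sample signs of $G_e = dC/dw_e$ for each input edge, which is one plausible reading of what a statistical conflict looks like: the samples disagree on the direction in which an edge weight should move.

```python
# Signed XOR: "False" is -1 and "True" is +1, as in the Figure 1 caption.
samples = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
targets = [-1, 1, 1, -1]  # signed XOR of (x0, x1)


def per_sample_gradients(w, edge):
    """Per-sample dC/dw_e for the assumed model y = w0*x0 + w1*x1
    and the assumed cost C = 0.5 * (y - t)**2 (no bias term)."""
    grads = []
    for (x0, x1), t in zip(samples, targets):
        y = w[0] * x0 + w[1] * x1
        x_e = (x0, x1)[edge]          # input carried by the chosen edge
        grads.append((y - t) * x_e)   # chain rule: dC/dy * dy/dw_e
    return grads


def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


w = [0.0, 0.0]  # assumed starting point: both edge weights at zero
for edge in (0, 1):
    g = per_sample_gradients(w, edge)
    print(f"edge {edge}: signs {[sign(v) for v in g]}, summed gradient {sum(g)}")
```

In this sketch, two samples push each weight up and two push it down, so the gradient summed over the four samples is zero for both edges and plain gradient descent on the fixed two-edge topology makes no progress from this starting point.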