Evolving Afferent Architectures: Biologically-inspired Models for Damage-Avoidance Learning
Wolfgang Maass, Sabine Janzen, Prajvi Saxena, Sach Mukherjee
TL;DR
This paper tackles long-horizon damage-avoidance learning by introducing Computational Afferent Traces (CATs), internal risk signals produced by evolved afferent arrays. It presents a bi-level architecture in which outer-loop evolution via CMA-ES discovers afferent sensing configurations and inner-loop PPO trains damage-avoidance policies using CATs, formalizing afferent sensing as an inductive bias toward learnability. The approach is demonstrated on biomechanical digital twins, where evolved CAT architectures achieve 2.8x CAT efficiency and 15.4x age-robustness relative to hand-designed baselines, along with a 23% reduction in high-risk actions. Ablations indicate that CAT signals, evolution, and predictive discrepancy are each necessary. Episodic memory (AMM) further improves adaptation, and the approach generalizes across pathological knee conditions with age-dependent action restriction. The work provides a reproducible framework and open data to advance internal risk signaling in AI for biomedical and embodied systems.
Abstract
We introduce Afferent Learning, a framework that produces Computational Afferent Traces (CATs) as adaptive, internal risk signals for damage-avoidance learning. Inspired by biological systems, the framework uses a two-level architecture: evolutionary optimization (outer loop) discovers afferent sensing architectures that enable effective policy learning, while reinforcement learning (inner loop) trains damage-avoidance policies using these signals. This formalizes afferent sensing as an inductive bias for efficient learning: architectures are selected for how well they enable learning, not for directly minimizing damage. We provide theoretical convergence guarantees under smoothness and bounded-noise assumptions. We illustrate the general approach in the challenging context of biomechanical digital twins operating over long time horizons (multiple decades of the life-course). Here, we find that CAT-based evolved architectures achieve higher efficiency (2.8x CAT efficiency) and better age-robustness (15.4x) than hand-designed baselines, enabling policies that exhibit age-dependent behavioral adaptation (23% reduction in high-risk actions). Ablation studies show that CAT signals, evolution, and predictive discrepancy are each essential. We release code and data for reproducibility.
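The two-level structure described above can be illustrated with a minimal toy sketch. This is not the authors' implementation: a plain Gaussian evolution strategy with elite averaging stands in for CMA-ES, scalar gradient descent stands in for PPO, and the task is a hypothetical one-parameter problem. All names (`W_STAR`, `inner_train`, `fitness`, `evolve`) and the two-parameter sensing configuration `(gain, offset)` are assumptions made for illustration. The point it shows is the selection criterion: the outer loop scores a sensing configuration by the damage remaining after the inner learner has trained on the internal signal, rather than by the signal itself.

```python
import random

W_STAR = 0.3  # hypothetical safe operating point of the toy task


def true_damage(w):
    """Ground-truth damage; only the outer loop ever evaluates this."""
    return (w - W_STAR) ** 2


def inner_train(gain, offset, steps=20, lr=0.1):
    """Inner loop (stand-in for PPO): descend the internal risk signal
    gain * (w - offset)**2 for a fixed budget of steps."""
    w = 1.0  # initial policy parameter, far from the safe point
    for _ in range(steps):
        w -= lr * 2.0 * gain * (w - offset)
    return w


def fitness(theta):
    """Score a sensing configuration by the damage left after learning."""
    gain, offset = theta
    return -true_damage(inner_train(gain, offset))


def evolve(init, generations=40, pop_size=16, elite=4, sigma=0.3, seed=0):
    """Outer loop (stand-in for CMA-ES): Gaussian sampling around a mean,
    followed by averaging of the best candidates."""
    rng = random.Random(seed)
    mean = list(init)
    best_fit, best_theta = fitness(mean), list(mean)
    for _ in range(generations):
        pop = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
               for _ in range(pop_size)]
        scored = sorted(((fitness(p), p) for p in pop), reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best_theta = scored[0][0], list(scored[0][1])
        elites = [p for _, p in scored[:elite]]
        mean = [sum(col) / elite for col in zip(*elites)]
        sigma *= 0.95  # slowly narrow the search
    return best_theta, best_fit


hand_designed = [0.1, 0.9]  # hypothetical hand-set (gain, offset)
evolved, evolved_fit = evolve(hand_designed)
print("hand-designed fitness:", fitness(hand_designed))
print("evolved fitness:", evolved_fit, "theta:", evolved)
```

In this toy, a low gain makes the inner learner too slow to finish within its step budget, and a misplaced offset steers it toward the wrong operating point, so both parameters affect learnability rather than damage directly. A real setup would replace `inner_train` with a full policy-training run and `evolve` with a proper CMA-ES implementation.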
