From Vision to Decision: Neuromorphic Control for Autonomous Navigation and Tracking

Chuwei Wang; Eduardo Sebastián; Amanda Prorok; Anastasia Bizyaeva

TL;DR

The paper introduces a parsimonious neuromorphic control framework that converts high-dimensional vision into actionable egocentric motion via neural populations evolving on a simplex. A dynamical bifurcation mechanism resolves symmetry-induced indecision, enabling fast, planner-like decisions directly from sensors with low computational load. The authors develop a discrete-time neural model, a coarse-grained reduction for clustered inputs, and a bifurcation analysis showing a pitchfork structure with ultrasensitive unfolding in asymmetric cases. They validate the approach through simulations in 2D and photorealistic environments, multi-target tracking, and real-world quadrotor experiments, demonstrating robust performance under noise, occlusion, and hardware perturbations. The work highlights a path toward efficient, interpretable, neuromorphic autonomy that bridges proximal perception with distal decision-making, potentially enabling hardware implementations on neuromorphic platforms.

Abstract

Robotic navigation has historically struggled to reconcile reactive, sensor-based control with the decisive capabilities of model-based planners. This duality becomes critical when the absence of a predominant option among goals leads to indecision, challenging reactive systems to break symmetries without computationally intensive planners. We propose a parsimonious neuromorphic control framework that bridges this gap for vision-guided navigation and tracking. Image pixels from an onboard camera are encoded as inputs to dynamic neuronal populations that directly transform visual target excitation into egocentric motion commands. A dynamic bifurcation mechanism resolves indecision by delaying commitment until a critical point induced by the environmental geometry. Inspired by recently proposed mechanistic models of animal cognition and opinion dynamics, the neuromorphic controller provides real-time autonomy with a minimal computational burden and a small number of interpretable parameters, and it can be seamlessly integrated with application-specific image processing pipelines. We validate our approach in simulation environments as well as on an experimental quadrotor platform.
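The bifurcation-driven commitment described above can be illustrated with the textbook pitchfork normal form $\dot{x} = \mu x - x^3 + b$. This is a minimal sketch under stated assumptions, not the paper's neural model: the scalar state `x`, the parameter `mu` (standing in for proximity to the critical point), the small `bias` term (standing in for a slight asymmetry between options), and the function name `simulate_pitchfork` are all hypothetical and chosen only for illustration.

```python
def simulate_pitchfork(mu, bias=0.0, x0=0.0, dt=0.01, steps=5000):
    """Forward-Euler integration of dx/dt = mu*x - x**3 + bias.

    Generic pitchfork normal form (with an unfolding term `bias`);
    an illustrative stand-in, NOT the controller from the paper.
    """
    x = x0
    for _ in range(steps):
        x += dt * (mu * x - x**3 + bias)
    return x


# Before the critical point (mu < 0): the undecided state x = 0 attracts.
undecided = simulate_pitchfork(mu=-1.0, x0=0.3)

# Past the critical point (mu > 0): x = 0 is unstable and a tiny bias
# of either sign selects which of the two branches is reached.
left = simulate_pitchfork(mu=1.0, bias=-1e-3)
right = simulate_pitchfork(mu=1.0, bias=+1e-3)

print(f"undecided={undecided:.2e}, left={left:.4f}, right={right:.4f}")
```

In this toy system a bias of magnitude $10^{-3}$ is enough to determine the outcome once $\mu > 0$, which is the qualitative behaviour that the TL;DR refers to as ultrasensitive unfolding in asymmetric cases.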

Paper Structure (27 sections, 1 theorem, 66 equations, 8 figures, 2 tables)

Introduction
Results
The geometry of robot spatial decision-making
Vision-based decision-making on the move
Vision-based neuromorphic navigation and tracking
Autonomous decision-making in the physical world
Discussion
Methods
Model formulation
Discrete-time neural dynamics model
Coarse-grained model for clustered inputs
Timescale separation and bifurcation analysis
Bifurcation analysis of two-option coarse-grained ND model
Heuristic to detect decision points in general ND model
Simulation environment
...and 12 more sections

Key Result

Theorem 1

For the neural dynamics (eq:ndt_vector_1) under the stated assumptions, given any piecewise continuous input $\mathbf{u}: [0,\infty) \to U \subset \mathbb{R}^m$ and any initial condition $\mathbf{n}(0) \in \Delta_{k-1}$, every corresponding solution $\mathbf{n}(t)$ is unique and satisfies $\mathbf{n} \dots$
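Theorem 1 takes its initial condition on the simplex $\Delta_{k-1}$, and the TL;DR describes the neural populations as evolving on a simplex. To make concrete what it means for a trajectory to remain on a simplex, the sketch below integrates generic replicator dynamics and checks that the state stays non-negative and sums to one. This is an assumption-laden stand-in: the paper's dynamics (eq:ndt_vector_1) are not reproduced on this page, and the functions `replicator_step` and `simulate`, the constant input `u`, and the step size are hypothetical.

```python
import numpy as np


def replicator_step(n, u, dt=0.01):
    """One forward-Euler step of dn_i/dt = n_i * (u_i - n . u).

    Replicator dynamics are used only as a simple system whose state
    stays on the simplex; this is NOT the paper's neural model.
    """
    return n + dt * n * (u - n @ u)


def simulate(n0, u, steps=2000, dt=0.01):
    """Iterate the Euler step from n0 under a constant input u."""
    n = np.asarray(n0, dtype=float)
    for _ in range(steps):
        n = replicator_step(n, u, dt)
    return n


k = 5
n0 = np.full(k, 1.0 / k)                   # start at the simplex centre
u = np.array([0.2, 0.9, 0.1, 0.5, 0.3])    # hypothetical constant input
n_final = simulate(n0, u)

print("sum:", n_final.sum(), "min:", n_final.min(), "argmax:", n_final.argmax())
```

The sum is preserved because the Euler increments add up to $(\mathbf{n}\cdot\mathbf{u})(1 - \sum_i n_i)$, which vanishes on the simplex, and each component is multiplied by a positive factor for this step size.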

Figures (8)

Figure 1: Overview of the neuromorphic control framework. (a) We consider an autonomous robotic system designed to run perception and computation onboard. In this case, a quadrotor is equipped with an RGB camera and a Raspberry Pi as its sole processing unit. The motion capture markers are included for completeness; they are used only to record data for statistical trajectory analysis. (b) The neuromorphic controller processes the image stream to extract relevant target features, which are translated into a per-pixel value that models the strength of the feature. This value is the input to the neural dynamics, which evolve the relative preference over per-pixel spatial directions and yield a velocity command in the egocentric body frame of the robot as a combination of those directions whose activity is above a threshold. (c) The velocity command is transformed into action commands that steer the robot towards the direction of preference according to the visual stimuli.
Figure 2: The geometry of decision-making for MPC, PF, RL and our method (neural dynamics, ND). Each subplot shows $20$ simulated trajectories of the robot navigating to one of the targets (indicated by gray filled disks of radius $\epsilon = 0.5$ m), starting from the origin. The four planners are tested across three environments to evaluate their ability to break symmetry and exploit environmental geometry, from top to bottom: symmetric two-goal, symmetric three-goal, and asymmetric three-goal. For MPC (left column), two formulations are compared: MPC1 (weighted-sum, blue) and MPC2 (soft-min, orange). For RL, the same policy is used with different target orderings provided as observations during simulation; each color corresponds to a distinct goal ordering. For ND, the saturation bound parameter is $a=2$ and the coupling gain is $\alpha = 6$.
Figure 3: Autonomous navigation with simulated visual inputs. The robot has a FOV of $110^{\circ}$ and neuron population $k = 64$ ($a = \frac{8}{k}$ and $\alpha = 4.2$). (a) Trajectories in the symmetric setting. The red curve shows the path generated by the neural-dynamics controller, while the gray dashed curve corresponds to the trajectory driven purely by the instantaneous visual input (the $\mathcal{U}$ matrix). The circular marker on the trajectory denotes the moment at which the neural dynamics break symmetry and commit to one of the two identical targets. (b) Full goal-directed trajectory; shaded blue wedges indicate the FOV; circular targets are color-coded by radius. The orange circular marker on the trajectory denotes bifurcation-driven decisions while the black cross markers denote vision-driven decisions. (c) Top: Neural activity heatmap showing how the population response evolves over time across pixel indices. Bottom: Corresponding heading angle (relative to the $+x$ axis) and the real part of the dominant Jacobian eigenvalue $\lambda_1$. The vertical dashed line marks the same symmetry-breaking moment indicated by the circular marker in (a), where activity concentrates onto one target band, the heading rapidly changes, and $\operatorname{Re}(\lambda_1)$ spikes. All panels share a common time axis and correspond to the neural-driven trajectory in (a). (d) Same panels as (c), for the trajectory in (b).
Figure 4: Vision-based navigation in a photorealistic environment. The quadrotor has a FOV of $120^{\circ}$ and neural population $k=36\times64$ ($a=\frac{5}{k}$ and $\alpha = 3$). (a) 3D trajectory. The star marks the start and the circle denotes the end of the path. Gray cross marks indicate six key positions whose snapshots are shown in (c). (b) Top-down projection of the trajectory in (a), with the FOV cones overlaid. (c) First-person perspective snapshots (top row) and their corresponding intermediate representations at six key moments: (1) start, (2) approaching target T1, (3) transitioning from T1 to T3, (4) approaching T3, (5) choosing between T4 and T6, and (6) approaching T4. The second row shows the pixel-wise target evidence map $\mathcal{U}$. The third row visualizes the neural activity rates representing the population activity distribution across all neurons. The bottom row shows the neural activity after thresholding, highlighting neurons that drive motion (see Video S2 for more visualizations).
Figure 5: Vision-based multi-target tracking in a photorealistic environment. The controlled quadrotor has a $120^{\circ}$ FOV and a population of $k = 36\times64$ neurons ($a=\frac{6}{k}$ and $\alpha = 6$). (a) 3D trajectories of the controlled quadrotor and three target quadrotors, with color gradients indicating temporal evolution. Gray and red cross-markers highlight the quadrotor positions at two moments, $t=9$s and $t=27$s, respectively. The corresponding first-person snapshots and intermediate representations at these times are shown in the second and third columns of panel (c). (b) “Fly-with-me” view at two moments. Top: At the start, the controlled quadrotor (white dashed box) observes all three targets flying in formation (yellow dashed box) within its FOV. Bottom: the rightmost target (black dashed box) departs from the formation while the controlled quadrotor continues tracking the group. (c) First-person perspective (FPV) snapshots and intermediate neural representations at four key moments: (1) start, (2) before one quadrotor leaves formation, (3) while following the larger formation of two targets, and (4) after switching to track a single target. The second row shows residual optical flow (OF) magnitude, and the third row shows the target evidence map $\mathcal{U}$. The fourth row visualizes the neural activity distribution while the bottom row highlights thresholded neural activations that directly drive motion decisions (see Video S3 for full visualizations).
...and 3 more figures
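Panel b of Figure 1 describes the velocity command as a combination of the per-pixel directions whose neural activity exceeds a threshold. A minimal planar sketch of such a readout follows. Every modelling choice in it is an assumption made for illustration and is not taken from the paper: one neuron per image column, bearings spread uniformly across the FOV, an activity-weighted sum, a default threshold of $1/k$, a body frame with $x$ forward and $y$ to the left, and the function name `velocity_from_activity`.

```python
import numpy as np


def velocity_from_activity(n, fov_deg=110.0, threshold=None, speed=1.0):
    """Map a 1D activity vector (one entry per image column) to a planar
    body-frame velocity [v_x, v_y]. Hypothetical readout, not the paper's."""
    n = np.asarray(n, dtype=float)
    k = n.size
    if threshold is None:
        threshold = 1.0 / k  # assumed: only above-uniform activity counts
    half = np.deg2rad(fov_deg) / 2.0
    # Assumed convention: column 0 is the left image edge (positive bearing).
    bearings = np.linspace(half, -half, k)
    active = n > threshold
    if not active.any():
        return np.zeros(2)  # no direction is preferred: no motion command
    w = n[active]
    direction = np.array([
        np.sum(w * np.cos(bearings[active])),
        np.sum(w * np.sin(bearings[active])),
    ])
    norm = np.linalg.norm(direction)
    return speed * direction / norm if norm > 0 else np.zeros(2)


k = 64
n = np.zeros(k)
n[:8] = 1.0 / 8.0            # activity concentrated on the leftmost columns
v = velocity_from_activity(n)
print(v)                     # forward-and-left unit vector
```

With activity concentrated on the left image columns, the command points forward and to the left; with perfectly uniform activity, no entry exceeds the assumed threshold and the sketch returns a zero command.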

Theorems & Definitions (2)

Theorem 1
proof
