Attention-Driven LPLC2 Neural Ensemble Model for Multi-Target Looming Detection and Localization

Renyuan Liu, Qinbing Fu

TL;DR

This work addresses the challenge of detecting and localizing multiple looming objects in dynamic scenes. It introduces mLPLC2, an attention-driven extension of the Drosophila LPLC2 population that uses bottom-up motion salience to generate multiple attention fields and nonlinearly integrate region-specific responses. Across synthetic stimuli, natural scenes, and real UAV footage, the model demonstrates fast, accurate detection, discrimination, and tracking of multiple approaching targets. The approach offers energy-efficient, scalable collision-detection cues for mobile robots and UAVs, enabling more precise avoidance maneuvers in 3D environments.

Abstract

Lobula plate/lobula columnar, type 2 (LPLC2) visual projection neurons in the fly's visual system possess highly looming-selective properties, making them ideal for developing artificial collision detection systems. The four dendritic branches of individual LPLC2 neurons, each tuned to specific directional motion, enhance the robustness of looming detection by utilizing radial motion opponency. Existing models of LPLC2 neurons either concentrate on individual cells to detect centroid-focused expansion or utilize population-voting strategies to obtain global collision information. However, their potential for addressing multi-target collision scenarios remains largely untapped. In this study, we propose a numerical model for LPLC2 populations, leveraging a bottom-up attention mechanism driven by motion-sensitive neural pathways to generate attention fields (AFs). This integration of AFs with highly nonlinear LPLC2 responses enables precise and continuous detection of multiple looming objects emanating from any region of the visual field. We began by conducting comparative experiments to evaluate the proposed model against two related models, highlighting its unique characteristics. Next, we tested its ability to detect multiple targets in dynamic natural scenarios. Finally, we validated the model using real-world video data collected by aerial robots. Experimental results demonstrate that the proposed model excels in detecting, distinguishing, and tracking multiple looming targets with remarkable speed and accuracy. This advanced ability to detect and localize looming objects, especially in complex and dynamic environments, holds great promise for overcoming collision-detection challenges in mobile intelligent machines.
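The radial motion opponency described in the abstract can be illustrated with a small sketch. This is not the paper's formulation: the function names, the arm partition of the receptive field, the y-up coordinate convention, and the multiplicative combination of the four arms are all assumptions chosen to show why outward motion on every arm is required for a response.

```python
# Hypothetical sketch of radial motion opponency for one LPLC2-like unit.
# Assumptions (not from the paper): motion is given as (x, y, dx, dy) tuples
# relative to the unit's centre, y points up, and arms combine by product.

def arm_responses(vectors):
    """Return outward-minus-inward motion summed along four cardinal arms."""
    arms = {"right": 0.0, "left": 0.0, "up": 0.0, "down": 0.0}
    for x, y, dx, dy in vectors:
        if abs(x) >= abs(y):
            # Horizontal arms: outward means moving away from the centre in x.
            if x > 0:
                arms["right"] += dx
            elif x < 0:
                arms["left"] -= dx
        else:
            # Vertical arms: outward means moving away from the centre in y.
            if y > 0:
                arms["up"] += dy
            else:
                arms["down"] -= dy
    return arms


def lplc2_like_response(vectors):
    """Rectify each arm, then multiply: any inward or silent arm vetoes the output."""
    response = 1.0
    for value in arm_responses(vectors).values():
        response *= max(0.0, value)
    return response


# Four sample points, one on each arm.
points = [(1, 0), (-1, 0), (0, 1), (0, -1)]
expansion = [(x, y, float(x), float(y)) for x, y in points]      # moving outward
contraction = [(x, y, -float(x), -float(y)) for x, y in points]  # moving inward
translation = [(x, y, 1.0, 0.0) for x, y in points]              # all moving right

print(lplc2_like_response(expansion))
print(lplc2_like_response(contraction))
print(lplc2_like_response(translation))
```

In this toy setting only the expanding pattern drives all four arms at once, whereas translation excites one arm while suppressing its opposite, so the product collapses to zero.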

Paper Structure

This paper contains 12 sections, 15 equations, 6 figures, 1 table, and 1 algorithm.

Figures (6)

  • Figure 1: Schematic anatomical structure of LPLC2 neurons. (a) Each arm of LPLC2's cross-shaped primary dendrites ramifies in one of the lobula plate layers and extends along that layer's preferred motion direction. (b) A single LPLC2 cell integrates directional signals from the four sub-layers of the lobula plate. (c) LPLC2 cells form a population of $\sim$80 visual projection neurons whose dendrites collectively cover the lobula plate.
  • Figure 2: Flowchart of the proposed mLPLC2 neural network. The stimulus is fed into the network, where the difference between consecutive frames is computed to extract edge expansion information. Excitation and inhibition signals from the lamina monopolar cells (LMCs) are integrated using a polarity-selective vDoG (variant Difference of Gaussians) filter. Motion signals representing brightness increases and decreases are sent to the ON and OFF channels, respectively, where they undergo contrast normalization. A triple-correlation model simulates T4 and T5 cells to extract motion in the four cardinal directions. The directional signals are integrated to generate the local motion signal. The most salient local motion signal is selected to drive a bottom-up process that identifies a new spatial attention field (AF). The mLPLC2 model nonlinearly integrates motion signals from multiple AFs, enabling the detection and localization of multiple looming objects.
  • Figure 3: Responses of different models to various motion types (sLPLC2 is a constrained version of mLPLC2 limited to at most one AF). Each square panel represents a complete experiment; the x-axis shows time in frames and the y-axis shows the corresponding membrane potential of the model. The LPLC2-based neural network model is selective only for approaching stimuli.
  • Figure 4: Responses of different models to objects from various regions of the visual field across four phases. The horizontal and vertical axes represent time in frames and membrane potential, respectively. The widths and heights of the differently colored shapes above the graph indicate the appearance period and size of the respective objects. Each center of expansion is marked in the corresponding color. When facing multi-target motion, only mLPLC2 accurately distinguishes and identifies the looming objects.
  • Figure 5: Responses of mLPLC2 to objects originating from different regions of the visual field at various times, against a shifting natural background. The presence period and size of each object are represented by the widths and heights of the differently colored shapes at the top of the graph. The center positions and presence periods of the AFs are shown in the legend and correspond closely to the six looming objects. Across these six objects, mLPLC2 continuously detects and locks onto multiple looming targets.
  • ...and 1 more figure
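The early stages named in the Figure 2 caption (frame differencing, the ON/OFF split, and bottom-up selection of attention fields) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: the vDoG filtering, contrast normalization, and triple-correlation stages are omitted, the summed ON and OFF magnitude stands in for the local motion signal, and the greedy peak picking with a square suppression window is a hypothetical rule for creating multiple AFs. All function and parameter names are invented for this example.

```python
# Hypothetical sketch of frame differencing, ON/OFF splitting, and
# attention-field (AF) selection. Frames are lists of rows of floats.

def frame_difference(prev, curr):
    """Pixel-wise luminance change between two consecutive frames."""
    return [[c - p for p, c in zip(row_p, row_c)]
            for row_p, row_c in zip(prev, curr)]


def split_on_off(diff):
    """Half-wave rectify: ON keeps brightness increases, OFF keeps decreases."""
    on = [[max(0.0, v) for v in row] for row in diff]
    off = [[max(0.0, -v) for v in row] for row in diff]
    return on, off


def select_attention_fields(motion, threshold, radius, max_fields):
    """Greedily pick the strongest locations as AF centres (assumed rule).

    After each pick, a (2*radius+1) square around the centre is suppressed
    so the next pick lands on a different region.
    """
    work = [row[:] for row in motion]  # copy so the input is left unchanged
    centres = []
    for _ in range(max_fields):
        best, r0, c0 = max((v, r, c)
                           for r, row in enumerate(work)
                           for c, v in enumerate(row))
        if best < threshold:
            break
        centres.append((r0, c0))
        for r in range(max(0, r0 - radius), min(len(work), r0 + radius + 1)):
            for c in range(max(0, c0 - radius), min(len(work[0]), c0 + radius + 1)):
                work[r][c] = 0.0
    return centres


# Toy 5x5 frames: two brightening spots and one dimming spot.
prev = [[0.0] * 5 for _ in range(5)]
curr = [[0.0] * 5 for _ in range(5)]
curr[1][1] = 0.8   # brightness increase (ON)
curr[3][3] = 0.5   # brightness increase (ON)
prev[0][4] = 0.6   # brightness decrease (OFF) once curr stays at 0.0

on, off = split_on_off(frame_difference(prev, curr))
motion = [[a + b for a, b in zip(row_on, row_off)]
          for row_on, row_off in zip(on, off)]
centres = select_attention_fields(motion, threshold=0.4, radius=1, max_fields=5)
print(centres)
```

Suppressing the neighbourhood of each selected peak is one simple way to let several fields coexist; the actual model may allocate and retire AFs differently.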