Privacy-preserving fall detection at the edge using Sony IMX636 event-based vision sensor and Intel Loihi 2 neuromorphic processor

Lyes Khacef, Philipp Weidel, Susumu Hogyoku, Harry Liu, Claire Alexandra Bräuer, Shunsuke Koshino, Takeshi Oyakawa, Vincent Parret, Yoshitaka Miyatani, Mike Davies, Mathis Richter

TL;DR

This work tackles privacy-sensitive, edge-based fall detection by fusing a neuromorphic processing platform (Loihi 2) with an event-based vision sensor (IMX636) via a dedicated FPGA interface. It systematically explores architecture and neuron-model combinations (including graded LIF, SigmaDelta, and S4D-based temporal processing) and demonstrates a patched MCUNet+S4D approach that achieves the highest F1 score (~84%) with modest power (~90 mW) on a single Loihi 2 chip. The results establish a Pareto frontier between detection accuracy and computational cost, highlighting strong performance gains from architecture choices and temporal feature extraction, while also addressing practical concerns like latency, memory constraints, and on-chip data anonymization. Overall, the work provides a concrete, privacy-preserving pathway for real-time edge AI in smart cameras, illuminating design trade-offs for integrated event-based sensing and neuromorphic processing at scale.

Abstract

Fall detection for elderly care using non-invasive vision-based systems remains an important yet unsolved problem. Driven by strict privacy requirements, inference must run at the edge of the vision sensor, demanding robust, real-time, and always-on perception under tight hardware constraints. To address these challenges, we propose a neuromorphic fall detection system that integrates the Sony IMX636 event-based vision sensor with the Intel Loihi 2 neuromorphic processor via a dedicated FPGA-based interface, leveraging the sparsity of event data together with near-memory asynchronous processing. Using a newly recorded dataset under diverse environmental conditions, we explore the design space of sparse neural networks deployable on a single Loihi 2 chip and analyze the tradeoffs between detection F1 score and computational cost. Notably, on the Pareto front, our LIF-based convolutional SNN with graded spikes achieves the highest computational efficiency, reaching a synaptic operations sparsity of 55x for an F1 score of 58%. The LIF with graded spikes shows a gain of 6% in F1 score with 5x fewer operations compared to binary spikes. Furthermore, our MCUNet feature extractor with patched inference, combined with the S4D state space model, achieves the highest F1 score of 84% with a synaptic operations sparsity of 2x and a total power consumption of 90 mW on Loihi 2. Overall, our smart security camera proof-of-concept highlights the potential of integrating neuromorphic sensing and processing for edge AI applications where latency, energy consumption, and privacy are critical.
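To make the notion of a Pareto front between F1 score and computational cost concrete, the following is a minimal Python sketch, not the authors' benchmarking code. It assumes that cost can be approximated as the inverse of the reported synaptic operations sparsity (so 55x sparsity gives a relative cost of 1/55), which is an interpretation and not a definition taken from the paper. The two named models use the figures quoted in the abstract; the two "hypothetical" entries are made-up points included only to show how dominated models are filtered out.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    f1: float    # detection F1 score, higher is better
    cost: float  # relative compute cost, lower is better (assumed: 1 / sparsity)


def pareto_front(models: List[Model]) -> List[Model]:
    """Return the models that no other model dominates.

    A model is dominated if another model is at least as good on both
    axes (F1 and cost) and strictly better on at least one of them.
    """
    front = []
    for m in models:
        dominated = any(
            o.f1 >= m.f1 and o.cost <= m.cost and (o.f1 > m.f1 or o.cost < m.cost)
            for o in models
        )
        if not dominated:
            front.append(m)
    # Sort from cheapest to most expensive for readability.
    return sorted(front, key=lambda m: m.cost)


models = [
    Model("graded-LIF CNN", 0.58, 1 / 55),       # F1 and sparsity from the abstract
    Model("MCUNet+S4D (patched)", 0.84, 1 / 2),  # F1 and sparsity from the abstract
    Model("hypothetical A", 0.50, 0.10),         # made-up point, for illustration only
    Model("hypothetical B", 0.70, 0.80),         # made-up point, for illustration only
]

front = pareto_front(models)
for m in front:
    print(f"{m.name}: F1={m.f1:.2f}, relative cost={m.cost:.3f}")
```

Under these assumptions, both models from the abstract remain on the front: one is the cheapest, the other the most accurate, and neither dominates the other.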

Paper Structure

This paper contains 19 sections, 6 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Hardware system overview showing the whole system (top left), the KP-EVS interface board (top right), the host board with IO and power interfaces (bottom left), and a Kapoho Point (KP) board with 8 Loihi 2 chips (only one chip is used in our work) (bottom right).
  • Figure 2: Hardware system pipeline with details of Max10 FPGA interface.
  • Figure 3: Input-patched inference of MCU13B model for a single event frame, which reduces the memory requirements by an order of magnitude by reusing the Loihi 2 neuro-core memories for each patch.
  • Figure 4: Spike (activation) functions (left) and surrogate gradient functions (right) of LIF neurons with binary and graded spikes.
  • Figure 5: Algorithmic-level fall detection benchmarking, at 16 predictions/s, of models compatible with a single Loihi 2 chip, highlighting the tradeoffs between $F_1$ score and computational cost.
  • ...and 1 more figure
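
The distinction between binary and graded spikes mentioned in the Figure 4 caption can be illustrated with a small Python sketch. This is a generic discrete-time LIF formulation assumed for illustration, not the neuron model or parameters used in the paper: the membrane potential decays, integrates the input, and resets to zero on a spike. With binary spikes the neuron emits 1 when it crosses the threshold; with graded spikes it is assumed here to emit the membrane value itself as the spike payload. The names `lif_step` and `run`, and the values of `decay` and `threshold`, are hypothetical.

```python
def lif_step(v, x, decay=0.5, threshold=1.0, graded=False):
    """One discrete-time LIF update (assumed formulation, hard reset to zero).

    Returns the new membrane potential and the emitted spike value.
    """
    v = decay * v + x                      # leak, then integrate the input
    if v >= threshold:
        spike = v if graded else 1.0       # graded: payload carries the magnitude
        v = 0.0                            # reset after spiking
    else:
        spike = 0.0
    return v, spike


def run(inputs, graded):
    """Run a single neuron over a sequence of inputs and collect its spikes."""
    v, out = 0.0, []
    for x in inputs:
        v, s = lif_step(v, x, graded=graded)
        out.append(s)
    return out


inputs = [0.5, 0.5, 2.0, 0.0]
binary_out = run(inputs, graded=False)
graded_out = run(inputs, graded=True)
print(binary_out)
print(graded_out)
```

Both variants spike at the same time step, so the event sparsity is identical in this toy case; the graded variant simply conveys how far above threshold the neuron was, which is the extra information a downstream layer can exploit.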