Unbiased Gradient Estimation for Event Binning via Functional Backpropagation

Jinze Chen, Wei Zhai, Han Han, Tiankai Ma, Yang Cao, Bin Li, Zheng-Jun Zha

TL;DR

This paper proposes a framework for unbiased gradient estimation of arbitrary binning functions. It synthesizes weak derivatives during backpropagation while leaving the forward output unchanged, and it demonstrates broad benefits for event-based visual perception.

Abstract

Event-based vision encodes dynamic scenes as asynchronous spatio-temporal spikes called events. To leverage conventional image processing pipelines, events are typically binned into frames. However, binning functions are discontinuous, which truncates gradients at the frame level and forces most event-based algorithms to rely solely on frame-based features. Attempts to directly learn from raw events avoid this restriction but instead suffer from biased gradient estimation due to the discontinuities of the binning operation, ultimately limiting their learning efficiency. To address this challenge, we propose a novel framework for unbiased gradient estimation of arbitrary binning functions by synthesizing weak derivatives during backpropagation while keeping the forward output unchanged. The key idea is to exploit integration by parts: lifting the target functions to functionals yields an integral form of the derivative of the binning function during backpropagation, where the cotangent function naturally arises. By reconstructing this cotangent function from the sampled cotangent vector, we compute weak derivatives that provably match long-range finite differences of both smooth and non-smooth targets. Experimentally, our method improves simple optimization-based egomotion estimation with 3.2\% lower RMS error and 1.57$\times$ faster convergence. On complex downstream tasks, we achieve 9.4\% lower EPE in self-supervised optical flow, and 5.1\% lower RMS error in SLAM, demonstrating broad benefits for event-based visual perception. Source code can be found at https://github.com/chjz1024/EventFBP.
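The gradient problem described in the abstract can be illustrated with a small 1D toy, given here as a sketch rather than the paper's setup. It assumes nearest-neighbor binning and a linear loss on the binned frame; the names `bin_nearest`, `loss`, and `finite_difference` are hypothetical and do not come from the EventFBP repository. Because rounding is piecewise constant, the pointwise derivative with respect to the event coordinates is zero almost everywhere, while a finite difference taken over a range of several pixels still reveals how the loss responds to motion.

```python
import numpy as np

def bin_nearest(x, n_bins):
    """Bin 1D event coordinates by rounding to the nearest pixel (discontinuous in x)."""
    idx = np.clip(np.rint(x).astype(int), 0, n_bins - 1)
    return np.bincount(idx, minlength=n_bins).astype(float)

def loss(shift, events, weights):
    """Linear loss on the binned frame; `weights` plays the role of the cotangent vector."""
    frame = bin_nearest(events + shift, len(weights))
    return float(frame @ weights)

def finite_difference(f, t, step):
    """Central finite difference of f at t over a total range of `step`."""
    return (f(t + step / 2) - f(t - step / 2)) / step

# Toy data (assumed for illustration only).
events = np.array([2.2, 3.1, 5.4])
weights = np.arange(10, dtype=float)  # linear ramp: moving events right raises the loss

f = lambda s: loss(s, events, weights)

# Short range: no event crosses a pixel boundary, so the difference vanishes,
# matching the zero pointwise derivative of the rounding operation.
fd_short = finite_difference(f, 0.0, 1e-4)

# Long range: every event moves by two pixels in total, exposing the true trend.
fd_long = finite_difference(f, 0.0, 2.0)

print(fd_short, fd_long)  # 0.0 3.0
```

The long-range value equals the number of events times the slope of the ramp, which is the trend a useful gradient estimate should capture.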

Paper Structure (25 sections, 7 theorems, 38 equations, 7 figures, 6 tables, 1 algorithm)

Key Result

Theorem 1

Let $\mathcal{H}(X)\overset{f}{\to}\mathcal{H}(Y)\overset{g}{\to}\mathcal{H}(Z)$ where $f$ and $g$ are differentiable functionals, then the composite functional $g\circ f$ is also differentiable and has the representation:

Figures (7)

  • Figure 1: Event binning and the gradient bias problem. Top: Spatio-temporal event clouds (Events) are geometrically warped using motion parameters (Params) and aggregated into an Image of Warped Events (IWE) using a binning function. Bottom Left (Forward): The warping function $P(\theta;E)$ transforms input events $E$ and parameters $\theta$ into warped coordinates. These are processed by a discontinuous binning function $h_d$ to produce the IWE. Bottom Right (Backward): The adjoint (cotangent vector) of the IWE, denoted as $v_{h_d}$, is propagated back to update parameters. However, the discontinuity of $h_d$ results in non-computable Dirac delta functions when computing the gradient $v_p$ for the warped events. Result: The computed backpropagation gradient $G_{bp}$ deviates from the true finite difference gradient $G_{fd}$, shown in the contour plot as a "Biased!" estimation.
  • Figure 2: The proposed Functional Backpropagation (FBP) framework. To resolve discontinuities in the ordinary path (top, discrete binning $h_d$), we lift the operation to a functional space (bottom, continuous binning $h_c$). FBP bridges the two by reconstructing the continuous cotangent function $v_{h_c}$ from the discrete samples $v_{h_d}$. Using integration by parts, we replace the undefined Dirac delta evaluation with a convolution ($*$) of $v_{h_c}$ and the kernel derivative, synthesizing an unbiased gradient $v_p$.
  • Figure 3: Bias analysis results. The numerical (finite-difference) gradients are subtracted from the analytical gradients to obtain the finite-difference bias, which is colored by the sign of the finite-difference gradient $G_{fd}$.
  • Figure 4: Optimization results for motion estimation, with combined bar charts showing RMS estimation accuracy and line charts showing the mean convergence time for every $N_e=20000$ events.
  • Figure 5: Left: Performance averaged over all DSEC sequences. Right: Predicted optical flow (ERAFT only for qualitative visualization). Our method is more robust and produces fewer artifacts.
  • ...and 2 more figures
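
The idea in the Figure 2 caption can be sketched in 1D under two assumptions that are not confirmed as the paper's choices: the continuous binning kernel is a unit-area box of width `width`, and the cotangent function is reconstructed from its per-pixel samples by linear interpolation. With these choices the loss is $L(x)=\int v(y)\,k(y-x)\,dy$, and the integral form of the derivative reduces to $\big(v(x+w/2)-v(x-w/2)\big)/w$, a finite difference of the reconstructed cotangent over the kernel width. The function names below are hypothetical.

```python
import numpy as np

def cotangent_fn(y, v_samples):
    """Reconstruct a continuous cotangent function from per-pixel samples.

    Linear interpolation is an assumption made for this sketch.
    """
    grid = np.arange(len(v_samples), dtype=float)
    return np.interp(y, grid, v_samples)

def weak_grad_box(x, v_samples, width=1.0):
    """Derivative of L(x) = integral of v(y) k(y - x) dy for a unit-area box kernel k.

    The kernel derivative is a pair of opposite impulses at the box edges, so the
    integral collapses to a difference of the cotangent function at x +/- width/2.
    """
    upper = cotangent_fn(x + width / 2, v_samples)
    lower = cotangent_fn(x - width / 2, v_samples)
    return (upper - lower) / width

# Linear cotangent samples with slope 2: the synthesized gradient recovers the slope.
ramp = 2.0 * np.arange(10, dtype=float) + 1.0
g_ramp = weak_grad_box(2.2, ramp)

# Step-shaped cotangent samples: the pointwise derivative is zero on the flat parts,
# yet an event sitting at the step receives a nonzero gradient.
step = np.array([0.0] * 5 + [1.0] * 5)
g_step = weak_grad_box(4.5, step)

print(g_ramp, g_step)
```

Only the backward computation is sketched here; the forward pass would still use the original discrete binning, consistent with the statement that the forward output is unchanged. Events near the frame border would need separate handling, since `np.interp` clamps values outside the sample grid.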

Theorems & Definitions (16)

  • Definition 1
  • Theorem 1
  • Definition 2: Fréchet
  • Theorem 2: Fermat
  • proof
  • Theorem 3: The Chain Rule
  • proof
  • Theorem 4: Riesz
  • proof
  • Definition 3: Functional Forward Mode AD
  • ...and 6 more