
KirchhoffNet: A Scalable Ultra Fast Analog Neural Network

Zhengqi Gao, Fan-Keng Sun, Ron Rohrer, Duane S. Boning

TL;DR

KirchhoffNet introduces a physics-grounded, analog RC-circuit–style neural network where inputs are initial node voltages and outputs are voltages at a readout time, with edge-based nonlinear currents driving the dynamics $\mathbf{C}\dot{\mathbf{v}}+\mathbf{G}\mathbf{v}=\mathbf{b}$. Trained via the adjoint method, it supports time-sliced layers that can change topology across layers, using FC-like and convolution-like NE layers plus Proj extensions, enabling deep architectures without traditional layers. Empirical results across regression, image classification, and generation/density matching show competitive performance with existing Neural ODEs and graph-based models, while highlighting the potential for rapid, hardware-efficient inference. The work argues for analog hardware as a scalable paradigm for large-scale neural networks, outlining practical pathways and challenges toward circuit-level realization and fabrication. Overall, KirchhoffNet offers a promising new direction that merges physical circuit dynamics with neural computation to achieve fast, energy-aware AI accelerators.

Abstract

In this paper, we leverage a foundational principle of analog electronic circuitry, Kirchhoff's current and voltage laws, to introduce a distinctive class of neural network models termed KirchhoffNet. Essentially, KirchhoffNet is an analog circuit that can function as a neural network, utilizing its initial node voltages as the neural network input and the node voltages at a specific time point as the output. The evolution of node voltages within the specified time is dictated by learnable parameters on the edges connecting nodes. We demonstrate that KirchhoffNet is governed by a set of ordinary differential equations (ODEs), and notably, even in the absence of traditional layers (such as convolution layers), it attains state-of-the-art performance across diverse and complex machine learning tasks. Most importantly, KirchhoffNet can potentially be implemented as a low-power analog integrated circuit, leading to an appealing property -- irrespective of the number of parameters within a KirchhoffNet, its on-chip forward calculation can always be completed within a short time. This characteristic makes KirchhoffNet a promising and fundamental paradigm for implementing large-scale neural networks, opening a new avenue in analog neural networks for AI.
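The input/output convention described above (initial node voltages in, node voltages at time $T$ out) can be illustrated with a small numerical sketch. It is not the authors' implementation: it assumes the linear form $\mathbf{C}\dot{\mathbf{v}}+\mathbf{G}\mathbf{v}=\mathbf{b}$ quoted in the TL;DR, a diagonal capacitance matrix, and a plain forward-Euler integrator, whereas the paper's edges carry learnable nonlinear currents. The names `simulate`, `c_diag`, and `steps` are hypothetical.

```python
def simulate(v0, c_diag, G, b, T, steps=1000):
    """Integrate C dv/dt + G v = b from t=0 to t=T with forward Euler.

    Assumptions (not from the paper): C is diagonal with entries c_diag,
    G is a dense list-of-lists matrix, and the step size is fixed.
    """
    n = len(v0)
    v = list(v0)          # network input: node voltages at t = 0
    dt = T / steps
    for _ in range(steps):
        # dv_i/dt = (b_i - sum_j G_ij v_j) / C_ii
        dv = [
            (b[i] - sum(G[i][j] * v[j] for j in range(n))) / c_diag[i]
            for i in range(n)
        ]
        v = [v[i] + dt * dv[i] for i in range(n)]
    return v              # network output: node voltages at t = T


# Two nodes joined by a unit conductance: charge flows until the voltages equalize.
g = 1.0
G = [[g, -g], [-g, g]]
v_T = simulate(v0=[1.0, 0.0], c_diag=[1.0, 1.0], G=G, b=[0.0, 0.0], T=1.0)
```

In this toy case the total charge is conserved, so the two voltages keep summing to 1 while their difference decays roughly as $e^{-2t}$. A hardware realization would obtain the same readout by letting the circuit settle physically rather than stepping an integrator.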

Paper Structure (13 sections, 15 equations, 10 figures, 2 tables)


Figures (10)

  • Figure 1: An example of KirchhoffNet with $N=7$. The dashed grey lines represent fixed capacitive branches, while the solid black lines represent learnable non-linear branches. The node voltages $\mathbf{v}$ at $t=0$ and $t=T$ are respectively taken as the input and the output.
  • Figure 2: An example of a 3-layer KirchhoffNet with a 7-dimensional input and a 6-dimensional output. Note that at $t=T$ and $2T$, KirchhoffNet topology can change, such as the addition or removal of nodes and modifications to edge connections. In the case of new nodes being added, their associated node voltages are initialized to zero (e.g., $v_8(T)=v_9(T)=0$). The preceding Fig. 1 depicts a 1-layer KirchhoffNet with a 7-dimensional input and output.
  • Figure 3: An illustration of a fully connected (FC) layer (left) and a neighbor-emphasizing (NE) layer (right) in a KirchhoffNet. Red edges denote connections from nodes with smaller indices to those with larger indices, while blue edges represent the opposite. In the right figure, a kernel slides across the given grid mesh, and for every possible kernel position, all nodes inside the kernel are fully connected. This results in edges between two nodes if they lie within a kernel. For instance, $n_1$ and $n_2$ are connected twice, while there is no connection between $n_1$ and $n_9$.
  • Figure 4: Simplified code implementations of a fully connected layer (FC layer), a neighbor emphasizing layer (NE layer), and a projection layer (Proj layer).
  • Figure 5: Left: The schematic of a composite device made up of a conductance, a current source, and a one-sided switch, to realize the function $g$ shown in Eq. (\ref{eq:non_linear_iv_relu2}). Right: Two voltage-controlled current sources (VCCSs), an independent current source, and a one-sided switch are needed to realize the function $g$ shown in Eq. (\ref{eq:non_linear_iv_relu3}). Note that the one-sided switch here is ideal, completely cutting off the current flowing from $n_d$ to $n_s$ while allowing the current to flow from $n_s$ to $n_d$.
  • ...and 5 more figures
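
The neighbor-emphasizing (NE) connectivity described in the Figure 3 caption can be sketched as a small edge-enumeration routine. This is an illustrative reading of the caption, not the code shown in the paper's Figure 4. It assumes a 3×3 grid with row-major node numbering $n_1,\dots,n_9$, a 2×2 kernel with stride 1 and no padding, and that "connected twice" means one directed edge in each direction (the red and blue edges). The function name `ne_layer_edges` is hypothetical.

```python
from itertools import permutations


def ne_layer_edges(rows, cols, k):
    """Return the set of directed edges (src, dst) of an NE-style layer.

    Assumptions (not from the paper): nodes are numbered 1..rows*cols in
    row-major order, the k-by-k kernel slides with stride 1 and no padding,
    and a node pair covered by several kernel positions is counted once
    per direction.
    """
    edges = set()
    for top in range(rows - k + 1):
        for left in range(cols - k + 1):
            # Nodes covered by the kernel at this position.
            inside = [
                r * cols + c + 1
                for r in range(top, top + k)
                for c in range(left, left + k)
            ]
            # Fully connect them: one edge per ordered pair of distinct nodes.
            edges.update(permutations(inside, 2))
    return edges


edges = ne_layer_edges(rows=3, cols=3, k=2)
n1_n2_linked = (1, 2) in edges and (2, 1) in edges   # both directions present
n1_n9_linked = (1, 9) in edges or (9, 1) in edges    # never share a kernel
```

Under these assumptions, $n_1$ and $n_2$ are linked in both directions, while $n_1$ and $n_9$ share no kernel position and stay unconnected, which matches the caption's example. Passing `k` equal to the grid size connects every pair of nodes, which corresponds to the FC layer in the left panel.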