Geometric Algebra Planes: Convex Implicit Neural Volumes

Irmak Sivgin, Sara Fridovich-Keil, Gordon Wetzstein, Mert Pilanci

TL;DR

In the 2D setting, GA-Planes is proved equivalent to a low-rank plus low-resolution matrix factorization, and this approximation outperforms the classic low-rank plus sparse decomposition when fitting a natural image.

Abstract

Volume parameterizations abound in recent literature, from the classic voxel grid to the implicit neural representation and everything in between. While implicit representations have shown impressive capacity and better memory efficiency compared to voxel grids, to date they require training via nonconvex optimization. This nonconvex training process can be slow to converge and sensitive to initialization and hyperparameter choices that affect the final converged result. We introduce a family of models, GA-Planes, that is the first class of implicit neural volume representations that can be trained by convex optimization. GA-Planes models include any combination of features stored in tensor basis elements, followed by a neural feature decoder. They generalize many existing representations and can be adapted for convex, semiconvex, or nonconvex training as needed for different inverse problems. In the 2D setting, we prove that GA-Planes is equivalent to a low-rank plus low-resolution matrix factorization; we show that this approximation outperforms the classic low-rank plus sparse decomposition for fitting a natural image. In 3D, we demonstrate GA-Planes' competitive performance in terms of expressiveness, model size, and optimizability across three volume fitting tasks: radiance field reconstruction, 3D segmentation, and video segmentation.
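To make the "low-rank plus low-resolution" idea from the 2D setting concrete, the following minimal sketch approximates a matrix as the sum of a truncated-SVD term and a coarse grid upsampled by block repetition. The alternating fitting scheme, the helper names (`low_rank`, `low_res`, `low_rank_plus_low_res`), and the synthetic test matrix are all assumptions made for illustration; this is not the training procedure used in the paper.

```python
import numpy as np

def low_rank(M, rank):
    """Best rank-`rank` approximation of M via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def low_res(M, factor):
    """Block-average M onto a coarse grid, then upsample by repetition."""
    h, w = M.shape
    coarse = M.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return np.kron(coarse, np.ones((factor, factor)))

def low_rank_plus_low_res(M, rank, factor, n_iters=20):
    """Hypothetical alternating fit: each component fits the other's residual."""
    L = np.zeros_like(M)
    R = np.zeros_like(M)
    for _ in range(n_iters):
        L = low_rank(M - R, rank)    # low-rank component
        R = low_res(M - L, factor)   # low-resolution component
    return L, R

# Synthetic stand-in for an image: a smooth rank-one part plus a blocky part.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 64)
M = np.outer(np.sin(4 * x), np.cos(3 * x)) + np.kron(
    rng.normal(size=(8, 8)), np.ones((8, 8))
)

L, R = low_rank_plus_low_res(M, rank=2, factor=8)
err_combined = np.linalg.norm(M - (L + R))
err_low_rank = np.linalg.norm(M - low_rank(M, 2))
print(err_combined, err_low_rank)
```

Because block averaging is an orthogonal projection and truncated SVD is the optimal rank-constrained fit, each alternating step cannot increase the residual, so the combined error is never worse than the low-rank term alone at the same rank.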

Paper Structure

This paper contains 46 sections, 4 theorems, 43 equations, 19 figures, 6 tables.

Key Result

Theorem 1

The two-dimensional representation $\text{D} {(\mathbf{e}_1 + \mathbf{e}_2)}$ with linear decoder $\text{D} {(f(q))} = \alpha^T f(q)$ is equivalent to a low-rank matrix completion model with the following structure: These two models are equivalent in the sense that $U^* = \mathbf{g}_1^* \text{diag}(\alpha^*)$ and $V^* = \mathbf{g}_2^* \text{diag}(\alpha^*)$ where $U^*, V^*$ is the optimal s
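As a small numerical illustration of why the additive model with a linear decoder is low-rank, the sketch below evaluates $\alpha^T f(q)$ on a full grid, taking $f(q) = \mathbf{g}_1[i] + \mathbf{g}_2[j]$ at grid point $q = (i, j)$. The nearest-grid-point lookup (no interpolation), the grid sizes, and the random values are assumptions for illustration only. The sketch shows a consequence of the stated model form and does not restate the theorem's full structure.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d = 32, 48, 5            # grid sizes and feature dimension (illustrative)
g1 = rng.normal(size=(n, d))   # line features along the first axis
g2 = rng.normal(size=(m, d))   # line features along the second axis
alpha = rng.normal(size=d)     # linear decoder weights

# Decoder output alpha^T (g1[i] + g2[j]) at every grid point (i, j).
M = (g1[:, None, :] + g2[None, :, :]) @ alpha   # shape (n, m)

# The same matrix written as a sum of two rank-one terms.
M_factored = np.outer(g1 @ alpha, np.ones(m)) + np.outer(np.ones(n), g2 @ alpha)

rank = np.linalg.matrix_rank(M)
print(rank, np.allclose(M, M_factored))
```

Under these assumptions the decoded matrix has rank at most 2 regardless of the feature dimension `d`, since each term varies along only one axis.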

Figures (19)

  • Figure 1: Overview of the GA-Planes models we use in our experiments. Our nonconvex model (top) uses a standard MLP decoder and multiplication of features when the result yields a volume under geometric algebra; it also concatenates features across multi-resolution grids. Our semiconvex (middle) and convex (bottom) models use a single resolution for each feature grid, and avoid multiplication of features since that would induce nonconvexity. The pastel-colored grids inside the indicator function of the convex model are frozen at initialization and used as fixed ReLU gating patterns. $\odot$ denotes concatenation and $\circ$ denotes elementwise multiplication.
  • Figure 2: 2D image fitting experiments with the astronaut image from SciPy, validating matrix completion analysis summarized in Table \ref{tab:ranks}. We compare 2D GA-Planes models of the form $\text{D} {(\mathbf{e}_1 \circ \mathbf{e}_2)}$ (solid colorful lines) and $\text{D} {(\mathbf{e}_1 + \mathbf{e}_2)}$ (dotted colorful lines) with the optimal low-rank approximation provided by singular value decomposition (solid black line).
  • Figure 3: For a natural image, approximation as a sum of low rank and low resolution components (green points and subfigure b) achieves higher fidelity compared to the classic matrix decomposition as a sum of low rank and sparse components (blue points and subfigure c), with the same parameter budget (18.75% of the original image size, for subfigures b and c). The GA-Planes model family generalizes the idea of a low rank plus low resolution approximation to three dimensions.
  • Figure 4: Results on radiance field reconstruction. Nonconvex GA-Planes (with feature multiplication) offers the most efficient representation: when the model is large it performs comparably to the state of the art models, but when model size is reduced it retains higher performance than other models. Here all models are trained for the same number of epochs on all 8 scenes from the Blender dataset, and the average results are shown.
  • Figure 5: Rendering comparison for the chair scene: TensoRF on the left (0.32 M parameters), K-Planes in the middle (0.39 M parameters), GA-Planes on the right (0.25 M parameters).
  • ...and 14 more figures

Theorems & Definitions (4)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4