Cellular Learning: Scattered Data Regression in High Dimensions via Voronoi Cells

Shankar Prasad Sastry

TL;DR

The paper tackles regression of scattered data in high dimensions by introducing cellular learning, a method that blends per-cell linear functions centered at seed vertices to produce a continuous, piecewise-smooth regression without explicit Voronoi diagram computation. It leverages Lloyd's algorithm for seed initialization and Adam optimization to fit parameters, with regularization on coefficients and blending to improve generalization. Empirical results on MNIST with one-vs-rest classification achieve up to 98.20% accuracy using 46 cells and 722k parameters, demonstrating scalability and competitive performance without augmentation. The work offers a scalable, interpretable alternative to nonlinear models and outlines avenues for hierarchical and hyperspherical extensions to further improve efficiency and expressiveness.

Abstract

I present a regression algorithm that provides a continuous, piecewise-smooth function approximating scattered data. It is based on composing and blending linear functions over Voronoi cells, and it scales to high dimensions. The algorithm infers Voronoi cells from seed vertices and constructs a linear function for the input data in and around each cell. As the algorithm does not explicitly compute the Voronoi diagram, it avoids the curse of dimensionality. An accuracy of around 98.2% on the MNIST dataset with 722,200 degrees of freedom (without data augmentation, convolution, or other geometric operators) demonstrates the applicability and scalability of the algorithm.
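
The claim that cells can be inferred from seed vertices without building the diagram rests on the definition of a Voronoi cell: a point belongs to the cell of its nearest seed. The sketch below illustrates only that membership test, not the paper's full algorithm. The function name `cell_index`, the random seeds, and the sizes (46 seeds in 784 dimensions, echoing the MNIST setting) are assumptions chosen for illustration.

```python
import numpy as np

def cell_index(points, seeds):
    """Return, for each point, the index of the nearest seed.

    By definition this is the Voronoi cell containing the point; no
    diagram is constructed, and the cost is O(n_points * n_seeds * dim).
    """
    # Pairwise squared Euclidean distances, shape (n_points, n_seeds).
    d2 = ((points[:, None, :] - seeds[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

# Hypothetical data: 46 random seeds in 784 dimensions.
rng = np.random.default_rng(0)
seeds = rng.normal(size=(46, 784))
# Two query points placed very close to seeds 3 and 10.
points = seeds[[3, 10]] + 1e-3 * rng.normal(size=(2, 784))
print(cell_index(points, seeds))
```

The cost grows linearly with the dimension, whereas the combinatorial size of an explicit Voronoi diagram can grow exponentially with it.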

Paper Structure

This paper contains 18 sections, 10 equations, 3 figures, 4 tables, 1 algorithm.

Figures (3)

  • Figure 1: An example of a Voronoi diagram in 2D for a given set of vertices. Note that every line segment or ray (dashed lines) in the diagram is a segment from the perpendicular bisector of a pair of vertices.
  • Figure 2: Given a point $\bm{p}$, how can one find its distance from the boundary of the Voronoi cell of vertex $\bm{c_0}$? The solution is to consider each of the other vertices ($\bm{c_1}$, $\bm{c_2}$, and $\bm{c_3}$) in turn and find the hyperplane that perpendicularly bisects the segment joining $\bm{c_0}$ and that vertex. In the panels above, the dashed lines are these hyperplanes. We find the distance from $\bm{p}$ to each hyperplane along the line joining $\bm{p}$ and $\bm{c_0}$, which requires solving a linear equation. Among the points of intersection that lie on the line segment $\bm{pc_0}$, we keep the one closest to $\bm{c_0}$. The intersection points in (a) and (b) are both considered because they lie on the line segment. In (c), the point $\bm{q_3}$ is not on the line segment $\bm{pc_0}$, so it is discarded. Since $\left|\bm{c_0q_2}\right| < \left|\bm{c_0q_1}\right|$, $\left|\bm{pq_2}\right|$ is taken as the distance from $\bm{p}$ to the boundary of the Voronoi cell of $\bm{c_0}$.
  • Figure 3: Function blending: The vertex $\bm{c_0}$ is one of the seed vertices and the solid polygon around it is the boundary of the Voronoi cell of $\bm{c_0}$. Inside the cell, the relative weight of the linear function associated with the cell is $1$. The relative weight gradually reduces to $0$ as we move toward the dashed outer polygon. Outside the dashed outer polygon, the relative weight is $0$. For example, the weight is $1$ on the line segment $\bm{c_0x}$, and it linearly reduces to $0$ along the line segment $\bm{xx'}$. Note that $\left\|\bm{xx'}\right\| = \alpha_i \left\|\bm{c_0x}\right\|$, where $\alpha_i$ is the blending parameter.
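
The constructions in Figures 2 and 3 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the parametrization $\bm{x}(t) = \bm{c_0} + t(\bm{p} - \bm{c_0})$, and the toy seeds are all hypothetical. Substituting $\bm{x}(t)$ into the bisector equation of $\bm{c_0}$ and $\bm{c_j}$ gives $t_j = \|\bm{c_j} - \bm{c_0}\|^2 / \left(2\,(\bm{p} - \bm{c_0}) \cdot (\bm{c_j} - \bm{c_0})\right)$; intersections with $0 < t_j \le 1$ lie on the segment, and the smallest such $t_j$ marks the cell boundary. The weight returned is the unnormalized relative weight described in Figure 3.

```python
import numpy as np

def boundary_crossing(p, c0, others):
    """Smallest t in (0, 1] at which c0 + t*(p - c0) meets a bisector
    hyperplane of c0 and another seed; None if p is inside c0's cell."""
    d = p - c0                          # direction from the seed to p
    n = others - c0                     # bisector normals, shape (k, dim)
    num = np.einsum("ij,ij->i", n, n)   # |c_j - c0|^2
    denom = 2.0 * (n @ d)               # 2 (p - c0) . (c_j - c0)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Hyperplanes with denom <= 0 are never reached along the ray.
        t = np.where(denom > 0, num / denom, np.inf)
    t = t[t <= 1.0]                     # keep intersections on segment p-c0
    return t.min() if t.size else None

def distance_to_boundary(p, c0, others):
    """Distance from p to the boundary of c0's cell, measured along p-c0."""
    t = boundary_crossing(p, c0, others)
    return 0.0 if t is None else (1.0 - t) * np.linalg.norm(p - c0)

def relative_weight(p, c0, others, alpha):
    """1 inside the cell, decaying linearly to 0 over a band of width
    alpha * |c0 x|, where x is the boundary point on the ray through p."""
    t = boundary_crossing(p, c0, others)
    if t is None:
        return 1.0
    # |x p| / (alpha * |c0 x|) = (1 - t) / (alpha * t)
    return max(0.0, 1.0 - (1.0 - t) / (alpha * t))

# Toy 2D example: the bisector between c0 and (2, 0) is the line x = 1.
c0 = np.array([0.0, 0.0])
others = np.array([[2.0, 0.0], [0.0, 2.0]])
for p in ([0.5, 0.0], [1.5, 0.0], [2.5, 0.0]):
    p = np.array(p)
    print(p, distance_to_boundary(p, c0, others),
          relative_weight(p, c0, others, alpha=1.0))
```

In the toy example, the first query point lies inside the cell, the second falls halfway through the blending band, and the third is beyond it. Each query costs one dot product per seed, so nothing here depends on enumerating the faces of the Voronoi diagram.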