Solving Large-scale Spatial Problems with Convolutional Neural Networks
Damian Owerko, Charilaos I. Kanatsoulis, Alejandro Ribeiro
TL;DR
The paper tackles large-scale spatial problems by combining the shift-equivariance of CNNs with transfer learning: a network is trained on small signal windows and then evaluated on much larger inputs. The authors derive a theoretical bound showing that a CNN trained on windowed data has controlled generalization error when applied to larger signals; the bound depends on the network’s filter norms, its depth, and the window sizes. Spatial problems are recast as image-to-image regression by representing point sets as Gaussian mixtures and sampling them onto grids that a CNN can process. The framework is validated on mobile infrastructure on demand (MID), where it scales zero-shot to hundreds of agents and window sizes up to $1600$ meters with time complexity linear in the area, surpassing previous convex-optimization approaches in scalability. Together, these results establish a principled, efficient route to solving large-scale spatial tasks with CNNs and transfer learning.
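The step of turning a point set into a CNN-ready image can be sketched as follows. This is a minimal illustration only, not the authors' code: the function name `rasterize_points`, the cell size `cell_m`, the kernel width `sigma_m`, and the example agent positions are all assumptions chosen for the sketch. Each point contributes one isotropic Gaussian, and the resulting mixture is sampled at the centers of a regular grid.

```python
import math


def rasterize_points(points, width_m, height_m, cell_m=10.0, sigma_m=20.0):
    """Sample a sum of isotropic 2D Gaussians (one per point) on a regular grid.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    n_cols = int(round(width_m / cell_m))
    n_rows = int(round(height_m / cell_m))
    norm = 1.0 / (2.0 * math.pi * sigma_m ** 2)  # normalizes each Gaussian to unit mass
    image = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        y = (r + 0.5) * cell_m  # y-coordinate of the cell center
        for c in range(n_cols):
            x = (c + 0.5) * cell_m  # x-coordinate of the cell center
            image[r][c] = sum(
                norm * math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * sigma_m ** 2))
                for px, py in points
            )
    return image


# Two hypothetical agent positions (in meters) inside a 200 m x 200 m window.
agents = [(50.0, 50.0), (150.0, 120.0)]
image = rasterize_points(agents, width_m=200.0, height_m=200.0)
# Approximate total mass on the grid; close to the number of points.
total_mass = sum(sum(row) for row in image) * 10.0 * 10.0
```

Because every Gaussian has unit mass, the grid sum times the cell area roughly counts the points inside the window, minus whatever mass falls outside its edges. The nested loops make the cost proportional to the number of cells times the number of points; a practical implementation would likely truncate each Gaussian to a local neighborhood.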
Abstract
Over the past decade, deep learning research has been accelerated by increasingly powerful hardware, which has facilitated rapid growth in model complexity and in the amount of data ingested. This trend is becoming unsustainable, so a renewed focus on efficiency is necessary. In this paper, we employ transfer learning to improve training efficiency for large-scale spatial problems. We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation, and we provide a theoretical bound on the resulting generalization error. Our proof leverages the shift-equivariance of CNNs, a property that is underexploited in transfer learning. The theoretical results are supported experimentally in the context of mobile infrastructure on demand (MID). The proposed approach is able to tackle MID at large scales with hundreds of agents, which was computationally intractable prior to this work.
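The property the abstract relies on can be illustrated with a toy example. The sketch below is an assumption-laden illustration, not the architecture used in the paper: it uses a one-dimensional signal, two hand-picked kernels, zero padding, and ReLU activations, and the names `conv1d_same` and `tiny_cnn` are hypothetical. It shows that the same convolutional weights run unchanged on a longer signal, and that a pattern shifted in the input produces the correspondingly shifted output.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate with zero padding so the output length equals the input length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):  # samples outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def tiny_cnn(signal, kernels):
    """Stack of convolution + ReLU layers; the weights do not depend on signal length."""
    x = list(signal)
    for kernel in kernels:
        x = [max(0.0, v) for v in conv1d_same(x, kernel)]
    return x


# Arbitrary illustrative weights (assumed, not learned).
kernels = [[0.25, 0.5, 0.25], [-0.5, 1.0, -0.5]]

# A small "training-size" window with one impulse at index 3.
small = [0.0] * 8
small[3] = 1.0

# A larger signal containing the same pattern shifted by 10 samples.
shift = 10
large = [0.0] * 32
large[3 + shift] = 1.0

out_small = tiny_cnn(small, kernels)
out_large = tiny_cnn(large, kernels)

# On the matching window, the large-signal output reproduces the small-signal output.
window_matches = out_large[shift:shift + len(small)] == out_small
```

In this toy case the match is exact because the pattern lies far enough from the window edges that zero padding never influences it. For general signals, the content outside a window does affect the output near its borders, and the paper's theoretical bound on generalization error covers this windowed-training setting.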
