LoFi: Neural Local Fields for Scalable Image Reconstruction
AmirEhsan Khorashadizadeh, Tobías I. Liaudat, Tianlin Liu, Jason D. McEwen, Ivan Dokmanić
TL;DR
LoFi addresses scalable inverse-problem imaging by using a coordinate-based local-field approach that predicts each pixel from a local neighborhood via an MLP. It achieves continuous, resolution-agnostic outputs with memory usage largely independent of image size, enabling high-resolution training on modest hardware. Architectural components include learnable patch geometry, differentiable patch extraction, coordinate-conditioned patch localization (CCPG), and a learnable Fourier-domain noise filter, with extensions INR-LoFi and LoFi-ADMM. Empirically, LoFi matches or surpasses CNNs and ViTs on LDCT denoising and dark matter mapping, generalizes well to out-of-distribution data, and supports high-resolution reconstruction with reduced memory and flexible plug-and-play priors.
Abstract
Neural fields or implicit neural representations (INRs) have attracted significant attention in computer vision and imaging due to their efficient coordinate-based representation of images and 3D volumes. In this work, we introduce a coordinate-based framework for solving imaging inverse problems, termed LoFi (Local Field). Unlike conventional methods for image reconstruction, LoFi processes local information at each coordinate separately by multi-layer perceptrons (MLPs), recovering the object at that specific coordinate. Similar to INRs, LoFi can recover images at any continuous coordinate, enabling image reconstruction at multiple resolutions. With comparable or better performance than standard deep learning models like convolutional neural networks (CNNs) and vision transformers (ViTs), LoFi achieves excellent generalization to out-of-distribution data with memory usage almost independent of image resolution. Remarkably, training on 1024x1024 images requires less than 200MB of memory -- much below standard CNNs and ViTs. Additionally, LoFi's local design allows it to train on extremely small datasets with 10 samples or fewer, without overfitting and without the need for explicit regularization or early stopping.
