PARMESAN: Parameter-Free Memory Search and Transduction for Dense Prediction Tasks
Philip Matthias Winter, Maria Wimmer, David Major, Dimitrios Lenis, Astrid Berg, Theresa Neubauer, Gaia Romana De Paolis, Johannes Novotny, Sophia Ulonska, Katja Bühler
TL;DR
PARMESAN addresses the need for flexible adaptation of dense prediction models without continual parameter updates. It introduces a parameter-free transduction framework that uses a frozen feature extractor, a memory module, and hierarchical memory search to predict labels, with an intra-query message passing refinement and a novelty-based sparsity scheme for memory efficiency. The approach enables rapid learning and unlearning via memory consolidation, demonstrated in continual learning setups where it achieves 3-4 orders of magnitude faster learning than baselines while maintaining competitive predictive performance and stable knowledge retention. The method generalizes across 1D, 2D, and 3D grid data and is demonstrated on semantic segmentation and depth estimation tasks, highlighting its practical impact for test-time learning, domain adaptation, and few-shot scenarios.
Abstract
This work addresses flexibility in deep learning by means of transductive reasoning. For adaptation to new data and tasks, e.g., in continual learning, existing methods typically involve tuning learnable parameters or complete re-training from scratch, rendering such approaches inflexible in practice. We argue that the notion of separating computation from memory by means of transduction can act as a stepping stone for solving these issues. We therefore propose PARMESAN (parameter-free memory search and transduction), a scalable method which leverages a memory module for solving dense prediction tasks. At inference, hidden representations in memory are searched to find corresponding patterns. In contrast to other methods that rely on continuous training of learnable parameters, PARMESAN learns via memory consolidation simply by modifying stored contents. Our method is compatible with commonly used architectures and canonically transfers to 1D, 2D, and 3D grid-based data. The capabilities of our approach are demonstrated on the complex task of continual learning. PARMESAN learns 3-4 orders of magnitude faster than established baselines while being on par in terms of predictive performance, hardware-efficiency, and knowledge retention.
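The idea of learning by modifying stored contents rather than parameters can be illustrated with a toy sketch. The code below is not the PARMESAN implementation: the names `ToyMemory` and `toy_extractor` are hypothetical, the hand-written extractor stands in for a frozen network, and a flat cosine-similarity nearest-neighbour search replaces the hierarchical memory search described above. It only shows that, with a fixed extractor, learning amounts to adding entries to memory and unlearning amounts to deleting them.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_extractor(signal):
    """Hypothetical stand-in for a frozen feature extractor.

    Maps every position of a 1D signal with values in [0, 1]
    to a 2D feature vector. It has no trainable parameters.
    """
    return [(v, 1.0 - v) for v in signal]


class ToyMemory:
    """Stores (feature, label, sample_id) triples, one per grid position."""

    def __init__(self):
        self.entries = []

    def learn(self, signal, labels, sample_id):
        # Learning = writing features and their dense labels to memory.
        for feat, label in zip(toy_extractor(signal), labels):
            self.entries.append((feat, label, sample_id))

    def unlearn(self, sample_id):
        # Unlearning = deleting everything that came from one sample.
        self.entries = [e for e in self.entries if e[2] != sample_id]

    def predict(self, signal):
        # Transduction: each query position copies the label of its
        # most similar memory entry (flat search, for illustration only).
        if not self.entries:
            raise ValueError("memory is empty")
        preds = []
        for query in toy_extractor(signal):
            best = max(self.entries, key=lambda e: cosine(query, e[0]))
            preds.append(best[1])
        return preds


memory = ToyMemory()
memory.learn([0.9, 0.1], ["fg", "bg"], sample_id="s1")
print(memory.predict([0.8, 0.2, 0.7]))

memory.learn([0.5], ["edge"], sample_id="s2")  # learn a new class
print(memory.predict([0.55]))

memory.unlearn("s2")                           # forget it again
print(memory.predict([0.55]))
```

In this sketch neither `learn` nor `unlearn` performs any gradient step, which is the property the abstract attributes to memory consolidation. A practical system would also need an efficient search structure and a way to limit memory growth, both of which are omitted here.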
