A Strong Baseline for Molecular Few-Shot Learning
Philippe Formont, Hugo Jeannin, Pablo Piantanida, Ismail Ben Ayed
TL;DR
The paper addresses molecular few-shot learning under data scarcity by revisiting simple fine-tuning instead of meta-learning. It introduces a quadratic-probe classifier based on the Mahalanobis distance, parameterized by class prototypes $w_k$ and precision matrices $M_k$. The probe is optimized by block-coordinate descent on a shrinkage-regularized surrogate objective that prevents degenerate covariance solutions, and it runs on top of a multitask GNN backbone pretrained on FS-Mol. On FS-Mol and under out-of-domain shifts, the quadratic probe and the linear probe match or outperform state-of-the-art meta-learning approaches, while remaining applicable in black-box settings. The paper also reports extensive ablations and domain-shift benchmarks, including imbalanced QSAR targets and large-scale HTS library screening. Among the proposed classifiers, the quadratic probe delivers the best average gains and shows strong resilience to distribution shifts. The authors release their code for reproducibility.
Abstract
Few-shot learning has recently attracted significant interest in drug discovery, with a fast-growing literature mostly involving convoluted meta-learning strategies. We revisit the more straightforward fine-tuning approach for molecular data, and propose a regularized quadratic-probe loss based on the Mahalanobis distance. We design a dedicated block-coordinate descent optimizer, which avoids the degenerate solutions of our loss. Interestingly, our simple fine-tuning approach achieves highly competitive performance in comparison to state-of-the-art methods, while being applicable to black-box settings and removing the need for specific episodic pre-training strategies. Furthermore, we introduce a new benchmark to assess the robustness of the competing methods to domain shifts. In this setting, our fine-tuning baseline obtains consistently better results than meta-learning methods.
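To make the idea of a Mahalanobis-based quadratic probe concrete, the following is a minimal NumPy sketch of such a classifier operating on frozen embeddings. It is not the paper's optimizer: instead of block-coordinate descent on the regularized loss, it uses a closed-form stand-in in which each prototype is the class mean and each precision matrix is the inverse of a covariance shrunk toward the identity. The function names, the `shrinkage` parameter, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_quadratic_probe(features, labels, num_classes, shrinkage=0.5):
    """Estimate a prototype and a precision matrix for each class.

    Assumption: closed-form estimates (class mean, shrunk covariance) are
    used here as a simplified stand-in for an iterative optimizer.
    """
    dim = features.shape[1]
    prototypes = np.zeros((num_classes, dim))
    precisions = np.zeros((num_classes, dim, dim))
    for k in range(num_classes):
        class_feats = features[labels == k]
        prototypes[k] = class_feats.mean(axis=0)
        centered = class_feats - prototypes[k]
        emp_cov = centered.T @ centered / len(class_feats)
        # Shrink toward the identity so the covariance stays invertible
        # even when a class has fewer samples than feature dimensions.
        shrunk_cov = (1.0 - shrinkage) * emp_cov + shrinkage * np.eye(dim)
        precisions[k] = np.linalg.inv(shrunk_cov)
    return prototypes, precisions


def predict_proba(features, prototypes, precisions):
    """Class probabilities from squared Mahalanobis distances."""
    diffs = features[:, None, :] - prototypes[None, :, :]  # (n, K, d)
    # Squared Mahalanobis distance of every sample to every prototype.
    maha = np.einsum("nkd,kde,nke->nk", diffs, precisions, diffs)
    _, logdet = np.linalg.slogdet(precisions)  # one log-determinant per class
    logits = -0.5 * maha + 0.5 * logdet
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


# Hypothetical usage: synthetic 8-dimensional "embeddings" for a two-class task.
rng = np.random.default_rng(0)
support_x = np.concatenate([rng.normal(0.0, 1.0, (16, 8)),
                            rng.normal(3.0, 1.0, (16, 8))])
support_y = np.array([0] * 16 + [1] * 16)
w, M = fit_quadratic_probe(support_x, support_y, num_classes=2)
query_probs = predict_proba(support_x, w, M)
```

With `shrinkage=1.0` every precision matrix becomes the identity, and the sketch reduces to a nearest-prototype classifier under the Euclidean distance. Smaller values let each class use its own covariance structure, at the cost of noisier estimates when the support set is small.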
