Occam's model: Selecting simpler representations for better transferability estimation
Prabhant Singh, Sibylle Hess, Joaquin Vanschoren
TL;DR
This work tackles pretrained-model selection for a given target task by introducing two transferability metrics that quantify representational simplicity: Pairwise Normalized Interclass Distance ($INT$) and Concept Variation. The metrics are grounded in a clustering/separability view of the representations and in the irregularity of label distributions; they admit a nearest-centroid/LDA interpretation, connect to neural collapse, and yield a practical transferability score that correlates with fine-tuning performance. Across image-classification, limited-data, and self-supervised learning benchmarks, the proposed metrics often outperform state-of-the-art SITE/SDTE baselines, improving Kendall's tau by up to 32% with favorable wall-clock efficiency. The methods scale to large model zoos and show promise for broader transfer-learning settings; future work will extend them to detection, segmentation, and depth-estimation tasks.
Abstract
Fine-tuning models that have been pre-trained on large datasets has become a cornerstone of modern machine learning workflows. With the widespread availability of online model repositories, such as Hugging Face, it is now easier than ever to fine-tune pre-trained models for specific tasks. This raises a critical question: which pre-trained model is most suitable for a given task? This problem is called transferability estimation. In this work, we introduce two novel and effective metrics for estimating the transferability of pre-trained models. Our approach is grounded in viewing transferability as a measure of how easily a pre-trained model's representations can be trained to separate target classes, providing a unique perspective on transferability estimation. We rigorously evaluate the proposed metrics against state-of-the-art alternatives across diverse problem settings, demonstrating their robustness and practical utility. Additionally, we present theoretical insights that explain our metrics' efficacy and adaptability to various scenarios. We experimentally show that our metrics increase Kendall's tau by up to 32% compared to state-of-the-art baselines.
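To make the evaluation criterion concrete, the following minimal sketch shows how Kendall's tau measures the agreement between a transferability score and fine-tuning accuracy across a set of candidate models. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `kendall_tau`, the tau-a variant (ties count as neither concordant nor discordant), and all numeric values are hypothetical.

```python
from itertools import combinations


def kendall_tau(scores, accuracies):
    """Kendall's tau-a between two equally long sequences.

    tau = (concordant pairs - discordant pairs) / (total pairs).
    Tied pairs count as neither concordant nor discordant.
    """
    if len(scores) != len(accuracies):
        raise ValueError("scores and accuracies must have the same length")
    if len(scores) < 2:
        raise ValueError("need at least two models to compare rankings")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(scores)), 2):
        # A pair is concordant when both quantities order models i and j the same way.
        product = (scores[i] - scores[j]) * (accuracies[i] - accuracies[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1

    n_pairs = len(scores) * (len(scores) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical values for four pre-trained models (not results from the paper).
metric_scores = [0.1, 0.4, 0.3, 0.9]          # transferability score per model
finetune_accuracy = [60.0, 72.0, 75.0, 88.0]  # accuracy after fine-tuning, in percent

tau = kendall_tau(metric_scores, finetune_accuracy)
print(f"Kendall's tau: {tau:.3f}")
```

A tau of 1 means the score ranks the models exactly as fine-tuning does, 0 means no rank agreement, and -1 means the ranking is fully reversed. In this toy example five of the six model pairs are ordered consistently and one is not.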
