Task2Box: Box Embeddings for Modeling Asymmetric Task Relationships
Rangel Daroya, Aaron Sun, Subhransu Maji
TL;DR
Task2Box introduces axis-aligned box embeddings to model asymmetric relationships between tasks and datasets, addressing limitations of symmetric Euclidean representations. By learning a mapping from base task representations (e.g., CLIP, Task2Vec, or attribute vectors) to low-dimensional boxes, the method uses volumetric overlaps to encode containment and transfer affinities. Across iNaturalist+CUB, ImageNet, and Taskonomy, Task2Box outperforms baselines and generalizes to novel tasks, while providing interpretable visualizations of task spaces and dataset relationships. The approach supports dataset discovery and transfer planning, with potential extensions to additional modalities and richer task descriptors via datasheets and model cards.
Abstract
Modeling and visualizing relationships between tasks or datasets is an important step towards solving various meta-tasks such as dataset discovery, multi-tasking, and transfer learning. However, many relationships, such as containment and transferability, are naturally asymmetric, and current approaches for representation and visualization (e.g., t-SNE) do not readily support this. We propose Task2Box, an approach that represents tasks using box embeddings (axis-aligned hyperrectangles in low-dimensional spaces) that can capture asymmetric relationships between tasks through volumetric overlaps. We show that Task2Box accurately predicts unseen hierarchical relationships between nodes in the ImageNet and iNaturalist datasets, as well as transferability between tasks in the Taskonomy benchmark. We also show that box embeddings estimated from task representations (e.g., CLIP, Task2Vec, or attribute-based) can be used to predict relationships between unseen tasks more accurately than both classifiers trained on the same representations and handcrafted asymmetric distances (e.g., KL divergence). This suggests that low-dimensional box embeddings can effectively capture these task relationships and have the added advantage of being interpretable. We use the approach to visualize relationships among publicly available image classification datasets on Hugging Face, a popular dataset hosting platform.
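To make the idea of asymmetric volumetric overlap concrete, the following is a minimal sketch, not the paper's implementation. It assumes hard (non-smoothed) box volumes and hand-picked 2D boxes, whereas Task2Box learns its boxes from base task representations and may use a different volume formulation. The function names `box_volume` and `overlap_score` and the example boxes `birds` and `sparrows` are hypothetical and chosen only for illustration.

```python
import numpy as np


def box_volume(lower, upper):
    """Volume of an axis-aligned box; zero if any side length is non-positive."""
    sides = np.clip(np.asarray(upper, dtype=float) - np.asarray(lower, dtype=float), 0.0, None)
    return float(np.prod(sides))


def overlap_score(a, b):
    """Fraction of box a's volume that lies inside box b: vol(a ∩ b) / vol(a).

    Assumption: hard intersection volumes, used here purely for illustration.
    """
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    # The intersection of two axis-aligned boxes is itself an axis-aligned box.
    inter_lo = np.maximum(a_lo, b_lo)
    inter_hi = np.minimum(a_hi, b_hi)
    vol_a = box_volume(a_lo, a_hi)
    if vol_a == 0.0:
        return 0.0
    return box_volume(inter_lo, inter_hi) / vol_a


# Hypothetical boxes: a broad task that fully contains a narrower one.
birds = (np.array([0.0, 0.0]), np.array([4.0, 4.0]))
sparrows = (np.array([1.0, 1.0]), np.array([2.0, 3.0]))

child_in_parent = overlap_score(sparrows, birds)  # 1.0: sparrows lies entirely inside birds
parent_in_child = overlap_score(birds, sparrows)  # 0.125: only 2 of 16 volume units overlap
```

The two scores differ because each is normalized by the volume of its first argument. This is the asymmetry that a symmetric Euclidean distance between point embeddings cannot express.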
