Riemannian Geometric-based Meta Learning
JuneYoung Park, YuMi Lee, Tae-Joon Kim, Jang-Hwan Choi
TL;DR
Stiefel-MAML optimizes meta-learning parameters on the Stiefel manifold $St(n,p)$ to enforce orthogonality constraints and better reflect the geometry of the loss surface, using Riemannian gradient descent with retraction and a kernel-based loss defined on the manifold. The method shows consistent improvements over MAML and First-Order MAML on 1- and 5-shot tasks on Omniglot, Mini-ImageNet, FC-100, and CUB, including in cross-domain scenarios and with ResNet-50 backbones, at only modest computational overhead. Gradient analysis indicates steeper adaptation trajectories, suggesting faster convergence to task-specific optima in the curved parameter space. Overall, the work highlights the practical value of non-Euclidean optimization in meta-learning and motivates further exploration of geometric structures to enhance rapid adaptation with limited data.
Abstract
Meta-learning, or "learning to learn," aims to enable models to quickly adapt to new tasks with minimal data. While traditional methods like Model-Agnostic Meta-Learning (MAML) optimize parameters in Euclidean space, they often struggle to capture complex learning dynamics, particularly in few-shot learning scenarios. To address this limitation, we propose Stiefel-MAML, which integrates Riemannian geometry by optimizing within the Stiefel manifold, a space that naturally enforces orthogonality constraints. By leveraging the geometric structure of the Stiefel manifold, we improve parameter expressiveness and enable more efficient optimization through Riemannian gradient calculations and retraction operations. We also introduce a novel kernel-based loss function defined on the Stiefel manifold, further enhancing the model's ability to explore the parameter space. Experimental results on benchmark datasets, including Omniglot, Mini-ImageNet, FC-100, and CUB, demonstrate that Stiefel-MAML consistently outperforms traditional MAML across various few-shot learning tasks. Our findings highlight the potential of Riemannian geometry to enhance meta-learning, paving the way for future research on optimizing over different geometric structures.
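To make the "Riemannian gradient plus retraction" step concrete, the following is a minimal NumPy sketch of one update on $St(n,p) = \{X \in \mathbb{R}^{n \times p} : X^\top X = I_p\}$. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the tangent-space projection assumes the metric inherited from the embedding in $\mathbb{R}^{n \times p}$, and the QR-based retraction is one common choice that the text above does not confirm as the one used in Stiefel-MAML. The kernel-based loss is not reproduced here; a random matrix stands in for the Euclidean gradient.

```python
import numpy as np


def project_to_tangent(X, G):
    """Project a Euclidean gradient G onto the tangent space of St(n, p) at X.

    Assumes the embedded (Euclidean) metric: xi = G - X * sym(X^T G).
    """
    XtG = X.T @ G
    sym = 0.5 * (XtG + XtG.T)
    return G - X @ sym


def qr_retraction(Y):
    """Map a full-rank n x p matrix back onto St(n, p) via a thin QR factorization."""
    Q, R = np.linalg.qr(Y)
    # Flip column signs so that R has a positive diagonal, which makes the
    # factorization (and therefore the retraction) unique.
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs


def riemannian_sgd_step(X, G, lr):
    """One Riemannian gradient step: project, move in the tangent direction, retract."""
    xi = project_to_tangent(X, G)
    return qr_retraction(X - lr * xi)


# Illustrative usage with arbitrary shapes and a stand-in gradient (assumptions).
rng = np.random.default_rng(0)
X0 = qr_retraction(rng.standard_normal((6, 3)))  # a point on St(6, 3)
G0 = rng.standard_normal((6, 3))                 # placeholder Euclidean gradient
X1 = riemannian_sgd_step(X0, G0, lr=0.1)
orthogonality_error = np.linalg.norm(X1.T @ X1 - np.eye(3))
```

In a MAML-style inner loop, a step of this kind would replace the plain update $\theta \leftarrow \theta - \alpha \nabla_\theta \mathcal{L}$ for the parameters constrained to the manifold, so that they remain orthonormal after every adaptation step. Here `orthogonality_error` measures how far the updated point is from satisfying $X^\top X = I_p$.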
