Riemannian Geometric-based Meta Learning

JuneYoung Park, YuMi Lee, Tae-Joon Kim, Jang-Hwan Choi

TL;DR

Stiefel-MAML proposes optimizing meta-learning parameters on the Stiefel manifold $St(n,p)$ to enforce orthogonality constraints and better reflect the geometry of the loss surface, using Riemannian gradient descent with retraction and a kernel-based loss on the manifold. The method demonstrates consistent improvements over MAML and First-Order MAML across 1- and 5-shot tasks on Omniglot, Mini-ImageNet, FC-100, and CUB, including cross-domain scenarios and compatibility with ResNet-50 backbones, with only modest computational overhead. Gradient analysis indicates steeper adaptation trajectories, suggesting faster convergence to task-specific optima in the curved parameter space. Overall, the work highlights the practical value of non-Euclidean optimization in meta-learning and motivates further exploration of geometric structures to enhance rapid adaptation with limited data.

Abstract

Meta-learning, or "learning to learn," aims to enable models to quickly adapt to new tasks with minimal data. While traditional methods like Model-Agnostic Meta-Learning (MAML) optimize parameters in Euclidean space, they often struggle to capture complex learning dynamics, particularly in few-shot learning scenarios. To address this limitation, we propose Stiefel-MAML, which integrates Riemannian geometry by optimizing within the Stiefel manifold, a space that naturally enforces orthogonality constraints. By leveraging the geometric structure of the Stiefel manifold, we improve parameter expressiveness and enable more efficient optimization through Riemannian gradient calculations and retraction operations. We also introduce a novel kernel-based loss function defined on the Stiefel manifold, further enhancing the model's ability to explore the parameter space. Experimental results on benchmark datasets--including Omniglot, Mini-ImageNet, FC-100, and CUB--demonstrate that Stiefel-MAML consistently outperforms traditional MAML, achieving superior performance across various few-shot learning tasks. Our findings highlight the potential of Riemannian geometry to enhance meta-learning, paving the way for future research on optimizing over different geometric structures.
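
To make the "Riemannian gradient calculations and retraction operations" concrete, the following is a minimal NumPy sketch of one gradient step on the Stiefel manifold $St(n,p) = \{X \in \mathbb{R}^{n \times p} : X^\top X = I_p\}$. It is an illustration under stated assumptions, not the authors' implementation: this summary does not specify which metric or retraction the paper uses, so the sketch assumes the embedded (Euclidean) metric for the tangent projection and a QR-based retraction, both common choices. The function names `project_to_tangent`, `qr_retraction`, and `stiefel_step` are hypothetical, and the kernel-based loss is not reproduced here.

```python
import numpy as np


def project_to_tangent(X, G):
    """Project a Euclidean gradient G onto the tangent space of St(n, p) at X.

    Assumption: embedded (Euclidean) metric, for which the projection is
    G - X * sym(X^T G). The paper may use a different metric.
    """
    XtG = X.T @ G
    sym = 0.5 * (XtG + XtG.T)
    return G - X @ sym


def qr_retraction(Y):
    """Map a full-rank n x p matrix Y back onto St(n, p) via its QR factor.

    Assumption: QR retraction; other retractions (polar, Cayley) also exist.
    """
    Q, R = np.linalg.qr(Y)  # reduced QR: Q is n x p, R is p x p
    # Flip column signs so that diag(R) is non-negative, which makes the
    # factorization (and therefore the retraction) unique.
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs


def stiefel_step(X, G, lr):
    """One Riemannian gradient-descent step: project, move, retract."""
    xi = project_to_tangent(X, G)       # Riemannian gradient at X
    return qr_retraction(X - lr * xi)   # step in the tangent space, then retract
```

In a MAML-style inner loop, `G` would be the Euclidean gradient of the task loss with respect to an orthonormal weight matrix `X`, and `stiefel_step` would replace the usual update `X - lr * G`, so the adapted parameters keep orthonormal columns after every step.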

Paper Structure

This paper contains 22 sections, 4 equations, 2 figures, 4 tables, 1 algorithm.

Figures (2)

  • Figure 1: Diagram of the Stiefel-MAML algorithm, which adapts the traditional MAML method to a Riemannian manifold, enabling more efficient navigation and optimization on a fixed manifold (M).
  • Figure 2: Gradient norm results for each task, comparing Stiefel-MAML (S-MAML) with MAML. Higher gradient norms observed in S-MAML indicate steeper movements along the loss surface, suggesting more effective task adaptation. x-axis: adaptation steps; y-axis: gradient norm.