Path-Decoupled Hyperbolic Flow Matching for Few-Shot Adaptation

Lin Li, Ziqi Jiang, Gefan Ye, Zhenqi He, Jiahui Li, Jun Xiao, Kwang-Ting Cheng, Long Chen

TL;DR

This work argues that Euclidean-based FM overlooks fundamental limitations of flat geometry, where polynomial volume growth fails to accommodate diverse feature distributions, leading to severe path entanglement, and proposes path-decoupled Hyperbolic Flow Matching, leveraging the Lorentz manifold's exponential expansion for trajectory decoupling.

Abstract

Recent advances in cross-modal few-shot adaptation treat visual-semantic alignment as a continuous feature transport problem via Flow Matching (FM). However, we argue that Euclidean-based FM overlooks fundamental limitations of flat geometry, where polynomial volume growth fails to accommodate diverse feature distributions, leading to severe path entanglement. To this end, we propose path-decoupled Hyperbolic Flow Matching (HFM), which leverages the Lorentz manifold's exponential expansion for trajectory decoupling. HFM structures the transport via two key designs: 1) Centripetal hyperbolic alignment: it constructs a centripetal hierarchy by anchoring textual roots, which pushes visual leaves to the boundary to initialize orderly flows. 2) Path-decoupled objective: it acts as a "semantic guardrail", rigidly confining trajectories within isolated class-specific geodesic corridors via step-wise supervision. Furthermore, we devise an adaptive diameter-based stopping criterion that uses the intrinsic semantic scale to prevent over-transportation into the crowded origin. Extensive ablations on 11 benchmarks show that HFM establishes a new state-of-the-art, consistently outperforming its Euclidean counterparts. Our code and models will be released.
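
As a brief illustration of the volume-growth argument, the standard formulas for the volume of a geodesic ball of radius $r$ can be compared in $n$-dimensional Euclidean space and in $n$-dimensional hyperbolic space. The abstract does not state a curvature, so curvature $-1$ is assumed here, and the symbols $V_{\mathbb{E}}$, $V_{\mathbb{H}}$, and $\mathbb{L}^n$ are notation chosen for this sketch rather than taken from the paper.

$$
V_{\mathbb{E}}(r) = \frac{\pi^{n/2}}{\Gamma\!\left(\tfrac{n}{2}+1\right)}\, r^{n},
\qquad
V_{\mathbb{H}}(r) = \frac{2\pi^{n/2}}{\Gamma\!\left(\tfrac{n}{2}\right)} \int_{0}^{r} \sinh^{n-1}(s)\, ds
\;\sim\; \frac{2\pi^{n/2}}{\Gamma\!\left(\tfrac{n}{2}\right)} \cdot \frac{e^{(n-1)r}}{(n-1)\,2^{\,n-1}}
\quad (r \to \infty).
$$

The Euclidean volume grows polynomially in $r$, while the hyperbolic volume grows exponentially. In the Lorentz (hyperboloid) model this space is realized as

$$
\mathbb{L}^n = \left\{ x \in \mathbb{R}^{n+1} : \langle x, x \rangle_{\mathcal{L}} = -1,\; x_0 > 0 \right\},
\qquad
\langle x, y \rangle_{\mathcal{L}} = -x_0 y_0 + \sum_{i=1}^{n} x_i y_i .
$$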

Paper Structure

This paper contains 12 sections, 10 equations, 3 figures, 4 tables, and 2 algorithms.

Figures (3)

  • Figure 1: Illustration of Path Entanglement. (a) Euclidean Flow Matching suffers from severe trajectory collisions (e.g., "cat" intersecting "tiger" and "dog" merging with "cat") due to the limited polynomial capacity of flat geometry. (b) Hyperbolic Flow Matching uses exponential volume expansion to achieve path decoupling.
  • Figure 2: The overview of HFM. (a) Constructing Centripetal Hyperbolic Space: Establish a centripetal cross-modal hierarchy, optimizing textual roots near the origin and visual leaves toward the boundary. (b) Learning Path-Decoupled Flows: Tangent velocity fields $\mathcal{F}_\theta$ are optimized to guide features along isolated geodesic corridors. The step-wise consistency enforces trajectory decoupling in hyperbolic space. (c) Inference with Diameter-based Stopping: The flow terminates at $t^*$ once the distance to text prototypes drops below a dynamic threshold scaled by semantic diameter $d_{\text{txt}}$ to alleviate over-transportation into the crowded origin.
  • Figure 3: Qualitative Results. Visualization of transport trajectories. Top: Euclidean flows suffer from severe path entanglement, exhibiting chaotic and intersecting paths due to spatial crowding. Bottom: HFM achieves path decoupling via a centripetal hierarchy. Visual features move centripetally from the boundary to central text roots along isolated geodesic corridors.
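
The diameter-based stopping rule described in Figure 2(c) can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: curvature is fixed at $-1$, $d_{\text{txt}}$ is taken to be the largest pairwise geodesic distance among text prototypes, and the threshold is `alpha * d_txt` with a hypothetical scale factor `alpha`. All function names are illustrative.

```python
import math


def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def lift_to_hyperboloid(v):
    """Exponential map at the origin (curvature -1, an assumption).

    Maps a Euclidean vector v onto the hyperboloid; the result lies at
    geodesic distance ||v|| from the origin (1, 0, ..., 0).
    """
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        return [1.0] + [0.0] * len(v)
    scale = math.sinh(norm) / norm
    return [math.cosh(norm)] + [scale * a for a in v]


def lorentz_distance(x, y):
    """Geodesic distance arccosh(-<x, y>_L), clamped for numerical safety."""
    return math.acosh(max(1.0, -lorentz_inner(x, y)))


def semantic_diameter(text_protos):
    """Assumed definition of d_txt: largest pairwise prototype distance."""
    return max(lorentz_distance(p, q) for p in text_protos for q in text_protos)


def stopping_step(trajectory, text_protos, alpha=0.5):
    """Index of the first state within alpha * d_txt of its nearest prototype.

    `alpha` is a hypothetical scale factor; falls back to the last step
    if the threshold is never reached.
    """
    threshold = alpha * semantic_diameter(text_protos)
    for step, state in enumerate(trajectory):
        nearest = min(lorentz_distance(state, p) for p in text_protos)
        if nearest < threshold:
            return step
    return len(trajectory) - 1


# Toy example: two text prototypes near the origin, one visual feature
# moving inward from the boundary along a single geodesic.
text_protos = [lift_to_hyperboloid([0.5, 0.0]), lift_to_hyperboloid([-0.5, 0.0])]
trajectory = [lift_to_hyperboloid([r, 0.0]) for r in (2.0, 1.5, 1.2, 0.9, 0.6)]
t_star = stopping_step(trajectory, text_protos, alpha=0.5)
```

In this toy setup the trajectory is a precomputed list of states; in an actual flow the states would come from integrating the learned tangent velocity field $\mathcal{F}_\theta$, which is outside the scope of this sketch. Stopping at the first state below the threshold keeps the feature from being carried further toward the origin, where prototypes of different classes sit close together.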