OmniXtreme: Breaking the Generality Barrier in High-Dynamic Humanoid Control

Yunshen Wang; Shaohang Zhu; Peiyuan Zhi; Yuhan Li; Jiaxin Li; Yong-Lu Li; Yuchen Xiao; Xingxing Wang; Baoxiong Jia; Siyuan Huang

TL;DR

OmniXtreme is a scalable framework that decouples general motor skill learning from sim-to-real physical skill refinement: a flow-matching policy is pretrained first, and an actuation-aware refinement phase then ensures robust performance on physical hardware.

Abstract

High-fidelity motion tracking serves as the ultimate litmus test for generalizable, human-level motor skills. However, current policies often hit a "generality barrier": as motion libraries scale in diversity, tracking fidelity inevitably collapses, especially in real-world deployment of high-dynamic motions. We identify this failure as the result of two compounding factors: the learning bottleneck in scaling multi-motion optimization and the physical executability constraints that arise in real-world actuation. To overcome these challenges, we introduce OmniXtreme, a scalable framework that decouples general motor skill learning from sim-to-real physical skill refinement. Our approach uses a flow-matching policy with high-capacity architectures to scale representation capacity without interference-intensive multi-motion RL optimization, followed by an actuation-aware refinement phase that ensures robust performance on physical hardware. Extensive experiments demonstrate that OmniXtreme maintains high-fidelity tracking across diverse, high-difficulty datasets. On real robots, the unified policy successfully executes multiple extreme motions, effectively breaking the long-standing fidelity-scalability trade-off in high-dynamic humanoid control.
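To make the term "flow-matching policy" concrete, the following is a minimal sketch of a generic linear-path conditional flow-matching objective and an Euler sampler. It is not taken from the paper: the function names (`flow_matching_targets`, `flow_matching_loss`, `sample_action`), the linear noise-to-action path, and the `velocity_fn(obs, x, t)` interface are all illustrative assumptions, and the paper's actual formulation may differ.

```python
import numpy as np

def flow_matching_targets(expert_action, noise, t):
    """Assumed linear path: x_t moves from noise (t=0) to the expert action (t=1)."""
    x_t = (1.0 - t) * noise + t * expert_action
    target_velocity = expert_action - noise  # constant velocity along the path
    return x_t, target_velocity

def flow_matching_loss(velocity_fn, obs, expert_action, rng):
    """Mean squared error between predicted and target velocity."""
    noise = rng.standard_normal(expert_action.shape)
    t = rng.uniform(size=(expert_action.shape[0], 1))  # one time per sample
    x_t, target = flow_matching_targets(expert_action, noise, t)
    pred = velocity_fn(obs, x_t, t)  # hypothetical policy network interface
    return float(np.mean((pred - target) ** 2))

def sample_action(velocity_fn, obs, action_dim, rng, steps=10):
    """Generate an action by Euler-integrating the velocity field from t=0 to t=1."""
    x = rng.standard_normal((obs.shape[0], action_dim))
    dt = 1.0 / steps
    for k in range(steps):
        t = np.full((obs.shape[0], 1), k * dt)
        x = x + dt * velocity_fn(obs, x, t)
    return x
```

In a DAgger-style setup such as the one the Figure 2 caption mentions, `expert_action` would come from querying motion-tracking experts on states visited by the learner; that data-collection loop is omitted here.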

Paper Structure (57 sections, 24 equations, 5 figures, 12 tables, 1 algorithm)

Introduction
Related Work
Humanoid Whole-body Control and General Tracking
Diffusion and Flow-based Action Modeling for Robotic Planning and Control
Actuation-aware Agile Robotic Control
Methodology
Scalable Flow-based Policy Pretraining
Problem Formulation
Expert Policy Learning
Flow-matching Policy Learning
Fidelity-Preserving Randomization and Noise
Actuation-Aware Post-training Phase
Residual Policy Modeling
Actuation-aware Physical Constraint Modeling
Aggressive Domain Randomization
...and 42 more sections

Figures (5)

Figure 1: Extreme whole-body humanoid control from our unified policy OmniXtreme. (a) A quantitative comparison shows that our curated extreme motion libraries occupy substantially more challenging regimes than standard multi-motion benchmarks (e.g., Unitree-retargeted LAFAN1). Real-world executions of our unified policy OmniXtreme demonstrate robust and physically executable extreme behaviors drawn from this motion set, including (b) extreme balance behaviors, (c) rapid contact switching with complex support transitions, (d) high-speed motions with large angular velocities, and (e) diverse whole-body behaviors spanning qualitatively distinct motion styles.
Figure 2: Overview of OmniXtreme. (a) Pretraining phase: A unified base policy is trained via DAgger-based Flow Matching to aggregate diverse motion priors from different motion tracking experts. (b) Post-training phase: The base policy is frozen while a residual policy is optimized under stringent motor constraints, extensive domain randomization, and power-safety regularization to bridge the sim-to-real gap. (c) Onboard deployment: The whole inference pipeline is real-time and executed entirely onboard, facilitating robust and agile control in physical environments.
Figure 3: Fidelity-scalability trade-off. Tracking success rate as we progressively scale motion diversity and difficulty, while evaluating all policies on a fixed set of the first 10 motions.
Figure 4: Capacity scaling. Tracking fidelity and robustness as a function of model capacity. OmniXtreme benefits more strongly from scaling, while conventional MLP controllers saturate earlier.
Figure 5: Qualitative results. Representative real-world rollouts produced by OmniXtreme, executing qualitatively distinct whole-body motions across diverse styles and contact patterns, including flips, acrobatics, breakdance, and martial-arts behaviors. The results illustrate stable and coordinated execution under rapid contact transitions and timing-sensitive phases on physical hardware.
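
The post-training structure described in the Figure 2(b) caption, a frozen base policy plus a trainable residual, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the class name `ResidualController`, the additive composition, and the `residual_scale` parameter are hypothetical, and the motor constraints, domain randomization, and power-safety regularization named in the caption are not modeled.

```python
import numpy as np

class ResidualController:
    """Combine a frozen base policy with a small learned correction (assumed additive)."""

    def __init__(self, base_policy, residual_policy, residual_scale=0.1):
        self.base_policy = base_policy          # frozen: not updated during post-training
        self.residual_policy = residual_policy  # the only part that would be optimized
        self.residual_scale = residual_scale    # hypothetical knob bounding the correction

    def act(self, obs):
        base_action = np.asarray(self.base_policy(obs), dtype=float)
        correction = np.asarray(self.residual_policy(obs), dtype=float)
        return base_action + self.residual_scale * correction
```

Keeping the correction small relative to the base action is one common way to preserve pretrained behavior while adapting to hardware; whether OmniXtreme scales its residual this way is not stated in the text above.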