HOFAR: High-Order Augmentation of Flow Autoregressive Transformers
Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Mingda Wan
TL;DR
Problem: improving fidelity and long-range coherence in flow-based autoregressive image generation. Approach: HOFAR adds high-order trajectory supervision to FlowAR, supported by a theoretical efficiency analysis and empirical validation. Contributions: a formal framework for high-order dynamics, a complexity bound of $O(k m n^4 d^2)$, and CIFAR-10 experiments showing improved realism and coherence over FlowAR baselines. Significance: the results suggest that high-order trajectory modeling can make generation more realistic and coherent, and that it may extend to multi-modal settings and other generative systems.
Abstract
Flow Matching and Transformer architectures have both demonstrated strong performance in image generation, and the recent FlowAR [Ren et al., 2024] combines the two paradigms to improve synthesis fidelity. However, current FlowAR implementations model only first-order trajectory dynamics during generation. This paper introduces a framework that enhances flow autoregressive transformers through high-order supervision. We provide theoretical analysis and empirical evaluation showing that our High-Order FlowAR (HOFAR) improves generation quality over baseline models. The approach also offers a systematic framework for analyzing trajectory dynamics through high-order expansion.
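To give some intuition for the difference between first-order and high-order trajectory modeling, the following toy sketch integrates a one-dimensional trajectory using either a velocity term alone or a velocity term plus an acceleration term (a second-order Taylor step). It is not the HOFAR implementation: the functions `velocity`, `acceleration`, and `integrate` are hypothetical names, and the closed-form fields for the ODE dx/dt = -x stand in for what would be learned networks.

```python
import math


def velocity(x, t):
    # Hypothetical stand-in for a learned first-order field: dx/dt = -x.
    return -x


def acceleration(x, t):
    # Hypothetical stand-in for a learned second-order field.
    # For dx/dt = -x, differentiating once more gives d2x/dt2 = x.
    return x


def integrate(x0, steps, order):
    """Integrate from t=0 to t=1 with a Taylor step of the given order."""
    x = x0
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        step = dt * velocity(x, t)  # first-order (Euler) term
        if order >= 2:
            step += 0.5 * dt ** 2 * acceleration(x, t)  # second-order term
        x += step
    return x


exact = math.exp(-1.0)  # closed-form solution at t=1 for x(0)=1
err_first = abs(integrate(1.0, 10, order=1) - exact)
err_second = abs(integrate(1.0, 10, order=2) - exact)
print(f"first-order error:  {err_first:.6f}")
print(f"second-order error: {err_second:.6f}")
```

With the same number of steps, the second-order update tracks the exact trajectory more closely on this toy problem, which is the kind of effect that supervising higher-order terms is meant to exploit.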
