Symbolic Representation for Any-to-Any Generative Tasks
Jiaqi Chen, Xiaoye Zhu, Yue Wang, Tianyang Liu, Xinhui Chen, Ying Chen, Chak Tou Leong, Yifei Ke, Joseph Liu, Yiwen Yuan, Julian McAuley, Li-jia Li
TL;DR
The paper addresses any-to-any generative tasks across modalities without task-specific training. It introduces A-Language, a symbolic representation that decomposes each task into functions, parameters, and topology, together with a training-free, LM-driven inference engine that maps natural language instructions to executable symbolic workflows. Experiments on 120 real-world tasks and ComfyBench show performance competitive with state-of-the-art unified models, along with better editability, interpretability, and efficiency. The authors argue that explicit symbolic task representations are a cost-effective, extensible foundation for cross-modal generative AI, and that robust topology construction and iterative refinement are key to practical deployment.
Abstract
We propose a symbolic generative task description language and a corresponding inference engine capable of representing arbitrary multimodal tasks as structured symbolic flows. Unlike conventional generative models that rely on large-scale training and implicit neural representations to learn cross-modal mappings, often at high computational cost and with limited flexibility, our framework introduces an explicit symbolic representation comprising three core primitives: functions, parameters, and topological logic. Leveraging a pre-trained language model, our inference engine maps natural language instructions directly to symbolic workflows in a training-free manner. Our framework successfully performs over 12 diverse multimodal generative tasks, demonstrating strong performance and flexibility without the need for task-specific tuning. Experiments show that our method not only matches or outperforms existing state-of-the-art unified models in content quality, but also offers greater efficiency, editability, and interruptibility. We believe that symbolic task representations provide a cost-effective and extensible foundation for advancing the capabilities of generative AI.
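To make the three primitives concrete, here is a minimal Python sketch of a symbolic workflow, assuming a plain directed-acyclic-graph encoding. It is an illustration only and not the paper's A-Language syntax or inference engine: the `Node` and `Workflow` classes, the `REGISTRY` table, and the stub functions `load_text`, `text_to_image`, and `image_to_video` are all hypothetical names, and the stubs return strings instead of calling real generative models.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Node:
    """One step of a workflow: a function name plus its fixed parameters."""
    function: str
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Workflow:
    """Functions and parameters live in `nodes`; topology lives in `edges`."""
    nodes: Dict[str, Node]
    # Each edge is (source node id, target node id, target input name).
    edges: List[Tuple[str, str, str]]

    def run(self, registry: Dict[str, Callable[..., Any]]) -> Dict[str, Any]:
        # Map every node to the set of nodes it depends on.
        deps = {node_id: set() for node_id in self.nodes}
        for src, dst, _ in self.edges:
            deps[dst].add(src)

        outputs: Dict[str, Any] = {}
        # static_order() yields each node after all of its dependencies
        # and raises graphlib.CycleError if the topology contains a cycle.
        for node_id in TopologicalSorter(deps).static_order():
            node = self.nodes[node_id]
            inputs = {
                name: outputs[src]
                for src, dst, name in self.edges
                if dst == node_id
            }
            outputs[node_id] = registry[node.function](**inputs, **node.params)
        return outputs


# Hypothetical stand-ins for real generative models (assumption: strings only).
REGISTRY: Dict[str, Callable[..., Any]] = {
    "load_text": lambda text: text,
    "text_to_image": lambda prompt, steps=20: f"image({prompt!r}, steps={steps})",
    "image_to_video": lambda image, frames=16: f"video({image}, frames={frames})",
}

# A text-to-image-to-video task written as functions, parameters, and topology.
workflow = Workflow(
    nodes={
        "prompt": Node("load_text", {"text": "a red fox"}),
        "img": Node("text_to_image", {"steps": 30}),
        "vid": Node("image_to_video", {"frames": 8}),
    },
    edges=[("prompt", "img", "prompt"), ("img", "vid", "image")],
)

results = workflow.run(REGISTRY)
print(results["vid"])
```

Because the task is explicit data, editing it amounts to changing one field and rerunning: for example, setting `workflow.nodes["vid"].params["frames"] = 24` changes only the final step, which is the kind of editability the abstract attributes to symbolic representations.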
