Cliqueformer: Model-Based Optimization with Structured Transformers
Jakub Grudzien Kuba, Pieter Abbeel, Sergey Levine
TL;DR
This work addresses offline model-based optimization by exploiting the target function's structure through functional graphical models (FGMs). It introduces Cliqueformer, a transformer-based architecture that enforces a predefined FGM clique decomposition in a learned latent space and regularizes the clique marginals with a variational information bottleneck, so that design proposals stay robust. On Latent Radial-Basis Functions, superconductors, TFBind-8, and DNA Enhancers, Cliqueformer achieves state-of-the-art performance without explicit conservative penalties and generalizes well to high-dimensional design tasks. By combining an FGM-structured latent representation with scalable transformer components, the approach offers a practical route to applying deep models to complex design problems in chemistry and biology while mitigating distribution shift.
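The two ingredients named above, a prediction that decomposes over latent cliques and an information-bottleneck penalty on each clique's marginal, can be illustrated with a small sketch. It is not the authors' implementation: the chain layout of overlapping cliques, the `overlap` parameter, the toy quadratic clique functions, and all function names are assumptions made for illustration, and a diagonal Gaussian posterior with a standard normal prior is assumed for the penalty.

```python
import math

def make_cliques(n_cliques, clique_size, overlap):
    """Hypothetical chain layout: neighbouring cliques share `overlap` latent dims."""
    stride = clique_size - overlap
    return [list(range(i * stride, i * stride + clique_size)) for i in range(n_cliques)]

def decomposed_prediction(z, cliques, clique_fns):
    """f(z) = sum over cliques C of f_C(z_C); each term sees only its own coordinates."""
    return sum(fn([z[i] for i in idx]) for idx, fn in zip(cliques, clique_fns))

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for a diagonal Gaussian."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))

def vib_penalty(mu, log_var, cliques):
    """Average the KL term over clique marginals rather than over the full latent."""
    per_clique = [
        kl_to_standard_normal([mu[i] for i in idx], [log_var[i] for i in idx])
        for idx in cliques
    ]
    return sum(per_clique) / len(per_clique)

# Toy usage: 3 cliques of size 3 sharing 1 dimension, so the latent has 7 dims.
cliques = make_cliques(n_cliques=3, clique_size=3, overlap=1)
clique_fns = [lambda zc: -sum(v * v for v in zc)] * 3  # stand-in for learned networks
z = [0.5, -1.0, 0.0, 1.0, 2.0, 0.0, -0.5]

y_hat = decomposed_prediction(z, cliques, clique_fns)          # -10.5
penalty = vib_penalty(mu=z, log_var=[0.0] * 7, cliques=cliques)  # 1.75
```

In a trained model each `clique_fns` entry would be a learned network and `z` would come from an encoder; the sketch only shows how the decomposition restricts each term to its own clique and how the penalty is computed per clique marginal.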
Abstract
Large neural networks excel at prediction tasks, but their application to design problems, such as protein engineering or materials discovery, requires solving offline model-based optimization (MBO) problems. Because accurate predictive models do not directly translate into effective design, recent MBO algorithms incorporate reinforcement learning and generative modeling approaches. Meanwhile, theoretical work suggests that exploiting the target function's structure can enhance MBO performance. We present Cliqueformer, a transformer-based architecture that learns the black-box function's structure through functional graphical models (FGMs), addressing distribution shift without relying on explicit conservative approaches. Across various domains, including chemical and genetic design tasks, Cliqueformer demonstrates superior performance compared to existing methods.
