Unsupervised Discovery of Steerable Factors When Graph Deep Generative Models Are Entangled
Shengchao Liu, Chengpeng Wang, Jiarui Lu, Weili Nie, Hanchen Wang, Zhuoxinran Li, Bolei Zhou, Jian Tang
TL;DR
The paper tackles unsupervised graph editing for pretrained graph deep generative models, identifying entangled latent spaces as a barrier to controllable graph generation. It introduces GraphCG, a model-agnostic framework that learns semantic directions by maximizing mutual information within an energy-based model (EBM), producing sequences of edited graphs without labeled factors. Key contributions include a formal MI-based learning objective, an EBM-based implementation with a noise-contrastive estimation variant (GraphCG-NCE), quantitative experiments on molecular graphs, and qualitative demonstrations of seven steerable factors across molecule and point cloud datasets. The work offers a practical pathway for controllable graph generation without supervision, enabling editing of molecular structures and 3D shapes for applications in chemistry and geometry processing.
Abstract
Deep generative models (DGMs) have been widely developed for graph data. However, the latent space of such pretrained graph DGMs has received much less investigation. Understanding it could provide constructive guidance for crucial tasks such as controllable graph generation. In this work, we study this problem and propose GraphCG, a method for the unsupervised discovery of steerable factors in the latent space of pretrained graph DGMs. We first examine the representation space of three pretrained graph DGMs with six disentanglement metrics, and we observe that the pretrained representation space is entangled. Motivated by this observation, GraphCG learns the steerable factors by maximizing the mutual information between semantic-rich directions, so that graphs edited along the same direction share the same steerable factors. We quantitatively verify that GraphCG outperforms four competitive baselines on two graph DGMs pretrained on two molecule datasets. Additionally, we qualitatively illustrate seven steerable factors learned by GraphCG on five pretrained DGMs over five graph datasets, including two for molecules and three for point clouds.
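
The idea of editing along a latent direction, and of treating edits that share a direction as positives, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`edit_sequence`, `same_direction_nce_loss`), the InfoNCE-style form of the loss, the temperature value, and the use of raw latent displacements as features are all assumptions made for illustration. In GraphCG the directions are learned and the scores come from an energy-based model.

```python
import numpy as np

def edit_sequence(z, direction, step_sizes):
    """Move one latent code z along a unit-normalised direction.

    Returns an array of shape (len(step_sizes), dim), one edited code per step.
    """
    d = direction / np.linalg.norm(direction)
    return np.stack([z + a * d for a in step_sizes])

def same_direction_nce_loss(features, direction_ids, temperature=0.1):
    """Hypothetical InfoNCE-style loss: features that share a direction id
    are positives, all other features are negatives.

    Every direction id must occur at least twice.
    """
    direction_ids = np.asarray(direction_ids)
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = x @ x.T / temperature
    np.fill_diagonal(logits, -np.inf)  # a sample is never its own positive
    # Row-wise log-softmax, stabilised by subtracting the row maximum.
    m = logits.max(axis=1, keepdims=True)
    log_prob = logits - m - np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    positives = direction_ids[:, None] == direction_ids[None, :]
    np.fill_diagonal(positives, False)
    # Mean log-probability of the positives for each anchor.
    per_anchor = np.where(positives, log_prob, 0.0).sum(axis=1) / positives.sum(axis=1)
    return -per_anchor.mean()

# Toy usage: two fixed directions applied to one random latent code.
rng = np.random.default_rng(0)
z = rng.normal(size=8)
d0, d1 = np.eye(8)[0], np.eye(8)[1]
steps = [1.0, 2.0, 3.0]
seq0 = edit_sequence(z, d0, steps)
seq1 = edit_sequence(z, d1, steps)
# Displacements stand in for features a learned network would produce.
feats = np.concatenate([seq0 - z, seq1 - z])
loss = same_direction_nce_loss(feats, [0, 0, 0, 1, 1, 1])
```

In this toy setting the loss is small when the direction ids match the directions actually used for editing and grows when the ids are scrambled, which is the signal a direction-learning objective of this kind relies on. Decoding each row of `seq0` with the pretrained DGM's decoder (not shown) would give the sequence of edited graphs.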
