Learning Disentangled Semantic Spaces of Explanations via Invertible Neural Networks
Yingji Zhang, Danilo S. Carvalho, André Freitas
TL;DR
The paper tackles the challenge of disentangling sentence semantics to enable localized and controllable generation. It adds a flow-based invertible neural network (INN) to a frozen transformer-based Autoencoder to map sentence representations into a smooth Gaussian latent space. Two training regimes are considered: unsupervised, and cluster-supervised, in which training is guided by semantic role-content clusters derived from AST. Geometric data augmentation reinforces separability, enabling precise interpolation and retrieval of role-content while preserving predicate-argument structure. Empirical results show that the cluster-supervised INN yields better disentanglement and more controllable generation than prior language VAE models, with high invertibility and smoother latent traversals. This work bridges distributional and formal semantics in NLP and opens avenues for safer, interpretable, and semantically controllable generation in explanation-centric tasks.
Abstract
Disentangled latent spaces usually have better semantic separability and geometrical properties, which leads to better interpretability and more controllable data generation. While this has been well investigated in Computer Vision, in tasks such as image disentanglement, sentence disentanglement in the NLP domain is still comparatively under-investigated. Most previous work has concentrated on disentangling task-specific generative factors, such as sentiment, within the context of style transfer. In this work, we focus on a more general form of sentence disentanglement, targeting the localised modification and control of more general sentence semantic features. To achieve this, we contribute a novel notion of sentence semantic disentanglement and introduce a flow-based invertible neural network (INN) mechanism integrated with a transformer-based language Autoencoder (AE) in order to deliver latent spaces with better separability properties. Experimental results demonstrate that the model can reshape the distributed latent space into a better semantically disentangled sentence space, leading to improved language interpretability and controlled generation when compared to recent state-of-the-art language VAE models.
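To make the notion of a flow-based invertible mapping concrete, the following is a minimal sketch of a single affine coupling block, a common building block of flow-based models. It is an illustration only: the text above does not specify the INN architecture, so the coupling design, the tiny linear scale and shift functions, and all names (`make_params`, `forward`, `inverse`) are assumptions rather than the authors' implementation. The input vector stands in for a sentence embedding produced by the frozen Autoencoder.

```python
import math
import random


def make_params(dim, seed=0):
    """Random weights for the scale and shift functions (hypothetical, untrained)."""
    rng = random.Random(seed)
    half = dim // 2
    w_s = [[rng.gauss(0.0, 0.1) for _ in range(half)] for _ in range(half)]
    w_t = [[rng.gauss(0.0, 0.1) for _ in range(half)] for _ in range(half)]
    return w_s, w_t


def _scale_shift(x_a, params):
    """Compute log-scale and shift from the untouched half of the vector."""
    w_s, w_t = params
    # tanh keeps the log-scale bounded, so exp() below stays well behaved.
    log_s = [math.tanh(sum(w * v for w, v in zip(row, x_a))) for row in w_s]
    t = [sum(w * v for w, v in zip(row, x_a)) for row in w_t]
    return log_s, t


def forward(x, params):
    """Map x to z; the first half passes through, the second half is rescaled and shifted."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_s, t = _scale_shift(x_a, params)
    z_b = [b * math.exp(s) + sh for b, s, sh in zip(x_b, log_s, t)]
    log_det = sum(log_s)  # log |det Jacobian|, used in flow likelihood training
    return x_a + z_b, log_det


def inverse(z, params):
    """Exactly undo forward(); possible because the first half is unchanged."""
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    log_s, t = _scale_shift(z_a, params)
    x_b = [(b - sh) * math.exp(-s) for b, s, sh in zip(z_b, log_s, t)]
    return z_a + x_b


# Stand-in for a 4-dimensional sentence embedding (assumed values).
params = make_params(dim=4)
x = [0.5, -1.2, 0.3, 2.0]
z, log_det = forward(x, params)
x_rec = inverse(z, params)
max_err = max(abs(a - b) for a, b in zip(x, x_rec))
```

Because the inverse is computed in closed form, a point edited or interpolated in the latent space can be mapped back to the Autoencoder's representation without a separately trained decoder for the flow, which is the property that invertibility provides in a setup of this kind.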
