Controllable Unlearning for Image-to-Image Generative Models via $\varepsilon$-Constrained Optimization
Xiaohua Feng, Yuyuan Li, Chaochao Chen, Li Zhang, Longfei Li, Jun Zhou, Xiaolin Zheng
TL;DR
The paper tackles privacy- and bias-related concerns in Image-to-Image (I2I) generative models by proposing a controllable unlearning framework that treats unlearning as an $\varepsilon$-constrained optimization problem. It reformulates unlearning as a bi-objective problem (unlearning on the forget set vs. retaining utility on the retain set) and derives two boundary Pareto-optimal solutions, which bound the valid range of the control parameter $\varepsilon$ within which Pareto optimality holds. A gradient-based solver proceeds in two phases: Phase I identifies the unlearning boundaries, and Phase II traces a Pareto path by relaxing the constraint, with a control function $\psi(\theta)$ guiding convergence. The method is validated on MAE, VQ-GAN, and diffusion I2I models using ImageNet-1K and Places-365, achieving superior forgetting on the forget set while preserving retain-set performance, and enabling fine-grained control over the unlearning-utility trade-off. These results point to a scalable, theoretically grounded approach to customizable unlearning in I2I systems, with potential extensions to other generative domains.
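As a reading aid, the $\varepsilon$-constrained idea can be written in a generic form. The notation here is assumed for illustration and is not taken from the paper: $\mathcal{L}_f$ stands for the objective on the forget set, $\mathcal{L}_r$ for the loss on the retain set, and the choice of which objective is constrained is likewise an assumption.

$$
\min_{\theta} \; \mathcal{L}_f(\theta) \quad \text{s.t.} \quad \mathcal{L}_r(\theta) \le \varepsilon, \qquad \varepsilon \in [\varepsilon_{\min}, \varepsilon_{\max}]
$$

Under this reading, $\varepsilon_{\min} = \min_{\theta} \mathcal{L}_r(\theta)$ is the retain loss at the boundary solution that favors utility, and $\varepsilon_{\max} = \mathcal{L}_r(\theta_f^{*})$ is the retain loss at the boundary solution $\theta_f^{*}$ that minimizes $\mathcal{L}_f$. For $\varepsilon < \varepsilon_{\min}$ the problem is infeasible, and for $\varepsilon > \varepsilon_{\max}$ the constraint is inactive, so only values inside the interval change the trade-off.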
Abstract
While generative models have made significant advancements in recent years, they also raise concerns such as privacy breaches and biases. Machine unlearning has emerged as a viable solution, aiming to remove specific training data, e.g., data containing private information or bias, from models. In this paper, we study the machine unlearning problem in Image-to-Image (I2I) generative models. Previous studies mainly treat it as a single-objective optimization problem that yields a single solution, thereby neglecting users' varied expectations regarding the trade-off between complete unlearning and model utility. To address this issue, we propose a controllable unlearning framework that uses a control coefficient $\varepsilon$ to control the trade-off. We reformulate the I2I generative model unlearning problem as an $\varepsilon$-constrained optimization problem and solve it with a gradient-based method to find optimal solutions for the unlearning boundaries. These boundaries define the valid range for the control coefficient. Within this range, every solution obtained is theoretically guaranteed to be Pareto optimal. We also analyze the convergence rate of our framework under various control functions. Extensive experiments on two benchmark datasets across three mainstream I2I models demonstrate the effectiveness of our controllable unlearning framework.
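To make the role of the control coefficient concrete, the following is a minimal toy sketch, not the paper's solver. It assumes two hypothetical quadratic objectives in two dimensions (`f_forget` and `f_retain`, with minimizers `A` and `B`), constrains the retain objective by $\varepsilon$, and uses plain projected gradient descent in place of the paper's gradient-based method. In this toy setting the two boundary solutions are simply `B` and `A`, and the solution for a given $\varepsilon$ in the valid range lies on the segment between them at distance $\sqrt{\varepsilon}$ from `B`.

```python
import math

# Hypothetical stand-ins for the two objectives (assumptions, not the paper's losses).
A = (2.0, 0.0)  # minimizer of the toy "forget" objective
B = (0.0, 0.0)  # minimizer of the toy "retain" objective


def f_forget(theta):
    """Toy unlearning objective: squared distance to A."""
    return (theta[0] - A[0]) ** 2 + (theta[1] - A[1]) ** 2


def f_retain(theta):
    """Toy utility objective: squared distance to B."""
    return (theta[0] - B[0]) ** 2 + (theta[1] - B[1]) ** 2


def project_onto_constraint(theta, eps):
    """Project theta onto {theta : f_retain(theta) <= eps}, a ball of radius sqrt(eps) around B."""
    dx, dy = theta[0] - B[0], theta[1] - B[1]
    dist = math.hypot(dx, dy)
    radius = math.sqrt(eps)
    if dist <= radius:
        return theta
    scale = radius / dist
    return (B[0] + dx * scale, B[1] + dy * scale)


def solve_eps_constrained(eps, steps=500, lr=0.1, start=(0.5, 1.0)):
    """Minimize f_forget subject to f_retain <= eps with projected gradient descent."""
    theta = project_onto_constraint(start, eps)
    for _ in range(steps):
        grad = (2.0 * (theta[0] - A[0]), 2.0 * (theta[1] - A[1]))
        theta = (theta[0] - lr * grad[0], theta[1] - lr * grad[1])
        theta = project_onto_constraint(theta, eps)
    return theta


# Boundary values of eps in this toy problem: the retain loss at each boundary solution.
eps_min = f_retain(B)
eps_max = f_retain(A)

if __name__ == "__main__":
    # Sweep eps across the valid range and report the resulting trade-off.
    for k in range(5):
        eps = eps_min + (eps_max - eps_min) * k / 4
        theta = solve_eps_constrained(eps)
        print(f"eps={eps:.2f}  forget={f_forget(theta):.4f}  retain={f_retain(theta):.4f}")
```

Sweeping `eps` from `eps_min` to `eps_max` moves the solution from the utility-preserving boundary to the fully unlearned boundary, which is the kind of user-facing control the framework aims for. Values above `eps_max` leave the constraint inactive and return the same point as `eps_max`.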
