DREAM: Domain-agnostic Reverse Engineering Attributes of Black-box Model

Rongqing Li; Jiaqi Yu; Changsheng Li; Wenhan Luo; Ye Yuan; Guoren Wang

TL;DR

This work addresses the practical challenge of reverse engineering the attributes of a black-box model without access to the target model's training data. It introduces DREAM, a framework that reframes attribute inference as an out-of-distribution (OOD) generalization problem. DREAM uses a multi-discriminator GAN to learn domain-invariant features from the model's probability outputs, then applies a domain-agnostic reverse meta-model to predict the attributes. Empirical results on the PACS and MEDU modelsets show that DREAM outperforms baselines across CNN and ViT attribute spaces, including under domain shift and with larger attribute spaces, and demonstrate the approach's potential for model extraction and security analyses. The study highlights both the feasibility of domain-agnostic reverse engineering in practical machine-learning-as-a-service (MLaaS) settings and the need to consider defenses against attribute leakage.

Abstract

Deep learning models are usually black boxes when deployed on machine learning platforms. Prior works have shown that the attributes (e.g., the number of convolutional layers) of a target black-box model can be exposed through a sequence of queries. These works share a crucial limitation: they assume that the training dataset of the target model is known beforehand and leverage this dataset for the model attribute attack. In reality, however, it is difficult to access the training dataset of the target black-box model, so it is unclear whether the attributes of a target black-box model can still be revealed in this case. In this paper, we investigate a new problem of black-box reverse engineering that does not require the target model's training dataset. We put forward a general and principled framework, DREAM, by casting this problem as out-of-distribution (OOD) generalization. In this way, we can learn a domain-agnostic meta-model to infer the attributes of a target black-box model with unknown training data. As a result, our method can be applied to an arbitrary domain for model attribute reverse engineering, with strong generalization ability. Extensive experimental results demonstrate the superiority of our proposed method over the baselines.
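To make the query-based setting concrete, the following is a minimal sketch of only the first step: sending a fixed set of queries to a black-box model and concatenating the returned probability vectors into one feature vector that a meta-model could later consume. It is not DREAM's implementation. The toy linear classifier, the helper names (`make_toy_black_box`, `query_signature`), and all sizes are assumptions made for illustration, and the GAN-based feature learning and the meta-model itself are omitted.

```python
import math
import random
from typing import Callable, List, Sequence

Query = Sequence[float]
BlackBox = Callable[[Query], List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_toy_black_box(num_classes: int, input_dim: int, seed: int) -> BlackBox:
    """Hypothetical stand-in for a deployed model: a random linear classifier.

    Only the returned `predict` function is visible to the caller,
    mimicking query-only (black-box) access.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(input_dim)]
               for _ in range(num_classes)]

    def predict(x: Query) -> List[float]:
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
        return softmax(logits)

    return predict


def query_signature(model: BlackBox, queries: Sequence[Query]) -> List[float]:
    """Concatenate the probability vectors returned for a fixed query set."""
    signature: List[float] = []
    for q in queries:
        signature.extend(model(q))
    return signature


# Assumed sizes: 5 queries of dimension 8, and a 3-class target model.
rng = random.Random(0)
queries = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(5)]
target = make_toy_black_box(num_classes=3, input_dim=8, seed=42)

signature = query_signature(target, queries)
print(len(signature))  # 5 queries x 3 classes = 15
```

In the setting described above, vectors like `signature` would be collected from many models with known attributes and used to train the meta-model. The difficulty the abstract points to is that those training models may come from different data domains than the target.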
