Domain Bridge: Generative model-based domain forensic for black-box models

Jiyi Zhang, Han Fang, Ee-Chien Chang

TL;DR

The paper tackles the challenge of uncovering not only the broad data domain but also the fine-grained attributes of unknown black-box models. It introduces Domain Bridge, an iterative, generative-search framework that couples CLIP-based embeddings with Stable Diffusion to progressively refine textual descriptions and generate images that reveal a model’s domain. By formulating a dual objective (relevance and generality) and employing a BFS-like search with enrichment, summarization, and grouping modules, the method achieves fine-grained domain identification across CIFAR-10, Places365, CelebA attributes, and real-world Hugging Face models, often outperforming corpus-based baselines. The approach demonstrates practical utility for model transparency, bias detection, and post-hoc analysis, while acknowledging biases in generative tools and the computational cost of large multimodal models. Future directions include extending to other modalities, improving efficiency, and addressing ethical considerations in forensic usage.

Abstract

In forensic investigations of machine learning models, techniques that determine a model's data domain play an essential role, with prior work relying on large-scale corpora like ImageNet to approximate the target model's domain. Although such methods are effective at finding broad domains, they often struggle to identify finer-grained classes within those domains. In this paper, we introduce an enhanced approach to determine not just the general data domain (e.g., human face) but also its specific attributes (e.g., wearing glasses). Our approach uses an image embedding model as the encoder and a generative model as the decoder. Beginning with a coarse-grained description, the decoder generates a set of images, which are then presented to the unknown target model. Successful classifications by the model guide the encoder to refine the description, which, in turn, is used to produce a more specific set of images in the subsequent iteration. This iterative refinement narrows down the exact class of interest. A key strength of our approach lies in leveraging the expansive dataset LAION-5B, on which the generative model Stable Diffusion is trained. This enlarges our search space beyond traditional corpora such as ImageNet. Empirical results showcase our method's performance in identifying specific attributes of a model's input domain, paving the way for more detailed forensic analyses of deep learning models.
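
The refinement loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it does not call CLIP or Stable Diffusion, it replaces the BFS-like search with a simpler greedy one, and every name in it (`generate`, `enrich`, `acceptance_rate`, `refine`, `hidden_target`, `ATTRIBUTES`) is a hypothetical stand-in. An "image" is modeled as a set of attribute words, the "decoder" enumerates every image consistent with a description, and the black-box target accepts an image only if it contains its hidden domain attributes. The stopping rule, which refuses a more specific description unless it raises the acceptance rate, is an assumed, crude reading of the relevance and generality trade-off.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Tuple

Description = Tuple[str, ...]   # a description is a tuple of attribute words
Image = FrozenSet[str]          # a toy "image" is the set of attributes it shows

# Toy attribute vocabulary (assumption, chosen only for illustration).
ATTRIBUTES = ("glasses", "hat", "beard")


def generate(desc: Description) -> List[Image]:
    """Toy decoder: every image consistent with the description.

    Attributes not mentioned in the description vary freely, mimicking a
    generator that fills in unspecified details.
    """
    free = [a for a in ATTRIBUTES if a not in desc]
    images = []
    for k in range(len(free) + 1):
        for extra in combinations(free, k):
            images.append(frozenset(desc) | frozenset(extra))
    return images


def enrich(desc: Description) -> List[Description]:
    """Propose more specific descriptions by adding one unused attribute."""
    return [desc + (a,) for a in ATTRIBUTES if a not in desc]


def acceptance_rate(desc: Description, target: Callable[[Image], bool]) -> float:
    """Fraction of generated images that the black-box target accepts."""
    images = generate(desc)
    return sum(target(img) for img in images) / len(images)


def refine(seed: Description, target: Callable[[Image], bool],
           max_rounds: int = 5) -> Tuple[Description, float]:
    """Greedy refinement: specialize only while the acceptance rate improves."""
    best, best_rate = seed, acceptance_rate(seed, target)
    for _ in range(max_rounds):
        candidates = [(acceptance_rate(c, target), c) for c in enrich(best)]
        if not candidates:
            break
        rate, cand = max(candidates, key=lambda t: t[0])
        if rate <= best_rate:   # no gain in relevance: keep the more general description
            break
        best, best_rate = cand, rate
    return best, best_rate


def hidden_target(image: Image) -> bool:
    """Black-box stand-in whose hidden domain is 'face wearing glasses'."""
    return {"face", "glasses"} <= image


if __name__ == "__main__":
    print(refine(("face",), hidden_target))
```

Starting from the coarse description `("face",)`, half of the toy images are accepted. Adding `"glasses"` raises the rate to 1.0, and further attributes bring no gain, so the search stops at the most general description that fully matches the hidden domain.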

Paper Structure (29 sections, 1 equation, 1 figure, 4 tables)

Figures (1)

  • Figure 1: Left: An overview of the components in our framework. Right: An overview of the iterative description refinement process (detailed in the search algorithm section).