DNNs May Determine Major Properties of Their Outputs Early, with Timing Possibly Driven by Bias

Song Park, Sanghyuk Chun, Byeongho Heo, Dongyoon Han

TL;DR

The paper investigates whether deep neural networks fix major output properties early in inference and whether this timing is driven by inherent biases acting as fast heuristics. Using diffusion models as an analyzable, iterative testbed, it perturbs prompts mid-generation and measures CLIP-based switching between the initial and altered cues to identify when outputs become determined. Across five diffusion models and two attribute scenarios (common objects and human attributes), the results show that outputs are often fixed in early diffusion steps, with the timing strongly modulated by the strength and type of attribute bias: color cues tend to be determined earlier, while more complex cues such as material require more steps. The findings offer a new lens on bias mitigation and efficient inference, suggesting that understanding and controlling early-determination dynamics could improve robustness and interpretability. The paper also highlights ethical considerations in applying bias insights to real-world systems and proposes avenues for inducing more deliberative processing in generative models.

Abstract

This paper argues that deep neural networks (DNNs) mostly determine their outputs during the early stages of inference, where biases inherent in the model play a crucial role in shaping this process. We draw a parallel between this phenomenon and human decision-making, which often relies on fast, intuitive heuristics. Using diffusion models (DMs) as a case study, we demonstrate that DNNs often make early-stage decisions that are influenced by the type and extent of bias in their design and training. Our findings offer a new perspective on bias mitigation, efficient inference, and the interpretation of machine learning systems. By identifying the temporal dynamics of decision-making in DNNs, this paper aims to inspire further discussion and research within the machine learning community.

Paper Structure

This paper contains 27 sections, 18 figures, 3 tables.

Figures (18)

  • Figure 1: Overview of the proposed framework. We choose two prompts (initial prompt $c_i$ and altered prompt $c_a$) formatted as "A photo of [attribute][entity]", where the two prompts share the same [entity] but differ in [attribute]. At timestamp $t_s$, we switch the text condition from $c_i$ to $c_a$. We then measure the impact of each prompt using the CLIP similarity between the generated image and the text prompts. There exists a "switching point" at which the generated image is influenced more by $c_a$ than by $c_i$ (e.g., 9 for the apple example and 15 for the backpack example). Different attributes show different switching points, and a more biased attribute has an earlier one (e.g., the color example on the left switches earlier than the pattern example on the right).
  • Figure 2: The inference process of DMs (i.e., reverse process) is tractable over each intermediate output and each step is controllable by a flexible and human-understandable text prompt. We examine the temporal dynamics of inference using this iterative inference process.
  • Figure 3: We show examples of $x^{t_s}$ obtained by varying $t_s$ from $0$ to $T$, together with their estimated CLIP scores. The x-axis denotes $t_s$, the timestamp at which the initial prompt $c_i$ is changed to the altered prompt $c_a$. The y-axis denotes the ratio of $S(x^{t_s}, c_i)$ and $S(x^{t_s}, c_a)$; a higher value means the generated image is more influenced by $c_a$, and a lower value means it is more influenced by $c_i$. When will the generated image be more influenced by $c_a$ than by $c_i$? If the output is more influenced by $c_i$, then we need a smaller $t_s$ to make the image more influenced by $c_a$ (e.g., the $t_1$ case). Otherwise, a larger $t_s$ will be sufficient (e.g., the $t_3$ case).
  • Figure 4: DNNs may determine the major properties of their output at an early stage. We plot the average and the standard error of $S(x^{t_s}, c_i)$ and $S(x^{t_s}, c_a)$. $S(x^{t_s}, c)$ denotes the CLIP similarity between a text prompt $c$ and the image $x^{t_s}$ generated by switching the initial prompt $c_i$ to the altered prompt $c_a$ at timestamp $t_s$. $x^{0}$ is an image fully conditioned on $c_a$, and $x^{50}$ is one fully conditioned on $c_i$ (the switching-point example figure shows an example). Each point is computed with 50 samples (10 attribute pairs and 5 random seeds). The red line is the "switching point", the smallest $t_s'$ for which $S(x^{t_s'}, c_a) > S(x^{t_s'}, c_i)$ on average, which serves as a proxy for the timing of the "determination".
  • Figure 5: Cumulative histogram of the sample-wise switching timing for each model and attribute. Note that we have ten objects, ten attribute pairs, and five random seeds; hence, each histogram contains 500 samples.
  • ...and 13 more figures
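
The switching-point definition in the Figure 4 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `switching_point`, the nested-list layout of the scores, and the toy numbers are all hypothetical, and the toy numbers are invented purely to exercise the code (they say nothing about the paper's results). The CLIP similarities $S(x^{t_s}, c_i)$ and $S(x^{t_s}, c_a)$ are assumed to have been computed beforehand.

```python
from statistics import mean
from typing import Optional, Sequence


def switching_point(
    sim_initial: Sequence[Sequence[float]],
    sim_altered: Sequence[Sequence[float]],
) -> Optional[int]:
    """Smallest t_s whose mean S(x^{t_s}, c_a) exceeds the mean S(x^{t_s}, c_i).

    sim_initial[t_s][k] holds S(x^{t_s}, c_i) for sample k, and
    sim_altered[t_s][k] holds S(x^{t_s}, c_a) for the same sample.
    Returns None when the altered prompt never wins on average.
    (Hypothetical helper; the data layout is an assumption.)
    """
    if len(sim_initial) != len(sim_altered):
        raise ValueError("both score tables must cover the same t_s values")
    for t_s, (scores_i, scores_a) in enumerate(zip(sim_initial, sim_altered)):
        # Compare the per-timestamp averages over all samples.
        if mean(scores_a) > mean(scores_i):
            return t_s
    return None


# Invented toy scores: four values of t_s, two samples each.
toy_initial = [[0.30, 0.32], [0.29, 0.31], [0.26, 0.27], [0.22, 0.24]]
toy_altered = [[0.20, 0.22], [0.24, 0.25], [0.27, 0.29], [0.31, 0.30]]

print(switching_point(toy_initial, toy_altered))
```

Passing a single sample per timestamp gives a sample-wise switching timing, which is the kind of quantity the cumulative histograms in Figure 5 aggregate.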