The ethical situation of DALL-E 2

Eduard Hogea; Josem Rocafortf

TL;DR

The paper analyzes the ethical implications of DALL-E 2, a leading AI image generation system. It applies Responsible Research and Innovation (RRI) principles and a technology-society lens to assess capabilities, potential misuse, governance needs, and environmental considerations. It combines architectural insights (CLIP-based prompts and diffusion) with social dynamics to propose embedded values, governance frameworks, and non-polarized public discourse as mitigation strategies. The work provides a framework for responsibly introducing powerful AI image systems to society, balancing innovation with safeguards against misinformation, exploitation, and cultural harm.

Abstract

Image generation from text prompts is currently a hot topic in Artificial Intelligence. DALL-E 2 is one of the biggest names in this domain, as it lets people create images from text inputs ranging from simple to quite complicated. OpenAI, the company behind it, states on its website that its mission is to ensure that artificial general intelligence benefits all of humanity. This is a noble idea in our opinion, and it is also what motivated us to choose this subject. This paper analyzes the ethical implications of an AI image generation system, with an emphasis on how society is responding to it, how it probably will respond, and how it should respond if all the right measures are taken.

Paper Structure (8 sections, 3 figures)

Introduction
Understanding what DALL-E 2 can actually do
Current and potential future use of it
Following the RRI (Responsible Research and Innovation) principles
Technology and society, a complex relationship
Technological mediation
Summing up technological mediation
Conclusion

Figures (3)

Figure 1: Images generated by us with a similar system called Midjourney AI, with the prompts shown below each of them.
Figure 2: Images generated with DALL-E 2 from the corresponding prompts.
Figure 3: Map of involved actors. The main parts of the system are shown in a flow diagram, with an actual use case in which two separate images by one artist are used to create a new image based on the user's prompt. The upper part of the figure contains two blocks delimited by a dotted line. The upper block is the CLIP model presented by Ramesh et al. (2022), which generates a set of visual attributes corresponding to the given textual description. The lower block is the diffusion model, which enhances the image prior generated with CLIP to produce the final image. Both models use data derived from the textual input as well as from the artist's images.
