Bridging the Gap in the Responsible AI Divides

Bálint Gyevnár, Atoosa Kasirzadeh

Abstract

Tensions between AI Safety (AIS) and AI Ethics (AIE) have increasingly surfaced in AI governance and public debates about AI, leading to what we term the "responsible AI divides". We introduce a model that categorizes four modes of engagement with the tensions: radical confrontation, disengagement, compartmentalized coexistence, and critical bridging. We then investigate how critical bridging, with a particular focus on bridging problems, offers one of the most viable constructive paths for advancing responsible AI. Using computational tools to analyze a curated dataset of 3,550 papers, we map the research landscapes of AIE and AIS to identify both distinct and overlapping problems. Our findings point to both thematic divides and overlaps. For example, we find that AIE has long grappled with overcoming injustice and tangible AI harms, whereas AIS has primarily embodied an anticipatory approach focused on the mitigation of risks from AI capabilities. At the same time, we find significant overlap in core research concerns across both AIE and AIS around transparency, reproducibility, and inadequate governance mechanisms. As AIE and AIS continue to evolve, we recommend focusing on bridging problems as a constructive path forward for enhancing collaborative AI governance. We offer a series of recommendations to integrate shared considerations into a collaborative approach to responsible AI. Alongside our proposal, we highlight its limitations and explore open problems for future research. All data, including the fully annotated dataset of papers and the code to reproduce our figures, can be found at: https://github.com/gyevnarb/ai-safety-ethics.

Paper Structure (21 sections, 13 figures, 12 tables)

Figures (13)

  • Figure 1: Four resolution modes for responsible AI divides. The modes differ in the extent to which they directly engage tensions between AI ethics (AIE) and AI safety (AIS), and in whether such engagement produces constructive integration or instead reproduces division.
  • Figure 2: Two ways of showing distinctive words in the corpora. (Top) The most frequent words in one corpus that occur the fewest times in the other corpus. (Bottom) The log-odds ratio of each word plotted against its occurrence count. A higher positive ratio means the term is more likely to occur in the AI ethics corpus, while a lower negative ratio means the term is more likely to occur in the AI safety corpus. Red colors correspond to AI safety terms, blue colors to AI ethics terms.
  • Figure 3: Cosine similarity between corpora over the years. The blue line shows the average cosine similarity between embeddings of each pair of documents from each corpus for a given year. The orange line shows the average cosine similarity between the embeddings of each ethics document and the overall average embedding of the entire safety corpus. The green line shows the average cosine similarity between the embeddings of each safety document and the overall average embedding of the entire ethics corpus.
  • Figure 4: Bar plots showing the number of papers assigned to each high-level risk category in the AIE corpus (Left) and the AIS corpus (Right). Each paper was assigned to up to three distinct categories.
  • Figure 5: Bar plots of category similarities showing the seven most frequent low-level risk categories where the source corpus of the paper does not match the taxonomic corpus type of the annotated category (e.g., a paper from the AIE corpus was annotated with a category from the AIS taxonomy).
  • ...and 8 more figures
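
The three curves described in the Figure 3 caption can be illustrated with a minimal sketch. It is not the authors' code: the function names and the two-dimensional toy embeddings are hypothetical, and it assumes that the per-document curves compare one year's documents against the centroid of the whole other corpus, which is one reading of the caption.

```python
import math
from itertools import product


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Component-wise average (centroid) of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def pairwise_mean_similarity(ethics_docs, safety_docs):
    """Average cosine similarity over every (ethics, safety) document pair."""
    sims = [cosine(e, s) for e, s in product(ethics_docs, safety_docs)]
    return sum(sims) / len(sims)


def mean_similarity_to_centroid(docs, other_corpus):
    """Average cosine similarity of each document to the other corpus centroid."""
    centroid = mean_vector(other_corpus)
    return sum(cosine(d, centroid) for d in docs) / len(docs)


# Hypothetical toy embeddings standing in for one year of each corpus.
# Here the "entire corpus" is assumed to be that same single year.
ethics_docs = [[1.0, 0.0], [0.0, 1.0]]
safety_docs = [[1.0, 0.0], [1.0, 0.0]]

pairwise = pairwise_mean_similarity(ethics_docs, safety_docs)
ethics_to_safety = mean_similarity_to_centroid(ethics_docs, safety_docs)
safety_to_ethics = mean_similarity_to_centroid(safety_docs, ethics_docs)

print(round(pairwise, 4), round(ethics_to_safety, 4), round(safety_to_ethics, 4))
```

Repeating the three calls for each publication year, with real document embeddings in place of the toy vectors, would give one point per year on each curve.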