Actions Speak Louder Than Chats: Investigating AI Chatbot Age Gating

Olivia Figueira, Pranathi Chamarthi, Tu Le, Athina Markopoulou

TL;DR

The paper examines whether AI chatbots identify child users and gate their access, in the context of COPPA and the providers' own policies. It introduces an auditing framework and a library of age-indicative prompts, and runs 1050 experiments across five major chatbots to assess age estimation and gating actions. The results show that chatbots can estimate age from conversations but take no blocking or guardian-notification actions, revealing a gap between policy and practice. The work also provides a proof-of-concept age gating design and regulatory recommendations, offering a baseline for future audits and policy development to better protect children online.

Abstract

AI chatbots are widely used by children and teens today, but they pose significant risks to youth's privacy and safety due to both increasingly personal conversations and potential exposure to unsafe content. While children under 13 are protected by the Children's Online Privacy Protection Act (COPPA), chatbot providers' own privacy policies may also provide protections, since they typically prohibit children from accessing their platforms. Age gating is often employed to restrict children online, but chatbot age gating in particular has not been studied. In this paper, we investigate (i) whether popular consumer chatbots are able to estimate users' ages based solely on their conversations, and (ii) whether they take action upon identifying children. To that end, we develop an auditing framework in which we programmatically interact with chatbots and conduct 1050 experiments using our comprehensive library of age-indicative prompts, including implicit and explicit age disclosures, to analyze the chatbots' responses and actions. We find that while chatbots are capable of estimating age, they do not take any action when children are identified, contradicting their own policies. Our methodology and findings provide insights for platform design, demonstrated by our proof-of-concept chatbot age gating implementation, and for regulation to protect children online.

Paper Structure (38 sections, 7 figures, 7 tables)

Figures (7)

  • Figure 1: AI Chatbot Age Gating Auditing Methodology Overview. This figure presents an overview of our auditing methodology, discussed in Section \ref{sec:method}. (1) Experimental Inputs include our Age Gating Prompt Library, Age Estimation Check, and Action Check. The Experimental Inputs are used in our (2) AI Chatbot Auditor, which includes our Experiment and Interaction Automator, which programmatically conducts experiments with and collects responses from chatbot web interfaces, resulting in our AI Chatbot Age Gating Dataset. We conduct (3) Post-Processing, which includes manual and automated LLM-based labeling, resulting in our Labeled AI Chatbot Age Gating Dataset. The (4) Analysis of our dataset focuses on age estimation performance, chatbot actions, chatbot response styles, knowledge accumulation, and privacy policies.
  • Figure 2: Example of Chat Log for Age-9 Individual Experiment with ChatGPT. This figure provides the chat log for an age-9 ChatGPT individual experiment. The right side shows our inputs, the left side shows ChatGPT's responses, and the arrows indicate our age estimation/action checks.
  • Figure 3: Example of Chat Log for Age-7 Sequential Experiment with Meta AI. This figure provides an excerpt of the chat log for an age-7 Meta AI sequential experiment. The right side shows our alternating inputs and age estimation checks (indicated by arrows), and the left side shows Meta AI's responses.
  • Figure 4: Chat Log Excerpt for Age-5 Individual Experiment with ChatGPT. This figure provides the chat log excerpt for an age-5 individual experiment with ChatGPT. The right side shows our inputs, and the left side shows ChatGPT's responses. ChatGPT acknowledges the user's disclosed age (i.e., 5) but later insists the user is 16 years old instead.
  • Figure 5: Knowledge Accumulation in Sequential Experiments. These figures visualize the chatbots' knowledge accumulation per chatbot for the age-5 experiments (Figure \ref{fig:knowledge_accumulation_seq_age5}) and per age group across chatbots (Figure \ref{fig:knowledge_accumulation_seq_aggregated}). The x-axes represent the sequence of responses to the age estimation checks, and the y-axes represent the estimated age, averaged across experiments. Child experiments include one more exchange than teen/adult experiments. Explicit prompts begin at the seventh age estimation check. All chatbots are able to accurately estimate age, but it takes several exchanges for the child age groups.
  • ...and 2 more figures
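
The Figure 5 caption describes averaging the estimated age at each age estimation check across experiments, with child experiments running one exchange longer than teen/adult ones. The following is a minimal sketch of that kind of per-check averaging, not the authors' analysis code. The function name, the use of None for a check that produced no numeric estimate, and the sample numbers are all assumptions made for illustration.

```python
from statistics import mean
from typing import List, Optional


def average_estimates_per_check(
    experiments: List[List[Optional[float]]],
) -> List[Optional[float]]:
    """Average the estimated age at each check position across experiments.

    Each inner list holds one experiment's estimates in check order.
    None marks a check with no numeric estimate (hypothetical convention).
    Experiments may differ in length; shorter ones simply do not
    contribute to the later check positions.
    """
    n_checks = max((len(e) for e in experiments), default=0)
    averages: List[Optional[float]] = []
    for i in range(n_checks):
        # Collect the estimates that exist at position i.
        values = [e[i] for e in experiments if i < len(e) and e[i] is not None]
        averages.append(mean(values) if values else None)
    return averages


# Made-up numbers for illustration only; these are not results from the paper.
runs = [
    [None, 10.0, 8.0, 5.0],
    [12.0, 8.0, 6.0, 5.0],
    [14.0, 9.0, 7.0],  # a shorter experiment with one fewer exchange
]
print(average_estimates_per_check(runs))  # [13.0, 9.0, 7.0, 5.0]
```

Averaging by position rather than padding the shorter runs keeps missing checks from pulling the mean toward an arbitrary filler value, which matters when experiments differ in length.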