Clio: Privacy-Preserving Insights into Real-World AI Use

Alex Tamkin, Miles McCain, Kunal Handa, Esin Durmus, Liane Lovitt, Ankur Rathi, Saffron Huang, Alfred Mountfield, Jerry Hong, Stuart Ritchie, Michael Stern, Brian Clarke, Landon Goldberg, Theodore R. Sumers, Jared Mueller, William McEachen, Wes Mitchell, Shan Carter, Jack Clark, Jared Kaplan, Deep Ganguli

TL;DR

Clio addresses the lack of public data on real-world AI usage, a gap caused by privacy and scalability concerns. It uses a privacy-preserving pipeline in which AI assistants themselves surface aggregated usage patterns from millions of conversations, without exposing private data. The paper demonstrates Clio’s ability to reveal dominant use cases, multilingual variation, and coordinated abuse, and shows how these insights can strengthen safety classifiers and monitoring during high-stakes events. It also discusses limitations, ethical considerations, and governance implications of empirical AI usage analysis. Overall, Clio provides a scalable, privacy-focused method for empirical AI governance and safety research.

Abstract

How are AI assistants being used in the real world? While model providers in theory have a window into this impact via their users' data, both privacy concerns and practical challenges have made analyzing this data difficult. To address these issues, we present Clio (Claude insights and observations), a privacy-preserving platform that uses AI assistants themselves to analyze and surface aggregated usage patterns across millions of conversations, without the need for human reviewers to read raw conversations. We validate this can be done with a high degree of accuracy and privacy by conducting extensive evaluations. We demonstrate Clio's usefulness in two broad ways. First, we share insights about how models are being used in the real world from one million Claude.ai Free and Pro conversations, ranging from providing advice on hairstyles to providing guidance on Git operations and concepts. We also identify the most common high-level use cases on Claude.ai (coding, writing, and research tasks) as well as patterns that differ across languages (e.g., conversations in Japanese discuss elder care and aging populations at higher-than-typical rates). Second, we use Clio to make our systems safer by identifying coordinated attempts to abuse our systems, monitoring for unknown unknowns during critical periods like launches of new capabilities or major world events, and improving our existing monitoring systems. We also discuss the limitations of our approach, as well as risks and ethical concerns. By enabling analysis of real-world AI usage, Clio provides a scalable platform for empirically grounded AI safety and governance.

Paper Structure

This paper contains 72 sections, 15 figures, 6 tables.

Figures (15)

  • Figure 1: Using Clio to understand real-world use of AI assistants. Clio transforms raw conversations into high-level patterns and insights. This approach enables us to understand how AI assistants are being used in practice, analogous to how Google Trends provides insights about web search behavior. See Figure 2 for more details on how Clio works and how it preserves privacy. (Note: figure contains illustrative conversation examples only.)
  • Figure 2: System diagram. This diagram illustrates how Clio processes insights from a sample of real-world conversations while maintaining user privacy. Clio processes a raw sample of traffic, extracts key facets (attributes like language or conversation topic), groups these facets into similar clusters (using text embeddings and k-means), and finally organizes those clusters into both a hierarchy and a 2D space for ease of exploration. Along the way, Clio applies several privacy barriers (orange stripes) that prevent private information from reaching the user-visible parts of Clio (right). See \ref{sec:system} for more details on each stage of the pipeline. (Note: figure contains illustrative examples only.)
  • Figure 3: A screenshot of the Clio interface displaying data from the public WildChat dataset (Zhao et al., 2024). Left: a sidebar showing hierarchical clusters for the facet "What task is the AI assistant in the conversation asked to perform?" Right: a zoomable map view displaying clusters projected onto two dimensions, along with selected cluster titles. Colors can indicate various attributes of the data, including size, growth rate, and safety classifier scores. The map view makes it easy to understand the contents of the dataset at a broad and deep level, as well as to discover concerning clusters and flag them for further investigation. Clio's tree view (\ref{fig:clio-interface-tree}) is a complementary interface that offers easy navigation across Clio's learned hierarchy of concepts.
  • Figure 4: Clio reconstructs ground-truth categories on an evaluation dataset of 19,476 synthetic chat transcripts with 94% accuracy, compared to 5% for random guessing. A multilingual dataset of chat transcripts was generated by a hierarchical process starting from high-level categories $\to$ low-level categories $\to$ individual chat transcripts. Clio is evaluated on how well it can generate low-level clusters from the raw transcripts and then assign them to the correct high-level category. The plot demonstrates a high degree of alignment between the reconstructed and original data distributions. See \ref{sec:validation} for additional experiments and methodological details.
  • Figure 5: Clio's multiple layers of privacy interventions. Progression of privacy scores (see \ref{tab:privacy-scale}) across different layers in Clio for an analysis of 5,000 Claude.ai conversations. At the point where Clio's outputs are visible to analysts (Cluster Summaries), the amount of private information (1s and 2s, shaded region) reaches very low levels. For more information about the data and our policies, see \ref{tab:experiment-details} and \ref{sec:accesspolicies}.
  • ...and 10 more figures
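
The clustering stage described in the Figure 2 caption (facets grouped via text embeddings and k-means) can be illustrated with a minimal sketch. This is not Clio's implementation: the facet strings, the two-dimensional "embeddings", the `kmeans` helper, and the choice of `k=2` are all hypothetical stand-ins, and a real pipeline would obtain high-dimensional vectors from a text embedding model.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (hypothetical helper); returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels


# Assumption: made-up 2D vectors standing in for real text embeddings
# of extracted facet summaries.
facets = {
    "debug a Python script": (0.90, 0.10),
    "explain a Git rebase": (0.80, 0.20),
    "fix a failing unit test": (0.85, 0.15),
    "suggest a hairstyle": (0.10, 0.90),
    "pick a haircut for curly hair": (0.20, 0.80),
    "recommend hair products": (0.15, 0.85),
}

names = list(facets)
centroids, labels = kmeans([facets[n] for n in names], k=2)

# Group facet summaries by assigned cluster label.
clusters = {}
for name, lab in zip(names, labels):
    clusters.setdefault(lab, []).append(name)

for lab, members in sorted(clusters.items()):
    print(lab, members)
```

On this toy data the coding-related facets and the hair-related facets end up in separate clusters. The sketch covers only the grouping step; the hierarchy construction, 2D projection, and privacy barriers mentioned in the caption are not modeled here.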