Network Traffic as a Scalable Ethnographic Lens for Understanding University Students' AI Tool Practices

Donghan Hu, Rameen Mahmood, Annabelle David, Danny Yuxing Huang

TL;DR

Ethnography faces a tension between rich contextual insight and scalable observation. The authors repurpose privacy-preserving VPN-based network traffic analysis to passively trace university students' engagement with generative AI tools over a three-week field study, complemented by surveys and interviews. They find that AI use is fragmented into short bursts across multiple devices and that activity surges during exam periods, revealing rhythms tied to academic life. The work contributes a scalable, privacy-conscious digital-ethnography method, demonstrates its feasibility, and discusses ethical, methodological, and empirical implications for studying technology use in the wild.

Abstract

AI-driven applications have become woven into students' academic and creative workflows, influencing how they learn, write, and produce ideas. Gaining a nuanced understanding of these usage patterns is essential, yet conventional survey and interview methods remain limited by recall bias, self-presentation effects, and the underreporting of habitual behaviors. While ethnographic methods offer richer contextual insights, they often face challenges of scale and reproducibility. To bridge this gap, we introduce a privacy-conscious approach that repurposes VPN-based network traffic analysis as a scalable ethnographic technique for examining students' real-world engagement with AI tools. By capturing anonymized metadata rather than content, this method enables fine-grained behavioral tracing while safeguarding personal information, thereby complementing self-report data. A three-week field deployment with university students reveals fragmented, short-duration interactions across multiple tools and devices, with intense bursts of activity coinciding with exam periods, patterns that mirror the institutional rhythms of academic life. We conclude by discussing methodological, ethical, and empirical implications, positioning network traffic analysis as a promising avenue for large-scale digital ethnography on technology-in-practice.

Paper Structure

This paper contains 27 sections, 10 figures, 1 table.

Figures (10)

  • Figure 1: Participants install the official WireGuard client, scan a study-specific QR code generated by the portal, and the device is configured to route traffic through the research server.
  • Figure 2: Schematic of the data path, illustrating that anonymous network traffic data is stored while application content remains unexamined.
  • Figure 3: Overview of the three-week deployment with portal status checks, continuous VPN logging on participant devices, and post-study survey/interview.
  • Figure 4: Heatmap of participant-level AI activity over time. Each row corresponds to a single device, and each column to an hourly bin across the deployment. Color intensity reflects normalized traffic volume (row-wise $z$-scores), highlighting relative fluctuations while enabling comparison across participants.
  • Figure 5: Distribution of session durations in minutes for six generative AI tools. Each histogram shows a long-tail distribution, with a high concentration of sessions lasting less than 10 minutes.
  • ...and 5 more figures
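
The row-wise $z$-score normalization mentioned in the Figure 4 caption can be illustrated with a minimal sketch. This is not the authors' code: the function name `rowwise_zscores`, the example byte counts, the use of the population standard deviation, and the handling of constant rows are all assumptions made for illustration.

```python
from statistics import fmean, pstdev


def rowwise_zscores(traffic):
    """Normalize each row (one device) across its hourly bins.

    Assumption: z = (value - row mean) / row population std.
    Rows with zero variance are mapped to all zeros to avoid
    dividing by zero (a choice made here, not stated in the source).
    """
    normalized = []
    for row in traffic:
        mean = fmean(row)
        std = pstdev(row)
        if std == 0:
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([(value - mean) / std for value in row])
    return normalized


# Hypothetical traffic volumes: 2 devices x 4 hourly bins.
example = [
    [0, 0, 10, 30],             # light user
    [1000, 1000, 1000, 5000],   # heavy user
]
z = rowwise_zscores(example)
```

Because each row is scaled by its own mean and spread, a light and a heavy user both show their busiest hour as a high value, which is what lets the heatmap highlight relative fluctuations within a device while keeping rows visually comparable.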