Data Science Education in Undergraduate Physics: Lessons Learned from a Community of Practice

Karan Shah, Julie Butler, Alexis Knaub, Anıl Zenginoğlu, William Ratcliff, Mohammad Soltanieh-ha

TL;DR

This work addresses a gap in undergraduate physics education: data science skills are increasingly essential but rarely taught. It presents the Data Science Education Community of Practice (DSECOP) and its modular, browser-based modules, which integrate data science concepts with physics content and can be adopted incrementally in existing courses. Through faculty and industry surveys, the authors identify prevailing barriers and show how DSECOP's open-source modules, governance, and workshops offer practical ways to overcome them. The initiative aims to modernize physics education by equipping students with the data analysis, visualization, and computation skills needed for research and industry in a data-driven world.

Abstract

It is becoming increasingly important that physics educators equip their students with the skills to work with data effectively. However, many educators may lack the necessary training and expertise in data science to teach these skills. To address this gap, we created the Data Science Education Community of Practice (DSECOP), bringing together graduate students and physics educators from different institutions and backgrounds to share best practices and lessons learned from integrating data science into undergraduate physics education. In this article we present insights and experiences from this community of practice, highlighting key strategies and challenges in incorporating data science into the introductory physics curriculum. Our goal is to provide guidance and inspiration to educators who seek to integrate data science into their teaching, helping to prepare the next generation of physicists for a data-driven world.

Paper Structure

This paper contains 11 sections, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: Percentage of academic papers per year uploaded to arXiv's physics section that mention "machine learning" anywhere in the article. The red star represents the inception of the American Physical Society's Topical Group on Data Science (GDS) in 2018.
  • Figure 2: Results of data skills survey for faculty who teach data science in their intermediate/advanced physics courses. Relevant skills are listed on the x-axis, with the different textures showing the proportion of responses received.
  • Figure 3: Results of data skills survey for industry practitioners. Relevant skills are listed on the x-axis, with the different textures showing the proportion of responses received.
  • Figure 4: This flowchart shows the relationships between the modules and their topics (general data science or machine learning). The arrows depict suggested prerequisites for the relevant data science and machine learning topics, starting from the most basic module, COP 101: Introduction to Data Science Libraries. Modules that cover data science and machine learning at the introductory level are shown with solid borders and are numbered COP 1XX. Modules with intermediate topics have dashed borders and are numbered COP 2XX, and modules that cover advanced topics in data science and machine learning have dotted borders and are numbered COP 3XX.