The Impact of the MAST Data Archive

Richard A. Shaw, Jenny L. Novacescu, Sarah Weissman, Travis A. Berger, Clara E. Brasseur, Jeff Chamblee, Brian Cherinka, Zachary R. Claytor, Theresa Dower, Chinwe Edeani, Scott W. Fleming, Jonathan R. Hargis, Julie Imig, Tim Januario, Karen Levay, Tim Kimball, Jenn Kotler, Hannah M. Lewis, Steve Lubow, Adrian Lucy, Brian McLean, Sunita G. Malla, Jacob Matuskey, Sophie J. Miller, Susan E. Mullally, Claire E. Murray, J. E. G. Peek, Carlita Phillip, Marc Rafelski, David R. Rodriguez, Gregory F. Snyder, Achu J. Usha, Richard L. White, Jinmi Yoon

TL;DR

This bibliometric study quantifies the productivity and impact of data hosted by the Barbara A. Mikulski Archive for Space Telescopes (MAST) over ~50 years, using ADS-derived mission bibliographies and a structured data-usage taxonomy. The analysis shows a high and enduring publication rate across both flagship and non-flagship missions, with a strong long tail of citations and a rapid rise in JWST-related work. Archival research dominates HST output and is increasingly prominent for JWST, while cross-mission data reuse and high-level science products (HLSPs) amplify impact. The authors discuss the limitations of publication metrics, the influence of funding and infrastructure, and the potential of AI-assisted bibliography maintenance to improve accuracy and cross-mission comparability, and they outline a roadmap for maintaining and expanding MAST’s bibliographic footprint into future missions.

Abstract

The Barbara A. Mikulski Archive for Space Telescopes (MAST) hosts science-ready data products from over twenty NASA missions, plus community-contributed data collections and other select surveys. The data support forefront research in the ultraviolet, optical, and near-infrared wavelength bands. We have constructed bibliographies for each mission from publications in nearly 40 professional journals, and have identified more than 37,000 refereed articles in which investigators made scientific use of data hosted in MAST. The publication rate over the last 50 years shows that most MAST missions have had very high productivity during their in-service lifetimes, and have remained productive for years or decades afterward. Annual citations to these publications, a measure of impact on research, are robust for most missions, with citations that grow over more than a decade. Most of the citations come from about 10% of articles within each mission. We examined the bibliographies of the active missions HST and JWST in greater detail. For HST, the rate of archival publications exceeded that of publications authored by the original observing teams within a decade of launch, and is now more than 3 times higher. Early indications hint that JWST archival articles could dominate the publication rate even sooner. The production of articles resulting from any given observing program can extend for decades. Programs with small and very large allocations of observing time tend to be particularly productive per unit of observing time. For HST in general, a first publication appears within 1.5 yr for 50% of observing programs, and within 3.8 yr for 80% of programs. We discuss various external factors that affect publication metrics, their strengths and limitations for measuring scientific impact, and the challenges of making meaningful comparisons of publication metrics across missions.

Paper Structure

This paper contains 36 sections, 14 figures.

Figures (14)

  • Figure 1: Fraction of articles from refereed journals where genuine mission identifiers were encountered between 2018 Jan and 2023 May. The categories for ApJ and A&A both include the main journal, Letters, and the Supplements. The four journals in the axis labels account for more than 80% of all papers in which a MAST mission name appeared.
  • Figure 2: Number of papers per year where one or more MAST mission names or keywords appear. Also shown is the growth in the number of refereed papers published per year in the four main journals noted in Figure 1.
  • Figure 3: Histogram of data usage classification fractions that were assigned after 2018 Jan 01. See text for classification criteria.
  • Figure 4: Histogram of deviations of paper classifications (colored bars) by team members (designated by initials along the $x$-axis) from the team consensus view of non-Flagship mission papers. The deviations decreased markedly from the first round of re-review (left) to the third round (right). The deviations should be read as, for example: "Reviewer X assigned Mention 5.1% more often than the consensus view, and assigned Science 2.1% less often."
  • Figure 5: Refereed publications per year where data from a MAST-hosted mission was used in the analysis. Curves are color-coded per mission (see legend and Appendix \ref{sec:mastMissions}); most mission statistics are complete only through 2022. The total of MAST missions called out in these papers (exclusive of the Flagships: HST, JWST) is also shown (thick blue curve). The spans of mission in-service dates are indicated (colored bars).
  • ...and 9 more figures
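
The reviewer deviations described in the Figure 4 caption can be illustrated with a minimal sketch. It assumes that a deviation is the difference, in percentage points, between the fraction of papers a reviewer assigned to a class and the fraction the team consensus assigned to it; the caption does not spell out the exact definition, so this reading is an assumption. The function names and the sample labels are hypothetical, and only the class names "Science" and "Mention" come from the caption.

```python
from collections import Counter


def class_fractions(labels):
    """Return the fraction of papers assigned to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def reviewer_deviations(reviewer_labels, consensus_labels):
    """Percentage-point difference between a reviewer's class fractions
    and the consensus class fractions (assumed definition, see lead-in)."""
    reviewer = class_fractions(reviewer_labels)
    consensus = class_fractions(consensus_labels)
    classes = sorted(set(reviewer) | set(consensus))
    return {
        cls: round(100.0 * (reviewer.get(cls, 0.0) - consensus.get(cls, 0.0)), 1)
        for cls in classes
    }


# Hypothetical labels for ten papers; not data from the study.
consensus = ["Science"] * 5 + ["Mention"] * 5
reviewer_x = ["Science"] * 4 + ["Mention"] * 6

deviations = reviewer_deviations(reviewer_x, consensus)
print(deviations)
```

A positive value means the reviewer used that class more often than the consensus did, and a negative value means less often, which matches the way the caption asks the bars to be read.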