VIS30K: A Collection of Figures and Tables from IEEE Visualization Conference Publications

Jian Chen, Meng Ling, Rui Li, Petra Isenberg, Tobias Isenberg, Michael Sedlmair, Torsten Möller, Robert S. Laramee, Han-Wei Shen, Katharina Wünsche, Qiru Wang

TL;DR

VIS30K tackles the lack of image-centric access to IEEE VIS literature by assembling a complete 30-year collection of figures and tables from 1990–2019 and providing a searchable browser, VIN. The authors implement a semi-automatic extraction pipeline that uses synthetic pseudo-papers to train CNN detectors (YOLOv3 and Faster R-CNN) and then applies manual curation to produce high-quality bounding-box annotations for 26,776 figures and 2,913 tables across 2,916 papers. They release VIS30K data, grounding metadata, training data, and pretrained models to support reproducible research and tool-building. The work enables visual-bibliometric analyses, teaching with image-based access, and platform-enabled related-work discovery and ML benchmarking on scholarly documents.

Abstract

We present the VIS30K dataset, a collection of 29,689 images that represents 30 years of figures and tables from each track of the IEEE Visualization conference series (Vis, SciVis, InfoVis, VAST). VIS30K's comprehensive coverage of the scientific literature in visualization not only reflects the progress of the field but also enables researchers to study the evolution of the state-of-the-art and to find relevant work based on graphical content. We describe the dataset and our semi-automatic collection process, which couples convolutional neural networks (CNN) with curation. Extracting figures and tables semi-automatically allows us to verify that no images are overlooked or extracted erroneously. To improve quality further, we engaged in a peer-search process for high-quality figures from early IEEE Visualization papers. With the resulting data, we also contribute VISImageNavigator (VIN, visimagenavigator.github.io), a web-based tool that facilitates searching and exploring VIS30K by author names, paper keywords, title and abstract, and years.

Paper Structure

This paper contains 15 sections, 10 figures, 1 table.

Figures (10)

  • Figure 1: A timeline of selected images from all 30 years (1990–2019) of the IEEE Visualization conference showing diverse and trending research work. Best viewed electronically, zoomed in.
  • Figure 2: We extracted 29,689 images (26,776 figures, 2,913 tables) from the 2,916 IEEE Visualization conference papers, spanning 30 years (Vis: 13,509; SciVis: 3,232; InfoVis: 7,834; VAST: 5,114). Numbers for the joint conference are depicted as wide pale gray bars. The individual tracks are overlaid on top. On average, Vis/SciVis has more images per paper page than InfoVis and VAST.
  • Figure 3: The use of figures and tables shows great variation. Here, we place subfigures side-by-side for comparison to present different techniques, as in (a). Subfigures may not have subcaptions (b). They can be embedded (c) or contain tabular views of different parameter choices (d). Figure captions sometimes appear inside the figure's rectangular bounding box (e). Tables often contain visual separators, but the content can be hierarchical and can contain figures (f) or use table lenses (g). These variations lead us to retain composite figures and tables in our data cohort to preserve the functional values of these paper elements. All images © IEEE, used with permission.
  • Figure 4: Automatically rendered pseudo-paper pages in our training data generation with ground-truth labels. The text content in (a) is grammatically correct but not semantically meaningful in the visualization domain. The page samples in (b) show the header, title, abstract, body text, figure, table, captions, and other document components. We diversified the page layout structures to render pages both with and without images. When images are shown, they appear in single or double columns.
  • Figure 5: Fine-grained human recognition to correct CNN errors. The orange boxes show the machine prediction and the green boxes the human results to curate bounding box regions.
  • ...and 5 more figures
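
The counts quoted in the Figure 2 caption can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the numbers as reported there. The variable and function names are illustrative, and the images-per-paper ratio is derived here rather than stated in the captions.

```python
# Counts copied from the Figure 2 caption; names are illustrative only.
by_type = {"figures": 26_776, "tables": 2_913}
by_track = {"Vis": 13_509, "SciVis": 3_232, "InfoVis": 7_834, "VAST": 5_114}
total_images = 29_689
total_papers = 2_916


def check_totals(parts, expected):
    """Return True when the per-category counts sum to the expected total."""
    return sum(parts.values()) == expected


# Both breakdowns should add up to the same overall image count.
type_ok = check_totals(by_type, total_images)
track_ok = check_totals(by_track, total_images)

# Derived value (not reported in the captions): mean images per paper.
images_per_paper = total_images / total_papers

print(f"figures + tables match total: {type_ok}")
print(f"per-track counts match total: {track_ok}")
print(f"images per paper (derived): {images_per_paper:.2f}")
```

Both breakdowns, by image type and by conference track, sum to the same 29,689 images, so the caption's figures are internally consistent.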