SciPostLayoutTree: A Dataset for Structural Analysis of Scientific Posters
Shohei Tanaka, Atsushi Hashimoto, Yoshitaka Ushiku
TL;DR
SciPostLayoutTree addresses the need to analyze the structural organization of scientific posters by introducing a DFS-ordered tree annotation scheme over poster bounding boxes (BBoxes) and presenting a Layout Tree Decoder that fuses visual features with BBox coordinates and category embeddings. The dataset comprises 7,851 posters and shows that spatially challenging relations (upward, horizontal, long-distance) occur frequently in posters, although they are uncommon in document datasets. The proposed decoder uses beam search to capture sequence-level plausibility and BBox embeddings to improve parent-child predictions, achieving consistent gains across backbones in reading-order and hierarchical predictions. The work provides a solid baseline for poster-structure analysis and makes both the dataset and the code publicly available, enabling broader development of structure-aware poster interfaces and accessibility tools.
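To make the DFS-ordered tree annotation concrete, the following minimal sketch shows how a reading order can be recovered from parent-child relations by a depth-first pre-order traversal. The element names, the `ROOT` sentinel, the `sibling_rank` mapping, and the function `reading_order` are hypothetical and chosen for illustration; they are not the dataset's actual schema.

```python
from collections import defaultdict

ROOT = "ROOT"  # hypothetical sentinel for the virtual root of the poster


def reading_order(parents, sibling_rank):
    """Return element ids in depth-first pre-order.

    parents:      dict mapping element id -> parent id (ROOT for top level)
    sibling_rank: dict mapping element id -> position among its siblings
    """
    # Group children under each parent, then sort siblings by their rank.
    children = defaultdict(list)
    for node, parent in parents.items():
        children[parent].append(node)
    for kids in children.values():
        kids.sort(key=lambda n: sibling_rank[n])

    # Iterative pre-order traversal: visit a node, then its children in order.
    order = []
    stack = list(reversed(children[ROOT]))
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(reversed(children[node]))
    return order


# Toy poster (assumed structure): a title, two sections, and a captioned figure.
parents = {
    "caption": "figure",
    "figure": "method",
    "intro_text": "intro",
    "method": ROOT,
    "intro": ROOT,
    "title": ROOT,
}
sibling_rank = {
    "title": 0, "intro": 1, "method": 2,
    "intro_text": 0, "figure": 0, "caption": 0,
}
print(reading_order(parents, sibling_rank))
```

In this toy case the traversal visits the title, then the introduction and its text, then the method section followed by its figure and caption, so a single tree encodes both the hierarchy and the reading order.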
Abstract
Scientific posters play a vital role in academic communication by presenting ideas through visual summaries. Analyzing reading order and parent-child relations of posters is essential for building structure-aware interfaces that facilitate clear and accurate understanding of research content. Despite their prevalence in academic communication, posters remain underexplored in structural analysis research, which has primarily focused on papers. To address this gap, we constructed SciPostLayoutTree, a dataset of approximately 8,000 posters annotated with reading order and parent-child relations. Compared to an existing structural analysis dataset, SciPostLayoutTree contains more instances of spatially challenging relations, including upward, horizontal, and long-distance relations. As a solution to these challenges, we develop Layout Tree Decoder, which incorporates visual features as well as bounding box features including position and category information. The model also uses beam search to predict relations while capturing sequence-level plausibility. Experimental results demonstrate that our model improves the prediction accuracy for spatially challenging relations and establishes a solid baseline for poster structure analysis. The dataset is publicly available at https://huggingface.co/datasets/omron-sinicx/scipostlayouttree. The code is also publicly available at https://github.com/omron-sinicx/scipostlayouttree.
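The role of beam search in capturing sequence-level plausibility can be illustrated with a small sketch. It assumes, purely for illustration, that elements are processed in reading order, that each element receives a log-probability for every candidate parent, and that a valid DFS pre-order tree requires each parent to be the root, the previous element, or one of the previous element's ancestors. The functions `rightmost_path` and `beam_search_parents` and the toy scores are hypothetical and do not reproduce the released Layout Tree Decoder.

```python
import heapq
import math


def rightmost_path(parents):
    """Valid parents for the next element: root (-1), the last element, and its ancestors."""
    allowed = [-1]
    node = len(parents) - 1
    while node != -1:
        allowed.append(node)
        node = parents[node]
    return allowed


def beam_search_parents(log_probs, beam_size=3):
    """Pick one parent per element, maximizing the summed log-probability.

    log_probs[i][p + 1] is the score that element i has parent p,
    where p = -1 denotes the root and 0 <= p < i denotes an earlier element.
    Returns (total_score, parents) for the best complete hypothesis.
    """
    beams = [(0.0, ())]
    for scores in log_probs:
        candidates = []
        for total, parents in beams:
            # Only extend with parents that keep the tree a valid DFS pre-order.
            for p in rightmost_path(parents):
                candidates.append((total + scores[p + 1], parents + (p,)))
        beams = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
    return beams[0]


# Toy scores (assumed): element 1 slightly prefers the root,
# but element 2 strongly prefers element 0 as its parent.
log_probs = [
    [math.log(1.0)],
    [math.log(0.55), math.log(0.45)],
    [math.log(0.06), math.log(0.90), math.log(0.04)],
]
print(beam_search_parents(log_probs, beam_size=1)[1])  # greedy decoding
print(beam_search_parents(log_probs, beam_size=3)[1])  # beam search
```

With a beam of size 1, the locally best choice for element 1 (the root) removes element 0 from the valid parents of element 2, so the final tree is forced into a low-scoring assignment. A wider beam keeps the alternative hypothesis alive and recovers the tree with the higher overall score, which is the sense in which sequence-level decoding can help on relations that greedy, per-element prediction gets wrong.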
