Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Kazi Sajeed Mehrab, M. Maruf, Arka Daw, Abhilash Neog, Harish Babu Manogaran, Mridul Khurana, Zhenyang Feng, Bahadir Altintas, Yasin Bakis, Elizabeth G Campolongo, Matthew J Thompson, Xiaojun Wang, Hilmar Lapp, Tanya Berger-Wolf, Paula Mabee, Henry Bart, Wei-Lun Chao, Wasila M Dahdul, Anuj Karpatne
TL;DR
Fish-Vista is the first organismal image dataset designed specifically for analyzing externally visible traits of fishes, filling a gap in biodiversity benchmarks by enabling trait-level analysis and localization. It introduces a fully reproducible data-processing pipeline that converts museum images into AI-ready crops with uniform backgrounds and trait annotations, supporting three downstream computer vision tasks: fine-grained species classification, trait identification, and trait segmentation. Benchmarks across these tasks reveal challenges that include extreme long-tailed distributions, generalization to unseen species, and interpretable trait localization, highlighting opportunities for improved trait-grounded models and foundation-model adaptation in biology. The dataset and tasks are intended to advance AI-driven biodiversity science by enabling trait-centered explanations, robust trait localization, and integration with taxonomic and evolutionary knowledge.
Abstract
We introduce Fish-Visual Trait Analysis (Fish-Vista), the first organismal image dataset designed for the analysis of visual traits of aquatic species directly from images using problem formulations in computer vision. Fish-Vista contains 69,126 annotated images spanning 4,154 fish species, curated and organized to serve three downstream tasks: species classification, trait identification, and trait segmentation. Our work makes two key contributions. First, we develop a fully reproducible data-processing pipeline for images sourced from various museum collections. We annotate these images with carefully curated labels from biological databases and with manual annotations to create an AI-ready dataset of visual traits, contributing to the advancement of AI in biodiversity science. Second, our proposed downstream tasks offer fertile ground for novel computer vision research on a variety of challenges, such as long-tailed distributions, out-of-distribution generalization, learning with weak labels, explainable AI, and segmenting small objects. We benchmark several existing methods on the proposed tasks to expose future research opportunities in AI for biodiversity science problems involving visual traits.
