Image captioning in different languages

Emiel van Miltenburg

TL;DR

This short position paper provides a manually curated list of non-English image captioning datasets (as of May 2024) and observes the dearth of datasets in different languages.

Abstract

This short position paper provides a manually curated list of non-English image captioning datasets (as of May 2024). The list reveals a dearth of datasets in different languages: only 23 different languages are represented. With the addition of the Crossmodal-3600 dataset (Thapliyal et al., 2022, 36 languages), this number increases somewhat, but it remains small compared to the roughly 500 institutional languages in existence. The paper closes with some open questions for the field of Vision & Language.

Paper Structure (7 sections, 2 tables)

This paper contains 7 sections, 2 tables.