ICDAR2017 Competition on Reading Chinese Text in the Wild (RCTW-17)

Baoguang Shi, Cong Yao, Minghui Liao, Mingkun Yang, Pei Xu, Linyan Cui, Serge Belongie, Shijian Lu, Xiang Bai

TL;DR

This paper presents the ICDAR 2017 Competition on Reading Chinese Text in the Wild (RCTW-17) and the CTW-12k dataset, a large-scale Chinese scene-text corpus annotated with four-point polygons and UTF-8 transcripts. It defines two tasks—text localization and end-to-end recognition—and introduces polygon-based IoU and edit-distance–based evaluation protocols (AED/NED) to assess performance. The study reports submissions from 19 teams, analyzes top-performing approaches, and discusses common challenges such as long-text detection, perspective distortion, and confusable characters. The work establishes a foundation for Chinese text reading research in natural images and outlines plans for ongoing online evaluation and dataset refinement.

Abstract

Chinese is the most widely used language in the world. Algorithms that read Chinese text in natural images facilitate applications of various kinds. Despite the large potential value, past datasets and competitions have primarily focused on English, which bears very different characteristics from Chinese. This report introduces RCTW, a new competition that focuses on Chinese text reading. The competition features a large-scale dataset with 12,263 annotated images. Two tasks, namely text localization and end-to-end recognition, are set up. The competition took place from January 20 to May 31, 2017. In total, 23 valid submissions were received from 19 teams. This report includes the dataset description, task definitions, evaluation protocols, and a summary and analysis of the results. Through this competition, we call for more future research on the Chinese text reading problem. The official website for the competition is http://rctw.vlrlab.net

Paper Structure

This paper contains 12 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Example images and annotations of the CTW-12k dataset.
  • Figure 2: Intersection-over-Union (IoU) of two polygons. The red and green polygons are the ground-truth and detection polygons, respectively. The yellow area is their intersection. The union area is defined as the sum of the two polygon areas minus their intersection area. IoU is the ratio of the intersection area to the union area.
  • Figure 3: Summary of PR curves of the top-10 submissions. Each curve represents a team. Viewed in color.
  • Figure 4: Example detections from the submissions. Green polygons are correctly detected. Yellow ones are false detections.
  • Figure 5: Examples of recognition. Red characters are recognized wrongly.
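
The polygon IoU described in the Figure 2 caption can be sketched in a few lines of Python. This is a minimal illustration, not the competition's official evaluation code. It assumes both polygons are convex (as typical four-point text boxes are) so that Sutherland-Hodgman clipping can be used to compute the intersection. The function names `polygon_area`, `clip_convex`, and `polygon_iou` are hypothetical and chosen for this example only.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _signed_area(poly: List[Point]) -> float:
    """Signed shoelace area; positive for counter-clockwise vertex order."""
    n = len(poly)
    return 0.5 * sum(
        poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
        for i in range(n)
    )


def polygon_area(poly: List[Point]) -> float:
    """Absolute polygon area; degenerate inputs (< 3 points) have area 0."""
    return abs(_signed_area(poly)) if len(poly) >= 3 else 0.0


def _ccw(poly: List[Point]) -> List[Point]:
    """Return the polygon with counter-clockwise vertex order."""
    return list(poly) if _signed_area(poly) >= 0 else list(reversed(poly))


def clip_convex(subject: List[Point], clip: List[Point]) -> List[Point]:
    """Sutherland-Hodgman clipping of `subject` by a convex, CCW `clip`."""
    output = list(subject)
    for i in range(len(clip)):
        if not output:
            break  # nothing left to clip: the polygons do not overlap
        a, b = clip[i], clip[(i + 1) % len(clip)]

        def side(p: Point) -> float:
            # > 0 if p lies to the left of edge a->b (inside for CCW clip)
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        inp, output = output, []
        for j in range(len(inp)):
            p, q = inp[j], inp[(j + 1) % len(inp)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)  # keep vertices on the inner side
            if (sp > 0 > sq) or (sp < 0 < sq):
                t = sp / (sp - sq)  # edge p->q crosses the clip line here
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def polygon_iou(gt: List[Point], det: List[Point]) -> float:
    """IoU = intersection / (area(gt) + area(det) - intersection)."""
    gt, det = _ccw(gt), _ccw(det)
    inter = polygon_area(clip_convex(gt, det))
    union = polygon_area(gt) + polygon_area(det) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    gt = [(0, 0), (2, 0), (2, 2), (0, 2)]
    det = [(1, 1), (3, 1), (3, 3), (1, 3)]
    print(polygon_iou(gt, det))  # intersection 1, union 7
```

For the two overlapping 2×2 squares in the usage example, the intersection area is 1 and the union area is 4 + 4 − 1 = 7, giving an IoU of 1/7. For non-convex polygons, a general-purpose polygon clipping library would be needed instead of this clipping routine.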