AS400-DET: Detection using Deep Learning Model for IBM i (AS/400)
Thanh Tran, Son T. Luu, Quan Bui, Shoshin Nomura
TL;DR
This work tackles automatic GUI component detection on IBM i (AS/400) screens to support automated testing of legacy enterprise interfaces. It introduces AS400-DET, a human-annotated dataset of 1,050 screen images with bounding boxes for components such as textlabel, textbox, option, table, instruction, keyboard, and commandline, including 381 Japanese-language screenshots, and benchmarks several state-of-the-art detectors. Among the evaluated models, RTDETR-X achieves the highest accuracy across metrics at a practical inference speed (~0.63 s per image), while error analysis identifies a need for better context-aware and position-aware understanding of on-screen text. The study provides a concrete dataset and a robust detection framework that can be deployed for real-time GUI testing on IBM i systems, and it points to future work leveraging vision-language models to improve contextual interpretation of screen content.
Abstract
This paper proposes a method for automatic GUI component detection for the IBM i system (formerly and still more commonly known as AS/400). We introduce a human-annotated dataset consisting of 1,050 system screen images, of which 381 are screenshots of IBM i system screens in Japanese. Each image contains multiple components, including text labels, text boxes, options, tables, instructions, keyboards, and command lines. We then develop a detection system based on state-of-the-art deep learning models and evaluate different approaches on our dataset. The experimental results demonstrate the effectiveness of our dataset for building a system that detects components on GUI screens. By automatically detecting GUI components on the screen, AS400-DET has the potential to support automated testing of systems that are operated through GUI screens.
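As an illustration only, one annotated component of the kind described above could be represented as a labeled bounding box, and a predicted box compared against it by intersection over union (IoU), the overlap measure commonly used in object detection. The field names, the pixel-corner box format, the exact label strings, and the choice of IoU are assumptions made for this sketch and are not taken from the dataset or the evaluation protocol.

```python
from dataclasses import dataclass

# Component classes as named in the text. Whether the released annotation
# files use exactly these label strings is an assumption.
COMPONENT_CLASSES = (
    "textlabel", "textbox", "option", "table",
    "instruction", "keyboard", "commandline",
)


@dataclass(frozen=True)
class Component:
    """One annotated GUI component (hypothetical schema): a class label
    plus an axis-aligned box given by its pixel corner coordinates."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def __post_init__(self):
        if self.label not in COMPONENT_CLASSES:
            raise ValueError(f"unknown component class: {self.label!r}")
        if self.x_max <= self.x_min or self.y_max <= self.y_min:
            raise ValueError("box must have positive width and height")

    @property
    def area(self) -> float:
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


def iou(a: Component, b: Component) -> float:
    """Intersection over union of two boxes, in the range [0, 1]."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    return inter / (a.area + b.area - inter)


if __name__ == "__main__":
    truth = Component("textbox", 0, 0, 10, 10)
    pred = Component("textbox", 5, 0, 15, 10)
    print(f"IoU = {iou(truth, pred):.3f}")
```

In a setup like this, a prediction would typically count as correct when its label matches the annotation and the IoU exceeds a chosen threshold; the threshold itself is a design choice not specified here.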
