Natural Language Instructions for Intuitive Human Interaction with Robotic Assistants in Field Construction Work
Somin Park, Xi Wang, Carol C. Menassa, Vineet R. Kamat, Joyce Y. Chai
TL;DR
This work addresses intuitive human-robot collaboration in field construction through natural language instructions grounded in building component information. It proposes a three-module framework (NLU, IM, RC) and validates it with a drywall installation case study, backed by a fine-grained NL instruction dataset. Results show high instruction-level accuracy, with BERT-based models achieving near-perfect performance given sufficient data, demonstrating the viability of spoken language interfaces for construction robots. The study contributes a practical architecture, a dedicated dataset, and insights into extending NL-driven HRC to other pick-and-place tasks in construction.
Abstract
The introduction of robots is widely considered to have significant potential to alleviate the worker shortages and stagnant productivity that afflict the construction industry. However, it is challenging to use fully automated robots in complex and unstructured construction sites. Human-Robot Collaboration (HRC) has shown promise in combining human workers' flexibility and robot assistants' physical abilities to jointly address the uncertainties inherent in construction work. When introducing HRC in construction, it is critical to recognize the importance of teamwork and supervision in field construction and to establish a natural and intuitive communication system between human workers and robotic assistants. Natural language-based interaction can enable intuitive and familiar communication with robots for human workers who are non-experts in robot programming. However, limited research has been conducted on this topic in construction. This paper proposes a framework that allows human workers to interact with construction robots through natural language instructions. The proposed method consists of three modules: Natural Language Understanding (NLU), Information Mapping (IM), and Robot Control (RC). In the NLU module, a language model takes a natural language instruction as input and predicts a tag for each word. The IM module combines the NLU output with building component information to generate the final instructional output that a robot needs to acknowledge and perform the construction task. A case study on drywall installation is conducted to evaluate the proposed approach. The results highlight the potential of natural language-based interaction to replicate, within human-robot teams, the communication that occurs between human workers.
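As a rough illustration of how the three modules described above could hand data to one another, the following minimal Python sketch may help. It is not the authors' implementation: a few hand-written rules stand in for the trained language model, and the tag names (ACTION, TARGET, DESTINATION), the component IDs, the `Component` record, and all function names are hypothetical choices made only for this example.

```python
from dataclasses import dataclass


@dataclass
class Component:
    # Hypothetical stand-in for one building component record.
    component_id: str
    kind: str
    position_m: tuple[float, float, float]


# Hypothetical building component information, keyed by component ID.
COMPONENTS = {
    "p12": Component("p12", "drywall panel", (0.0, 1.2, 0.0)),
    "s03": Component("s03", "stud", (2.4, 0.0, 0.0)),
}

# Hypothetical action vocabulary for the rule-based tagger.
ACTION_WORDS = {"install", "place", "pick"}


def tag_words(instruction: str) -> list[tuple[str, str]]:
    """NLU stand-in: one tag per word, using rules instead of a trained model."""
    tagged = []
    seen_target = False
    for word in instruction.lower().split():
        token = word.strip(".,")
        if token in ACTION_WORDS:
            tag = "ACTION"
        elif token in COMPONENTS:
            # Assumption: the first component mentioned is the target,
            # any later one is the destination.
            tag = "DESTINATION" if seen_target else "TARGET"
            seen_target = True
        else:
            tag = "O"  # word fills no slot
        tagged.append((token, tag))
    return tagged


def map_information(tagged: list[tuple[str, str]]) -> dict:
    """IM stand-in: resolve tagged component IDs against the component table."""
    command = {}
    for token, tag in tagged:
        if tag == "ACTION":
            command["action"] = token
        elif tag in ("TARGET", "DESTINATION"):
            command[tag.lower()] = COMPONENTS[token]
    return command


def robot_control(command: dict) -> str:
    """RC stand-in: build the request a robot controller might receive."""
    return (
        f"{command['action']} {command['target'].component_id} "
        f"at {command['destination'].position_m}"
    )


tagged = tag_words("Install panel p12 on stud s03.")
command = map_information(tagged)
print(robot_control(command))
```

In this sketch the IM step is the only place where language output meets building data, so a trained tagger could replace `tag_words` without touching the other two functions.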
