LLMs for Coding and Robotics Education

Peng Shu, Huaqin Zhao, Hanqi Jiang, Yiwei Li, Shaochen Xu, Yi Pan, Zihao Wu, Zhengliang Liu, Guoyu Lu, Le Guan, Gong Chen, Xianqiao Wang, Tianming Liu

TL;DR

This study investigates how large language models (LLMs) and multimodal LLMs can support robot coding education for young learners by evaluating their performance on traditional coding tasks and on robot-specific code generation, including block diagrams. The authors compare GPT-4V, Code Llama, and GitHub Copilot on LeetCode problems, FIRST Tech Challenge (FTC) code modules, FIRST LEGO League (FLL) block diagrams, and visual programming tasks, using both textual and visual inputs. GPT-4V outperforms the other models on most tasks, but all models struggle to generate block diagrams directly; a textual intermediary (pseudocode) can improve code generation for visual programming. The findings suggest strong potential for AI-assisted robot education, while also highlighting limitations in multimodal understanding of visual-diagram content and the need for careful integration into curricula and teaching practice. The work provides a benchmark and insights to guide future development of AI tools that support coding and robotics education for children.

Abstract

Large language models and multimodal large language models have recently revolutionized artificial intelligence, and an increasing number of regions are now embracing these advanced technologies. Within this context, robot coding education is garnering increasing attention. To teach young children how to code and compete in robot challenges, large language models are being used for robot code explanation, generation, and modification. In this paper, we highlight an important trend in robot coding education. We test several mainstream large language models on both traditional coding tasks and the more challenging task of robot code generation, which includes block diagrams. Our results show that GPT-4V outperforms other models in all of our tests but struggles with generating block diagram images.

Paper Structure

This paper contains 16 sections, 6 figures, and 1 table.

Figures (6)

  • Figure 1: One example of FTC code generation test results. (a) shows the original code and (b) the code generated by GPT-4V. Green backgrounds indicate correct parts.
  • Figure 2: The same example of FTC code generation test results. (a) shows the original code and (b) the code generated by GitHub Copilot. Green backgrounds indicate correct parts.
  • Figure 3: One example of FLL block diagram explanation test results. The left part shows the FLL block diagram, and the right part contains the explanation from GPT-4V.
  • Figure 4: The same example of FLL block diagram explanation test results. The left part shows the FLL block diagram, and the right part contains the explanation from Microsoft Copilot. Yellow backgrounds indicate inaccurate descriptions and red backgrounds indicate wrong steps.
  • Figure 5: One example of FLL block diagram text pseudocode generation results.
  • ...and 1 more figure