FaGeL: Fabric LLMs Agent empowered Embodied Intelligence Evolution with Autonomous Human-Machine Collaboration

Jia Liu, Min Chen

TL;DR

FaGeL fuses smart-fabric sensing with LLM reasoning to create a non-intrusive embodied agent that autonomously explores human needs and evolves through implicit feedback. The approach introduces DualCUT, an extension of Contrastive Unlikelihood Training, to achieve token-level AI alignment by leveraging both positive and negative textual feedback, with losses defined as $L_{CUT}=L_1+L_2$ and dynamic token scales. A token-level saliency visualization accompanies the evolution, enhancing interpretability of LLM fine-tuning. Empirical validation on Overcooked-AI shows FaGeL achieving an 11.3% improvement in a limited setting and faster adaptation with increased observation time, demonstrating practical potential for long-term human–machine collaboration. The work advances embodied intelligence by integrating fabric computing, implicit feedback, and fine-grained alignment toward scalable, autonomous human–AI collaboration in open physical environments.
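The two-term loss $L_{CUT}=L_1+L_2$ can be illustrated with a minimal sketch. This summary does not define $L_1$ and $L_2$, so the code below assumes the usual likelihood/unlikelihood split: $L_1$ is the negative log-likelihood on tokens that feedback did not flag, and $L_2$ is an unlikelihood term on flagged tokens, weighted by a per-token scale. The function name `token_level_loss`, the `flagged` mask, and the `scales` argument are hypothetical and are not taken from the paper.

```python
import math


def token_level_loss(token_probs, flagged, scales=None):
    """Return (l1, l2, total) for one generated sequence.

    token_probs: probability the model assigned to each generated token,
                 each strictly between 0 and 1.
    flagged:     True where feedback marks the token as inappropriate.
    scales:      per-token weights for the unlikelihood term
                 (an assumption; defaults to 1.0 for every token).
    """
    if not token_probs or len(token_probs) != len(flagged):
        raise ValueError("token_probs and flagged must be non-empty and equal in length")
    if scales is None:
        scales = [1.0] * len(token_probs)
    n = len(token_probs)
    # L1: negative log-likelihood on tokens that were not flagged.
    l1 = -sum(math.log(p) for p, f in zip(token_probs, flagged) if not f) / n
    # L2: unlikelihood on flagged tokens, which pushes their probability down.
    l2 = -sum(
        s * math.log(1.0 - p)
        for p, f, s in zip(token_probs, flagged, scales)
        if f
    ) / n
    return l1, l2, l1 + l2


# Example: the second of three tokens is flagged by negative feedback.
l1, l2, total = token_level_loss([0.9, 0.5, 0.8], [False, True, False])
```

Raising a flagged token's scale increases its share of the total loss, which is one plausible reading of "dynamic token scales"; the paper's actual scaling rule may differ.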

Abstract

Recent advancements in Large Language Models (LLMs) have enhanced the reasoning capabilities of embodied agents, driving progress toward AGI-powered robotics. While LLMs have been applied to tasks like semantic reasoning and task generalization, their potential in open physical space exploration remains underexplored. This paper introduces FaGeL (Fabric aGent empowered by embodied intelligence with LLMs), an embodied agent integrating smart fabric technology for seamless, non-intrusive human-agent interaction. FaGeL autonomously generates tasks using multimodal data from wearable and ambient sensors, refining its behavior based on implicit human feedback in generated text, without explicit ratings or preferences. We also introduce a token-level saliency map to visualize LLM fine-tuning, enhancing the interpretability of token-level alignment. The system leverages dual feedback mechanisms to improve token-level alignment and addresses challenges in non-intrusive human-machine interaction and cognition evolution. Our contributions include FaGeL's development, the DualCUT algorithm for AI alignment, and experimental validation in cooperative tasks, demonstrating FaGeL's ability to adapt and evolve autonomously through implicit feedback. In the future, we plan to explore FaGeL's scalability in dynamic environments and its integration with other AI systems to develop AGI agents that adapt seamlessly to diverse human needs.

Paper Structure

This paper contains 20 sections, 9 equations, 7 figures, and 1 table.

Figures (7)

  • Figure 1: The Fabric Agent Empowered by Embodied Intelligence with LLM (FaGeL) integrates (1) a sensing module empowered by wearable intelligence, equipped with a natural language describer; (2) an inference module composed of task mining, AI alignment, and embodied action decomposition; (3) an interaction module consisting of task execution and user feedback perception; (4) an embodied intelligence evolution module based on token-level AI alignment.
  • Figure 2: Architecture and implementation of the intelligence evolution module: (a) functional components of the FaGeL evolution algorithm; (b) comparison of the ProAgent algorithm and the FaGeL evolution algorithm; (c) illustration of FaGeL's advantages; (d) demonstration of the FaGeL evolution algorithm outperforming ProAgent on the Overcooked-AI platform.
  • Figure 3: Task space exploration across various daily life scenarios, visualized using t-distributed Stochastic Neighbor Embedding (t-SNE). (a) The left shows a semantic visualization of 1000 collaborative tasks generated by FaGeL's task mining algorithm. The right provides specific descriptions of three tasks located near each other in the task space, highlighting that semantic similarity is represented by spatial proximity. (b) Each circle represents a task output: its text is encoded by GPT-4 and projected onto a 2D plane with the t-SNE algorithm as a “semantic point” (s-point). This visualization also indicates that tasks generated by multiple agents within the same living environment exhibit certain semantic clustering characteristics.
  • Figure 4: Average timesteps per completion with an AI partner as observation time increases. A 'completion' represents delivering a fully prepared dish to the customer. The figure shows that within a single game session (lasting 400 timesteps), increasing observation time enables the agent to collaborate more effectively with the AI partner, resulting in shorter completion times.
  • Figure 5: Real-time scores during one episode in the Cramped Room scenario. The plot illustrates the real-time score accumulation for the ProAgent baseline and the FaGeL-evolution variants (with different numbers of rounds of evolution) during one complete episode.
  • ...and 2 more figures
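
The metric in Figure 4 can be illustrated with a small arithmetic sketch. It assumes that "average timesteps per completion" is the session length divided by the number of dishes delivered in that session; the function name and the sample delivery counts are hypothetical and are not results from the paper.

```python
def avg_timesteps_per_completion(session_length, completions):
    """Average number of timesteps spent per delivered dish in one session."""
    if completions <= 0:
        raise ValueError("at least one completion is required")
    return session_length / completions


# A 400-timestep session with 5 deliveries (illustrative numbers only).
print(avg_timesteps_per_completion(400, 5))  # 80.0
```

Under this reading, more deliveries within the fixed 400-timestep session give a lower value, which matches the caption's description of shorter completion times as collaboration improves.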