Pretrained Embeddings as a Behavior Specification Mechanism
Parv Kapoor, Abigail Hammer, Ashish Kapoor, Karen Leung, Eunsuk Kang
TL;DR
This work addresses the challenge of formally specifying the behavior of AI-enabled systems that rely on perception by treating embeddings as first-class objects in a specification language. It proposes Embedding Temporal Logic (ETL), in which properties are defined via distances between target and observed embeddings, and integrates pretrained vision models and world models to enable planning with embedding-based specifications. The paper defines ETL syntax, semantics, and quantitative satisfaction, and demonstrates through examples and preliminary experiments in navigation and manipulation that ETL-guided planning can steer systems toward desirable behaviors. The findings suggest that embedding-based specifications broaden the range of properties that can be expressed and checked for AI systems. The paper also discusses practical considerations, namely the choice of distance metric and how target embeddings are specified, and outlines future directions in monitoring, verification, and explainability.
Abstract
We propose an approach to formally specifying the behavioral properties of systems that rely on a perception model for interactions with the physical world. The key idea is to introduce embeddings -- mathematical representations of a real-world concept -- as a first-class construct in a specification language, where properties are expressed in terms of distances between a pair of ideal and observed embeddings. To realize this approach, we propose a new type of temporal logic called Embedding Temporal Logic (ETL), and describe how it can be used to express a wider range of properties about AI-enabled systems than previously possible. We demonstrate the applicability of ETL through a preliminary evaluation involving planning tasks in robots that are driven by foundation models; the results are promising, showing that embedding-based specifications can be used to steer a system towards desirable behaviors.
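The idea of expressing properties as distances between an ideal and an observed embedding can be illustrated with a small sketch. This is not the paper's definition of ETL: the cosine distance, the threshold-based predicate, the min/max quantitative semantics borrowed from signal temporal logic, and all function names (`cosine_distance`, `predicate_margin`, `eventually`, `always`) are assumptions made here for illustration only. In practice the embeddings would come from a pretrained perception model; short hand-written vectors stand in for them below.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Cosine distance 1 - cos(a, b); an assumed choice of metric."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def predicate_margin(observed: Vector, target: Vector, threshold: float) -> float:
    """Signed margin of the predicate dist(observed, target) <= threshold.

    Positive means the observation is close enough to the target.
    """
    return threshold - cosine_distance(observed, target)


def eventually(trace: Sequence[Vector], target: Vector, threshold: float) -> float:
    """Hypothetical 'eventually' score: best margin over the trace."""
    return max(predicate_margin(z, target, threshold) for z in trace)


def always(trace: Sequence[Vector], target: Vector, threshold: float) -> float:
    """Hypothetical 'always' score: worst margin over the trace."""
    return min(predicate_margin(z, target, threshold) for z in trace)


# Toy stand-ins for embeddings of a target concept and of three observations.
target_embedding = [1.0, 0.0]
observed_trace = [[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]

reach_score = eventually(observed_trace, target_embedding, threshold=0.1)
stay_score = always(observed_trace, target_embedding, threshold=0.1)
print(f"eventually: {reach_score:.3f}, always: {stay_score:.3f}")
```

Under these assumptions, a positive score means the property holds on the trace and a negative score means it is violated, so a planner could rank candidate trajectories by score. Here the trace ends at the target, so the "eventually" score is positive, while the "always" score is negative because the first observation is far from the target.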
