Conversational AI Multi-Agent Interoperability, Universal Open APIs for Agentic Natural Language Multimodal Communications

Diego Gosmar, Deborah A. Dahl, Emmett Coin

TL;DR

The paper addresses fragmentation in conversational AI by proposing OVON, a universal open API framework for agentic natural language multimodal communications. It introduces the Interoperable Conversation Envelope and manifest-based discovery, supported by state-diagram modeling to map inter-agent exchanges across human and AI participants. The main contributions are a technology-agnostic, loosely coupled architecture with open repositories and a Sandbox for experimentation, demonstrated via end-to-end use cases like Smart Errands and Smart Library. It also outlines security, ethics, and accountability enhancements as future work to ensure safe and reliable cross-platform AI collaboration.

Abstract

This paper analyses Conversational AI multi-agent interoperability frameworks and describes the novel architecture proposed by the Open Voice Interoperability initiative (Linux Foundation AI and DATA), also known briefly as OVON (Open Voice Network). The new approach is illustrated, along with the main components, delineating the key benefits and use cases for deploying standard multi-modal AI agency (or agentic AI) communications. Beginning with Universal APIs based on Natural Language, the framework establishes and enables interoperable interactions among diverse Conversational AI agents, including chatbots, voicebots, videobots, and human agents. Furthermore, a new Discovery specification framework is introduced, designed to efficiently look up agents providing specific services and to obtain accurate information about these services through a standard Manifest publication, accessible via an extended set of Natural Language-based APIs. The main purpose of this contribution is to significantly enhance the capabilities and scalability of AI interactions across various platforms. The novel architecture for interoperable Conversational AI assistants is designed to generalize, being replicable and accessible via open repositories.
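
The discovery idea in the abstract (agents publish a Manifest, and a demanding agent looks up agents by the service they offer) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Manifest` fields, the `make_envelope` and `find_agents` helpers, and the example URLs are hypothetical and are not taken from the OVON specifications, which define their own envelope and manifest schemas.

```python
from dataclasses import dataclass, field


@dataclass
class Manifest:
    """Hypothetical agent manifest; field names are illustrative only."""
    agent_name: str
    service_url: str
    capabilities: set[str] = field(default_factory=set)


def make_envelope(sender: str, conversation_id: str, event_type: str, text: str) -> dict:
    """Wrap a natural-language message in a simple envelope-like dict.

    The layout is an assumption for this sketch, not the OVON envelope schema.
    """
    return {
        "conversation": {"id": conversation_id},
        "sender": sender,
        "events": [{"type": event_type, "text": text}],
    }


def find_agents(registry: list[Manifest], capability: str) -> list[Manifest]:
    """Return manifests advertising the capability (case-insensitive match)."""
    wanted = capability.lower()
    return [m for m in registry if wanted in {c.lower() for c in m.capabilities}]


# Example registry with two hypothetical agents.
registry = [
    Manifest("library-bot", "https://example.org/library", {"book search", "loans"}),
    Manifest("errands-bot", "https://example.org/errands", {"grocery orders"}),
]

matches = find_agents(registry, "Book Search")
envelope = make_envelope("demanding-agent", "conv-1", "utterance",
                         "Do you have a copy of Moby-Dick?")
for manifest in matches:
    print(f"Would send to {manifest.service_url}: {envelope['events'][0]['text']}")
```

The sketch keeps agents loosely coupled in the spirit of the abstract: the demanding agent only needs the published manifest and a natural-language message, not any knowledge of how the serving agent is implemented.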

Paper Structure (9 sections, 8 figures, 1 table)

This paper contains 9 sections, 8 figures, and 1 table.

Figures (8)

  • Figure 1: States and events related to a serving agent when it is successful in finding a response
  • Figure 2: States and events related to a serving agent when it is unsuccessful in finding a response
  • Figure 3: Combined states and events related to a serving agent
  • Figure 4: Different states and transitions related to a demanding agent
  • Figure 5: Demanding agent looking for the Manifest details of a target agent
  • ...and 3 more figures