MCP Bridge: A Lightweight, LLM-Agnostic RESTful Proxy for Model Context Protocol Servers
Arash Ahmadi, Sarah Sharif, Yaser M. Banad
TL;DR
MCP Bridge is a lightweight, LLM-agnostic RESTful proxy that connects to multiple MCP servers and exposes their capabilities through a unified API, addressing the limitations of STDIO-based MCP deployments in constrained environments. It combines a modular proxy design with a risk-based execution model and Docker isolation for high-risk tools, while preserving backward compatibility with standard MCP clients. To enable open-weight deployment, the authors fine-tune Qwen3-4B and Qwen3-8B using four RL methods on Toucan-1.5M data, achieving competitive MCP tool-call reliability on MCPToolBench++ and outperforming the GPT-OSS-120B baseline. The work demonstrates both the deployment viability of MCP across diverse platforms and the value of RL-aligned tool-use policies for model clients, supporting broader, safer AI agent capabilities in real-world environments.
Abstract
Large Language Models (LLMs) are increasingly augmented with external tools through standardized interfaces like the Model Context Protocol (MCP). However, current MCP implementations face critical limitations: they typically require local process execution through STDIO transports, making them impractical for resource-constrained environments like mobile devices, web browsers, and edge computing. We present MCP Bridge, a lightweight RESTful proxy that connects to multiple MCP servers and exposes their capabilities through a unified API. Unlike existing solutions, MCP Bridge is fully LLM-agnostic, supporting any backend regardless of vendor. The system implements a risk-based execution model with three security levels (standard execution, confirmation workflow, and Docker isolation) while maintaining backward compatibility with standard MCP clients. Reliable execution within this framework, however, requires models that strictly adhere to protocol schemas. To this end, we also fine-tuned the Qwen3-4B and Qwen3-8B models on the Agent-Ark/Toucan-1.5M dataset using four Reinforcement Learning techniques: Group Relative Policy Optimization (GRPO), Dr. GRPO, Beta Normalization Policy Optimization (BNPO), and Decoupled Alignment Policy Optimization (DAPO). Evaluated on the MCPToolBench++ benchmark, our optimized model achieves an F1 score of 73.0%, outperforming GPT-OSS-120B (62.17%) and remaining competitive with baselines of 70B+ parameters. Evaluation demonstrates that MCP Bridge successfully addresses the constraints of direct MCP connections while providing enhanced security controls and cross-platform compatibility, enabling sophisticated LLM-powered applications in previously inaccessible environments.
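The three-level risk model described in the abstract can be illustrated with a minimal Python sketch. This is not the MCP Bridge implementation: the names `RiskLevel`, `Decision`, and `decide`, the numeric level values, and the action strings are all hypothetical, chosen only to show how a proxy might route a tool call according to its configured risk level.

```python
from dataclasses import dataclass
from enum import IntEnum


class RiskLevel(IntEnum):
    # Numeric values are an assumption made for this sketch.
    STANDARD = 1      # run the tool call directly
    CONFIRMATION = 2  # hold the call until the client confirms it
    DOCKER = 3        # run the tool call inside an isolated container


@dataclass(frozen=True)
class Decision:
    action: str  # "execute", "await_confirmation", or "execute_in_docker"
    tool: str


def decide(tool: str, risk: RiskLevel, confirmed: bool = False) -> Decision:
    """Map a tool call and its risk level to a (hypothetical) proxy action."""
    if risk == RiskLevel.STANDARD:
        return Decision("execute", tool)
    if risk == RiskLevel.CONFIRMATION:
        # Only proceed once the caller has explicitly confirmed the call.
        return Decision("execute" if confirmed else "await_confirmation", tool)
    if risk == RiskLevel.DOCKER:
        return Decision("execute_in_docker", tool)
    raise ValueError(f"unknown risk level: {risk!r}")


if __name__ == "__main__":
    # Hypothetical tool names, used only for illustration.
    print(decide("read_file", RiskLevel.STANDARD))
    print(decide("delete_file", RiskLevel.CONFIRMATION))
    print(decide("delete_file", RiskLevel.CONFIRMATION, confirmed=True))
    print(decide("run_shell", RiskLevel.DOCKER))
```

Keeping the decision separate from the execution step is a design choice of this sketch: it lets the routing policy be tested without starting any MCP server or container.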
