> [!TIP]
> Preface: While MCP is a hot topic, its most critical aspect, the decoupling of tool providers from application developers through standardized protocols, has been overlooked. This shift mirrors the frontend-backend separation in web development and represents a paradigm shift in AI Agent application development.
Using the development of the Agent TARS application as an example, this article details MCP's role in transforming development paradigms and expanding tool ecosystems.
| Term | Definition |
|---|---|
| AI Agent | In the context of LLMs, an AI Agent is an autonomous entity capable of understanding intent, planning decisions, and executing complex tasks. Unlike ChatGPT, it doesn't just advise "how to do" but actively "does it for you." If Copilot is a co-pilot, an Agent is the pilot. Mimicking human task execution, its core functionality revolves around three iterative steps: Perception, Planning, and Action. |
| Copilot | An AI-powered assistant integrated into software to enhance productivity by analyzing user behavior, input, and history to provide real-time suggestions or automate tasks. |
| MCP | Model Context Protocol is an open standard governing how applications provide context to LLMs. Think of it as a USB-C port for AI—enabling standardized connections between models and external data/tools. |
| Agent TARS | An open-source multimodal AI agent seamlessly integrating with real-world tools. |
| RESTful API | An architectural style for client-server interaction, based on design principles rather than strict standards. |
AI has evolved from dialogue-only Chatbots to decision-supporting Copilots and now autonomous Agents, demanding richer context and tools for task execution.
The lack of standardized context and tooling creates major challenges for Agent development.

> "All problems in computer science can be solved by another level of indirection." -- Butler Lampson
MCP decouples tools into a dedicated MCP Server layer, standardizing development and invocation. MCP Servers provide Agents with standardized access to context and tools.
Three examples showcasing MCP's role in AI Agent applications:
| Instruction | Demo | MCP Servers Used | Notes |
|---|---|---|---|
| Analyze a stock technically, then buy 3 shares at market price | Replay | Broker MCP, Filesystem MCP | Uses simulated trading account. |
| What are my machine's CPU, memory, and network speeds? | Replay | CLI MCP, Code Exec MCP | |
| Find top 5 upvoted products on ProductHunt | Replay | Browser MCP | |
MCP customization is not yet open to users; the third-party MCP Servers above were mounted manually for testing. More examples: https://agent-tars.com/showcase
Model Context Protocol is a standard protocol by Anthropic for LLM-to-external communication (data/tools), based on JSON-RPC 2.0. It acts as a USB-C interface for AI, standardizing context provision.
MCP supplies LLMs with three context types: Resources, Prompts, and Tools.
| | MCP | Function Call |
|---|---|---|
| Definition | Standard interface for model-device integration (Tools/Resources/Prompts). | Flat tool listing for external data access. MCP Tools enforce input/output protocols. |
| Protocol | JSON-RPC (bidirectional, discoverable, notifications). | JSON-Schema (static). |
| Invocation | Stdio/SSE/in-process calls. | In-process/language-native functions. |
| Use Case | Dynamic, complex interactions. | Single-tool, static executions. |
| Integration | Complex. | Simple. |
| Engineering | High maturity. | Low maturity. |
Early web development coupled UIs with backend logic (JSP/PHP), mirroring today's Agent-tool entanglement. AJAX/Node.js/RESTful APIs enabled separation; MCP now does the same for AI.
This layering lets Agent developers compose tools like building blocks.
The MCP Browser Tool exemplifies the implementation. To ensure out-of-the-box usability (avoiding Node.js/UV dependencies, per issue #64), we categorize tools into built-in MCP Servers (invoked in-process) and user-extended MCP Servers (invoked via Stdio/SSE).
Taking `mcp-server-browser` as an example: it is essentially an npm package with the following `package.json` configuration:
```json
{
  "name": "mcp-server-browser",
  "version": "0.0.1",
  "type": "module",
  "bin": {
    "mcp-server-browser": "dist/index.cjs"
  },
  "main": "dist/server.cjs",
  "module": "dist/server.js",
  "types": "dist/server.d.ts",
  "files": ["dist"],
  "scripts": {
    "build": "rm -rf dist && rslib build && shx chmod +x dist/*.{js,cjs}",
    "dev": "npx -y @modelcontextprotocol/inspector tsx src/index.ts"
  }
}
```
- `bin`: Stdio entry
- `main` / `module`: In-process function call entry

In practice, using the Inspector to develop and debug MCP Servers proves highly effective. Because Agents are decoupled from tools, developers can build and debug tools independently.
Simply run `npm run dev` to launch a Playground exposing all debuggable MCP Server features (Prompts, Resources, Tools, etc.).
```bash
$ npx -y @modelcontextprotocol/inspector tsx src/index.ts
Starting MCP inspector...
New SSE connection
Spawned stdio transport
Connected MCP client to backing server transport
Created web app transport
Set up MCP proxy
🔍 MCP Inspector is up and running at http://localhost:5173 🚀
```
Note: `console.log` doesn't work in the Inspector, because the Stdio transport reserves stdout for JSON-RPC messages; log to stderr (e.g. `console.error`) instead.
To enable in-process function calls for built-in MCP Servers, we export three shared methods in the entry file `src/server.ts`:

- `listTools`: Enumerates all available functions
- `callTool`: Invokes a specific function
- `close`: Cleans up when the server is no longer needed

```typescript
// src/server.ts
export const client: Pick<Client, 'callTool' | 'listTools' | 'close'> = {
  callTool,
  listTools,
  close,
};
```
For Stdio call support, simply import the module in `src/index.ts`:
```typescript
#!/usr/bin/env node
// src/index.ts
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import { client as mcpBrowserClient } from "./server.js";

const server = new Server(
  {
    name: "example-servers/puppeteer",
    version: "0.1.0",
  },
  {
    capabilities: {
      tools: {},
    },
  }
);

// listTools
server.setRequestHandler(ListToolsRequestSchema, mcpBrowserClient.listTools);

// callTool
server.setRequestHandler(CallToolRequestSchema, async (request) =>
  mcpBrowserClient.callTool(request.params)
);

async function runServer() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
}

runServer().catch(console.error);

process.stdin.on("close", () => {
  console.error("Browser MCP Server closed");
  server.close();
});
```
The MCP protocol requires JSON Schema to constrain tool inputs and outputs. Based on practical experience, we recommend defining schemas with Zod, then converting them to JSON Schema for MCP export.
```typescript
import { z } from 'zod';

const toolsMap = {
  browser_navigate: {
    description: 'Navigate to a URL',
    inputSchema: z.object({
      url: z.string(),
    }),
    handle: async (args: { url: string }) => {
      // Implementation details omitted
      const clickableElements = ['...'];
      return {
        content: [
          {
            type: 'text',
            text: `Navigated to ${args.url}\nclickable elements: ${clickableElements}`,
          },
        ],
        isError: false,
      };
    },
  },
  browser_scroll: {
    description: 'Scroll the page',
    inputSchema: z.object({
      amount: z
        .number()
        .describe('Pixels to scroll (positive for down, negative for up)'),
    }),
    handle: async (args: { amount: number }) => {
      // actualScroll / isAtBottom come from the real scrolling logic (omitted)
      const { actualScroll, isAtBottom } = await scrollPage(args.amount);
      return {
        content: [
          {
            type: 'text',
            text: `Scrolled ${actualScroll} pixels. ${
              isAtBottom
                ? 'Reached the bottom of the page.'
                : 'Did not reach the bottom of the page.'
            }`,
          },
        ],
        isError: false,
      };
    },
  },
  // more tools...
};

const callTool = async ({ name, arguments: toolArgs }) => {
  return toolsMap[name].handle(toolArgs);
};
```
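For reference, the Zod `inputSchema` of `browser_navigate` above corresponds to a JSON Schema like the following, which is what MCP clients receive from `listTools` (the exact output, e.g. whether `additionalProperties` is emitted, depends on the converter library you choose):

```json
{
  "type": "object",
  "properties": {
    "url": { "type": "string" }
  },
  "required": ["url"]
}
```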
> Pro Tip: Unlike OpenAPI's structured data returns, MCP responses are designed for LLM consumption. To better bridge models and tools, returned text and tool descriptions should be semantic, improving model comprehension and tool-invocation success rates. For example, `browser_scroll` should return the page scroll status after each execution (e.g., remaining pixels to the bottom, whether the bottom was reached), enabling the model to supply more precise parameters in subsequent calls.
After developing the MCP Server, it needs to be integrated into the Agent application. In principle, the Agent shouldn't need to concern itself with the specific details of tools, inputs, and outputs provided by MCP Servers.
MCP Servers configuration is divided into "Built-in Servers" and "User Extension Servers". Built-in Servers use in-process Function Calls to ensure out-of-the-box functionality for novice users, while Extension Servers provide advanced users with expanded Agent capabilities.
```js
{
  // Internal MCP Servers (in-process call)
  fileSystem: {
    name: 'fileSystem',
    localClient: mcpFsClient,
  },
  commands: {
    name: 'commands',
    localClient: mcpCommandClient,
  },
  browser: {
    name: 'browser',
    localClient: mcpBrowserClient,
  },
  // External MCP Servers (remote call)
  fetch: {
    command: 'uvx',
    args: ['mcp-server-fetch'],
  },
  longbridge: {
    command: 'longport-mcp',
    args: [],
    env: {}
  }
}
```
The core mission of the MCP Client is to integrate MCP Servers with different invocation methods (Stdio/SSE/Function Call). The Stdio and SSE implementations directly reuse the Official Examples. Here we focus on how Function Call support was implemented in the Client.
```diff
export type MCPServer<ServerNames extends string = string> = {
  name: ServerNames;
  status: 'activate' | 'error';
  description?: string;
  env?: Record<string, string>;
+ /** same-process call, same as function call */
+ localClient?: Pick<Client, 'callTool' | 'listTools' | 'close'>;
  /** Stdio server */
  command?: string;
  args?: string[];
};
```
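The routing idea behind this type can be sketched as follows. This is a hypothetical simplification (the real implementation lives in the open-sourced mcp-client package): a server entry carrying `localClient` is invoked in-process, while entries with `command` would go through a Stdio transport.

```typescript
// Sketch of how the MCP Client might route listTools across servers.
type Tool = { name: string; description?: string };

interface LocalClient {
  listTools: () => Promise<{ tools: Tool[] }>;
  callTool: (req: { name: string; arguments?: unknown }) => Promise<unknown>;
  close: () => Promise<void>;
}

interface MCPServerConfig {
  name: string;
  localClient?: LocalClient; // in-process function call
  command?: string;          // Stdio entry (not implemented in this sketch)
  args?: string[];
}

class MCPClient {
  constructor(private servers: MCPServerConfig[]) {}

  async listTools(): Promise<(Tool & { server: string })[]> {
    const all: (Tool & { server: string })[] = [];
    for (const server of this.servers) {
      if (server.localClient) {
        // In-process: no child process, no stdio serialization overhead
        const { tools } = await server.localClient.listTools();
        all.push(...tools.map((t) => ({ ...t, server: server.name })));
      }
      // else: spawn `server.command` and speak JSON-RPC over Stdio (omitted)
    }
    return all;
  }
}
```

The in-process branch is what makes built-in servers work out of the box: no runtime dependency on Node.js/UV subprocesses is required.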
The MCP Client invocation works as follows:
```typescript
import { client as mcpBrowserClient } from '@agent-infra/mcp-server-browser';

const client = new MCPClient([
  {
    name: 'browser',
    description: 'web browser tools',
    localClient: mcpBrowserClient,
  },
]);

const mcpTools = await client.listTools();

const response = await openai.chat.completions.create({
  model,
  messages,
  // Different model vendors need to convert to the corresponding tools data format.
  tools: convertToTools(mcpTools),
  tool_choice: 'auto',
});
```
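The `convertToTools` step mentioned in the comment above might look like this for OpenAI's function-calling format (a sketch; `McpTool` is a hypothetical minimal shape, and other model vendors require their own mapping):

```typescript
// Convert MCP tool descriptors into OpenAI chat.completions `tools` entries.
interface McpTool {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>; // JSON Schema from listTools
}

function convertToTools(mcpTools: McpTool[]) {
  return mcpTools.map((tool) => ({
    type: 'function' as const,
    function: {
      name: tool.name,
      description: tool.description ?? '',
      // MCP input schemas are already JSON Schema, so they map directly
      parameters: tool.inputSchema,
    },
  }));
}
```

Because MCP already constrains inputs with JSON Schema, the conversion is mostly a field renaming exercise.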
At this point, the entire MCP workflow has been fully implemented, covering all aspects from Server configuration, Client integration to Agent connectivity. More MCP details/code have been open-sourced on GitHub: Agent Integration, mcp-client, mcp-servers
The MCP ecosystem continues to grow, with more applications supporting MCP and open platforms providing MCP Servers. Services like Cloudflare, Composio, and Zapier use SSE to host MCP Servers (connecting to a single MCP endpoint grants access to multiple MCP Servers). For Stdio, the ideal deployment would run the MCP Servers and the Agent system inside the same Docker container.
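In the configuration format shown earlier, such a hosted SSE server would be declared with an endpoint URL instead of a `command` (the `url` field name and endpoint below are illustrative assumptions, not a specific platform's actual config):

```js
{
  // Remote SSE MCP Server: one endpoint, many tools
  hosted: {
    url: 'https://example.com/mcp/sse', // hypothetical endpoint
  },
}
```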