mcp/README.md
Google AI Edge Gallery leverages on-device machine learning models to deliver low-latency, privacy-preserving inference. However, standalone on-device models inherently lack access to real-time data, web services, and dynamic action execution. To solve this limitation, Google AI Edge Gallery integrates the Model Context Protocol (MCP), an open standard establishing secure, universal communication between AI models and external systems. By adopting this standardized client-server architecture, the app decouples its on-device models (the client) from external tools and data sources (the servers), creating a single unified interface for dynamic context retrieval and tool execution.
[!IMPORTANT] MCP integration is currently experimental.
In this section, we will walk through adding one of the official example MCP servers, fetch, to Google AI Edge Gallery.
Most open-source MCP servers support only the stdio transport (learn more about MCP transport types), on the assumption that the MCP server and the LLM client (e.g., Gemini CLI, Claude Code, etc.) run on the same local machine. This does not work natively for a mobile application like Google AI Edge Gallery, which has to reach the server over the network.
To make the server accessible to Google AI Edge Gallery over the network, it needs to run in StreamableHTTP mode. Since the fetch example only supports stdio out of the box, we can use an adapter tool called supergateway to convert the stdio transport to StreamableHTTP without rewriting the server code.
Make sure Python and Node.js are installed, then run the following commands in your terminal to set up and launch the server:
# Install the `fetch` MCP server.
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install mcp-server-fetch
# Start Supergateway.
$ npx -y supergateway --stdio 'mcp-server-fetch' --outputTransport streamableHttp
Now, the server is listening at http://localhost:8000/mcp.
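As a quick sanity check before moving on, the endpoint can be probed with an MCP `initialize` request. The following is a minimal sketch using only the Python standard library, not part of the official setup. The function names, the client name `readme-smoke-test`, and the protocol version string `2025-03-26` are assumptions chosen for illustration, and the URL assumes the default address shown above.

```python
import json
import urllib.request


def build_initialize_request(url: str) -> urllib.request.Request:
    """Build a JSON-RPC `initialize` request for a Streamable HTTP MCP endpoint."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumed protocol revision; adjust to what your server supports.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Hypothetical client identity used only for this check.
            "clientInfo": {"name": "readme-smoke-test", "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients accept both plain JSON and SSE replies.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def probe_server(url: str = "http://localhost:8000/mcp") -> tuple[int, str]:
    """Send the request and return (HTTP status, response Content-Type)."""
    request = build_initialize_request(url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, response.headers.get("Content-Type", "")
```

Calling `probe_server()` while Supergateway is running is expected to return a 200 status. A connection error usually means the gateway is not running or is listening on a different port.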
Google AI Edge Gallery needs a publicly routable URL to reach the local server. If your host machine is already reachable at a public HTTPS address, you can skip this step.
Otherwise, you can expose the local port using a free tool like Cloudflare Quick Tunnels. First, install the cloudflared command-line tool on your machine, then run the following command in a separate terminal window:
# Ensure the target port matches the Supergateway service started in Step 1.
$ cloudflared tunnel --url http://localhost:8000
The command will output a unique public HTTPS URL (e.g., https://<random-string>.trycloudflare.com). When entering this address in the app, be sure to append the /mcp endpoint path.
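Forgetting the `/mcp` suffix is an easy mistake when copying the tunnel URL. As an illustration only, the small helper below shows one way to normalize a base URL so that it always ends in the endpoint path. The function name `with_mcp_path` and the hostname `example.trycloudflare.com` are placeholders, not values produced by `cloudflared`.

```python
from urllib.parse import urlsplit, urlunsplit


def with_mcp_path(base_url: str, endpoint: str = "/mcp") -> str:
    """Return base_url with the MCP endpoint path appended exactly once."""
    parts = urlsplit(base_url.strip())
    # Drop any trailing slash so "/mcp/" and "/" are handled uniformly.
    path = parts.path.rstrip("/")
    if not path.endswith(endpoint):
        path += endpoint
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Placeholder hostname; substitute the URL printed by cloudflared.
public_url = with_mcp_path("https://example.trycloudflare.com/")
```

The resulting `public_url` is the value to enter in the app.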
Open Agent Chat in the app, select a model (we recommend Gemma-4-E4B for better model quality), and navigate to the Manage MCP servers screen by clicking the MCP button below the input text area.
Click Add MCP Server and enter your server URL. Make sure to append the /mcp endpoint path:
Once connected, the app automatically detects the tools provided by the MCP server. You can also toggle individual servers and tools on or off.
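In the MCP specification, tool discovery is done with a `tools/list` request, whose result carries a `tools` array. How the app implements detection internally is not shown here, so the sketch below only illustrates the shape of such a response and how tool names can be read from it. The sample payload is hand-written and abridged for illustration, and `list_tool_names` is a hypothetical helper.

```python
import json


def list_tool_names(response_body: str) -> list[str]:
    """Extract tool names from a JSON-RPC `tools/list` response body."""
    message = json.loads(response_body)
    tools = message.get("result", {}).get("tools", [])
    return [tool["name"] for tool in tools]


# Hand-written, abridged example of what a `tools/list` reply may look like
# for a server exposing a single `fetch` tool (not captured from a real run).
SAMPLE_RESPONSE = json.dumps(
    {
        "jsonrpc": "2.0",
        "id": 2,
        "result": {
            "tools": [
                {
                    "name": "fetch",
                    "description": "Illustrative description only.",
                    "inputSchema": {
                        "type": "object",
                        "properties": {"url": {"type": "string"}},
                        "required": ["url"],
                    },
                }
            ]
        },
    }
)

tool_names = list_tool_names(SAMPLE_RESPONSE)
```

Each entry's `name` is what a client would present as a toggleable tool.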
[!WARNING] Please check out the official documentation of the fetch MCP server and be aware of its limitations.
Unlike local development setups, cloud-hosted MCP servers are fully managed, run directly on external cloud infrastructure, and operate in StreamableHTTP mode by default. Because these services are exposed publicly, they require explicit authorization to control access.
Google AI Edge Gallery supports remote server authentication by letting you add custom keys and credentials directly to the HTTP request headers. (Support for the full OAuth flow is in progress.)
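To make header-based authentication concrete, here is a rough sketch of what an MCP request carrying a custom credential header might look like on the wire. It is not the app's implementation. The helper name `build_mcp_request`, the URL `https://example.com/mcp`, and the key value `YOUR_API_KEY` are placeholders; `X-Goog-Api-Key` is the header name used by the Maps example in this section.

```python
import json
import urllib.request


def build_mcp_request(
    url: str,
    method: str,
    request_id: int,
    extra_headers: dict[str, str],
) -> urllib.request.Request:
    """Build a JSON-RPC POST with user-supplied headers merged in."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    # Custom credentials are added alongside the standard headers.
    headers.update(extra_headers)
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Placeholder URL and key; nothing is sent over the network here.
request = build_mcp_request(
    "https://example.com/mcp",
    "tools/list",
    1,
    {"X-Goog-Api-Key": "YOUR_API_KEY"},
)
```

Keeping credentials in headers rather than in the URL means they stay out of the request path and query string.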
The following steps demonstrate how to connect the official Maps Grounding Lite cloud server (https://mapstools.googleapis.com/mcp) to the app. This service provides your on-device model with tools to query live geographical locations, weather conditions, and travel routes.
Before configuring the app, you need to set up a project and generate a valid credential to authorize requests to the Maps Grounding Lite API. Follow the steps on the official site.
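In the app, add the server URL https://mapstools.googleapis.com/mcp and supply your API key in the X-Goog-Api-Key header. Example prompts, each listed under the tool it uses:
compute_routes
Calculate the route from San Francisco to San Jose.
search_places
Recommend some highly rated Ramen places in downtown Mountain View CA.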
[!WARNING] Due to model limitations, you might need to enable only the tool shown above each prompt and disable the others for the prompt to work. See the Limitations section below for more details.
While integrating MCP servers significantly expands the capabilities of on-device AI, users should keep the following constraints in mind during the experimental phase: