website/docs/MCP.md
WebdriverIO MCP is a Model Context Protocol (MCP) server that enables AI assistants like Claude Desktop and Claude Code to automate and interact with web browsers and mobile applications.
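It provides a unified interface for desktop browsers (Chrome), native mobile apps on iOS and Android, and hybrid apps that mix native and webview contexts, through the @wdio/mcp package.
This allows AI assistants to launch browser and app sessions, navigate and interact with elements, inspect pages through visible elements and the accessibility tree, capture screenshots, and run scripts.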
:::info
Note for mobile apps: mobile automation requires a running Appium server with the appropriate drivers installed. See Prerequisites for setup instructions.
:::
The easiest way to use @wdio/mcp is via npx without any local installation:
npx @wdio/mcp
Or install it globally:
npm install -g @wdio/mcp
To use WebdriverIO MCP with Claude Desktop, add the server to its configuration file (claude_desktop_config.json):
{
  "mcpServers": {
    "wdio-mcp": {
      "command": "npx",
      "args": ["-y", "@wdio/mcp"]
    }
  }
}
After adding the configuration, restart Claude Desktop. The WebdriverIO MCP tools then become available for browser and mobile automation tasks.
Claude Code automatically detects MCP servers. You can configure it in your project's .claude/settings.json or .mcp.json.
Or add it globally to .claude.json by running:
claude mcp add --transport stdio wdio-mcp -- npx -y @wdio/mcp
Verify the setup by running the /mcp command inside Claude Code.
Ask Claude to automate browser tasks:
"Open Chrome and navigate to https://webdriver.io"
"Click the 'Get Started' button"
"Take a screenshot of the page"
"Find all visible links on the page"
Ask Claude to automate mobile apps:
"Start my iOS app on the iPhone 15 simulator"
"Tap the login button"
"Swipe up to scroll down"
"Take a screenshot of the current screen"
Browser automation (Chrome):

| Feature | Description |
|---|---|
| Session Management | Launch Chrome in headed/headless mode with custom dimensions and optional navigation URL |
| Navigation | Navigate to URLs |
| Element Interaction | Click elements, type text, find elements by various selectors |
| Page Analysis | Get visible elements (with pagination), accessibility tree (with filtering) |
| Screenshots | Capture screenshots (auto-optimized to max 1MB) |
| Scrolling | Scroll up/down by configurable pixel amounts |
| Cookie Management | Get, set, and delete cookies |
| Script Execution | Execute custom JavaScript in browser context |
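The table above lists pagination for visible elements. The server's actual parameter and result names are not shown in this document, so the following is only a minimal sketch of offset/limit pagination; `paginate`, `Page`, `offset`, `limit`, and `hasMore` are hypothetical names chosen for illustration.

```typescript
// Hypothetical result shape: the real tool's field names may differ.
interface Page<T> {
  items: T[];
  total: number;
  offset: number;
  hasMore: boolean;
}

// Return one window of `all`, plus enough metadata to request the next window.
function paginate<T>(all: readonly T[], offset = 0, limit = 50): Page<T> {
  const start = Math.max(0, offset);
  const size = Math.max(1, limit);
  const items = all.slice(start, start + size);
  return {
    items,
    total: all.length,
    offset: start,
    hasMore: start + items.length < all.length,
  };
}

// Example: five link labels, fetched two at a time.
const links = ["Docs", "API", "Blog", "Community", "Sponsor"];
const firstPage = paginate(links, 0, 2);
const lastPage = paginate(links, 4, 2);
```

Returning `total` and `hasMore` with every page lets an assistant decide whether another request is worth the extra tokens.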
Mobile app automation (iOS/Android):

| Feature | Description |
|---|---|
| Session Management | Launch apps on simulators, emulators, or real devices |
| Touch Gestures | Tap, swipe, drag and drop |
| Element Detection | Smart element detection with multiple locator strategies and pagination |
| App Lifecycle | Get app state; activate or terminate apps via execute_script |
| Context Switching | Switch between native and webview contexts in hybrid apps |
| Device Control | Rotate device, keyboard control |
| Geolocation | Get and set device GPS coordinates |
| Permissions | Automatic permission and alert handling |
| Script Execution | Execute Appium mobile commands (pressKey, deepLink, shell, etc.) |
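The table mentions activating and terminating apps through execute_script. As a rough sketch, the Appium commands involved are `mobile: activateApp` and `mobile: terminateApp`, which take `bundleId` with the XCUITest driver and `appId` with the UiAutomator2 driver. The `MobileCommand` shape and the `appLifecycleCommand` helper are hypothetical; how the execute_script tool expects its arguments is not specified here.

```typescript
type Platform = "ios" | "android";

// Hypothetical container for a command name and its single argument object.
interface MobileCommand {
  script: string;
  args: Record<string, string>;
}

// Build the Appium "mobile:" command for activating or terminating an app.
function appLifecycleCommand(
  action: "activate" | "terminate",
  platform: Platform,
  appIdentifier: string,
): MobileCommand {
  const script = action === "activate" ? "mobile: activateApp" : "mobile: terminateApp";
  // XCUITest identifies apps by bundleId, UiAutomator2 by appId.
  const key = platform === "ios" ? "bundleId" : "appId";
  return { script, args: { [key]: appIdentifier } };
}

// Example identifiers are placeholders, not real apps.
const activateIos = appLifecycleCommand("activate", "ios", "com.example.app");
const terminateAndroid = appLifecycleCommand("terminate", "android", "com.example.app");
```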
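# iOS: install the Xcode command line tools, Appium, and the XCUITest driver, then start the server
xcode-select --install
npm install -g appium
appium driver install xcuitest
appium
# Android: point ANDROID_HOME at the SDK (macOS default path shown) and add its tools to PATH
export ANDROID_HOME=$HOME/Library/Android/sdk
export PATH=$PATH:$ANDROID_HOME/emulator
export PATH=$PATH:$ANDROID_HOME/platform-tools
# Android: install Appium and the UiAutomator2 driver, then start the server
npm install -g appium
appium driver install uiautomator2
appium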
WebdriverIO MCP acts as a bridge between AI assistants and browser/mobile automation:
┌─────────────────┐     MCP Protocol     ┌─────────────────┐
│ Claude Desktop  │ ◄──────────────────► │    @wdio/mcp    │
│ or Claude Code  │       (stdio)        │     Server      │
└─────────────────┘                      └────────┬────────┘
                                                  │
                                           WebDriverIO API
                                                  │
                  ┌───────────────────────────────┼───────────────────────────────┐
                  │                               │                               │
          ┌───────▼───────┐               ┌───────▼───────┐               ┌───────▼───────┐
          │    Chrome     │               │    Appium     │               │    Appium     │
          │   (Browser)   │               │     (iOS)     │               │   (Android)   │
          └───────────────┘               └───────────────┘               └───────────────┘
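To make the MCP leg of the diagram concrete, here is a small sketch of the JSON-RPC `tools/call` requests a client could send over stdio. The tool names come from the tool tables in this document; the argument names (`url`, `selector`) are assumptions for illustration, and a real client also performs the MCP initialization handshake first.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0 request).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Wrap a tool name and its arguments in a tools/call request.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return { jsonrpc: "2.0", id: nextId++, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical argument names: check the tool reference for the real ones.
const requests = [
  toolCall("start_browser"),
  toolCall("navigate", { url: "https://webdriver.io" }),
  toolCall("click_element", { selector: "a=Get Started" }),
  toolCall("take_screenshot"),
];

// On the stdio transport, each message is a single line of JSON.
const wire = requests.map((request) => JSON.stringify(request)).join("\n");
```

In practice the assistant generates these calls itself; the sketch only shows what travels between the two boxes at the top of the diagram.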
Sessions that preserve state (noReset: true) automatically detach on close.
The MCP server supports multiple selector strategies. See Selectors for detailed documentation.
# CSS Selectors
button.my-class
#element-id
[data-testid="login"]
# XPath
//button[@class='submit']
//a[contains(text(), 'Click')]
# Text Selectors (WebdriverIO specific)
button=Exact Button Text
a*=Partial Link Text
# Accessibility ID (recommended - works on iOS & Android)
~loginButton
# Android UiAutomator
android=new UiSelector().text("Login")
# iOS Predicate String
-ios predicate string:label == "Login"
# iOS Class Chain
-ios class chain:**/XCUIElementTypeButton[`label == "Login"`]
# XPath (works on both platforms)
//android.widget.Button[@text="Login"]
//XCUIElementTypeButton[@label="Login"]
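The prefixes in the examples above are what distinguish one strategy from another. The following is a simplified sketch of that idea, not WebdriverIO's actual resolution logic; `classifySelector` and the strategy labels are hypothetical, and edge cases such as `.class=text` selectors are ignored.

```typescript
type SelectorStrategy =
  | "accessibility id"
  | "android uiautomator"
  | "ios predicate string"
  | "ios class chain"
  | "xpath"
  | "partial text"
  | "exact text"
  | "css";

// Guess the selector strategy from its leading characters.
function classifySelector(selector: string): SelectorStrategy {
  const s = selector.trim();
  if (s.startsWith("~")) return "accessibility id";
  if (s.startsWith("android=")) return "android uiautomator";
  if (s.startsWith("-ios predicate string:")) return "ios predicate string";
  if (s.startsWith("-ios class chain:")) return "ios class chain";
  if (s.startsWith("/") || s.startsWith("(")) return "xpath";
  // Text selectors: optional tag name followed by "*=" (partial) or "=" (exact).
  if (/^[\w-]*\*=/.test(s)) return "partial text";
  if (/^[\w-]*=/.test(s)) return "exact text";
  return "css";
}

const examples = [
  "[data-testid=\"login\"]",
  "//button[@class='submit']",
  "button=Exact Button Text",
  "a*=Partial Link Text",
  "~loginButton",
  "android=new UiSelector().text(\"Login\")",
];
const strategies = examples.map(classifySelector);
```

The order of the checks matters: `android=` has to be tested before the generic exact-text pattern, which would otherwise match it.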
The MCP server provides 25 tools for browser and mobile automation. See Tools for the complete reference.
Browser tools:

| Tool | Description |
|---|---|
| start_browser | Launch Chrome browser (with optional initial URL) |
| close_session | Close or detach from session |
| navigate | Navigate to a URL |
| click_element | Click an element |
| set_value | Type text into input |
| get_visible_elements | Get visible/interactable elements (with pagination) |
| get_accessibility | Get accessibility tree (with filtering) |
| take_screenshot | Capture screenshot (auto-optimized) |
| scroll | Scroll the page up or down |
| get_cookies / set_cookie / delete_cookies | Cookie management |
| execute_script | Execute JavaScript in browser |
Mobile tools:

| Tool | Description |
|---|---|
| start_app_session | Launch iOS/Android app |
| tap_element | Tap element or coordinates |
| swipe | Swipe in a direction |
| drag_and_drop | Drag between locations |
| get_app_state | Check if app is running |
| get_contexts / switch_context | Hybrid app context switching |
| rotate_device | Rotate to portrait/landscape |
| get_geolocation / set_geolocation | Get or set GPS coordinates |
| hide_keyboard | Dismiss on-screen keyboard |
| execute_script | Execute Appium mobile commands |
By default, the MCP server automatically grants app permissions (autoGrantPermissions: true), eliminating the need to manually handle permission dialogs during automation.
System alerts (like "Allow notifications?") are automatically accepted by default (autoAcceptAlerts: true). This can be configured to dismiss instead with autoDismissAlerts: true.
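The defaults described above correspond to Appium capabilities of the same names. As a hedged sketch, they could be assembled like this; the `alertCapabilities` helper and its `dismissAlerts` parameter are hypothetical, and switching autoAcceptAlerts off when dismissing is an assumption rather than documented server behavior.

```typescript
// Build the permission and alert capabilities with the defaults described above.
// In Appium, autoGrantPermissions is a UiAutomator2 (Android) capability, while
// autoAcceptAlerts and autoDismissAlerts belong to the XCUITest (iOS) driver.
function alertCapabilities(dismissAlerts = false): Record<string, boolean> {
  return {
    "appium:autoGrantPermissions": true,
    // Assumption: accepting and dismissing are treated as mutually exclusive.
    "appium:autoAcceptAlerts": !dismissAlerts,
    "appium:autoDismissAlerts": dismissAlerts,
  };
}

const defaults = alertCapabilities();
const dismissing = alertCapabilities(true);
```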
Configure the Appium server connection:
| Variable | Default | Description |
|---|---|---|
| APPIUM_URL | 127.0.0.1 | Appium server hostname |
| APPIUM_URL_PORT | 4723 | Appium server port |
| APPIUM_PATH | / | Appium server path |
{
  "mcpServers": {
    "wdio-mcp": {
      "command": "npx",
      "args": ["-y", "@wdio/mcp"],
      "env": {
        "APPIUM_URL": "192.168.1.100",
        "APPIUM_URL_PORT": "4724"
      }
    }
  }
}
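To illustrate how the three variables and their defaults combine, here is a minimal sketch that resolves them into connection settings and derives the Appium `/status` URL, which is handy for checking that the server is reachable. The helper names are hypothetical and this is not the server's own implementation; in Node.js you would pass `process.env` as the argument.

```typescript
interface AppiumConnection {
  hostname: string;
  port: number;
  path: string;
}

type Env = Record<string, string | undefined>;

// Apply the documented defaults when a variable is unset.
function resolveAppiumConnection(env: Env): AppiumConnection {
  const port = Number.parseInt(env["APPIUM_URL_PORT"] ?? "4723", 10);
  return {
    hostname: env["APPIUM_URL"] ?? "127.0.0.1",
    port: Number.isNaN(port) ? 4723 : port,
    path: env["APPIUM_PATH"] ?? "/",
  };
}

// Build the URL of the server's status endpoint under the configured base path.
function statusUrl(connection: AppiumConnection): string {
  const base = connection.path.endsWith("/") ? connection.path : `${connection.path}/`;
  return `http://${connection.hostname}:${connection.port}${base}status`;
}

const local = resolveAppiumConnection({});
const remote = resolveAppiumConnection({ APPIUM_URL: "192.168.1.100", APPIUM_URL_PORT: "4724" });
```

With the configuration shown above, `statusUrl(remote)` points at the Appium server on 192.168.1.100, port 4724.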
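The MCP server is optimized for efficient AI assistant communication: screenshots are auto-optimized to at most 1MB, visible-element results are paginated, and the accessibility tree can be filtered.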
The MCP server is written in TypeScript and includes full type definitions. If you're extending or integrating with the server programmatically, you'll benefit from auto-completion and type safety.
All tools are designed with robust error handling.
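- Check that the Appium server is running (appium)
- Check that the required drivers are installed (appium driver list)
- For iOS, check that a simulator is available (xcrun simctl list devices)
- For Android, check that a device or emulator is connected (adb devices)
- Check that the ANDROID_HOME environment variable is set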