skills/README.md
An Agent Skill is a modular set of capabilities that extends the functional reach of a Large Language Model (LLM) within the AI Edge Gallery app. By giving the LLM new capabilities and domain-specific knowledge, skills reduce the need for repetitive prompt instructions and lower the barrier for LLMs to discover and integrate new tools dynamically.
At a high level, each skill is defined by a SKILL.md file that contains
essential metadata and step-by-step instructions. When a user enters a prompt,
the LLM reviews the names and descriptions of available skills appended to
its system prompt. If the user's request aligns with a skill, the LLM invokes it
automatically.
Unlike cloud-based LLMs that can spin up containers or access a terminal to run Python scripts or CLI tools, on-device LLMs operate within a sandboxed mobile environment. They cannot easily execute arbitrary system commands or local scripts due to security and resource constraints.
To overcome this, AI Edge Gallery adapts by focusing on two primary execution paths:
JavaScript Skills: Running logic inside a lightweight, hidden webview, which provides a cross-platform execution environment for custom logic.
Native App Intents: Leveraging the Android/iOS operating system's built-in capabilities (like sending email / text messages).
The simplest type of skill is a text-only skill, which provides the LLM with a specific persona or scenario data without requiring external code.
To create a skill, you must follow a standardized directory structure. Create a
folder named after your skill in kebab-case (e.g., `fitness-coach`):

```
fitness-coach/
└── SKILL.md
```
The core of the skill is the `SKILL.md` file. It must contain a frontmatter
metadata section enclosed by `---` lines, followed by the instructions for
the LLM.
Example `SKILL.md` for a text-only skill:

```markdown
---
name: fitness-coach
description: A cheerful, high-energy fitness coach that provides motivational workout routines.
---

# Cheerful Fitness Coach

## Persona

You are an incredibly enthusiastic and supportive fitness coach! Your goal is
to make exercise feel like a party. Always use upbeat language, plenty of
encouraging emojis, and focus on the "fun" of moving your body.

## Instructions

When the user asks for a workout:

1. Start with a high-energy greeting (e.g., "Ready to crush it?").
2. Provide a 15-minute high-intensity routine that is easy to follow.
3. End with a massive "virtual high-five" and a reminder of how awesome they are
   for showing up today! 🌟✨
```
The LLM uses the `name` and `description` in the metadata to determine if the skill is relevant to a user's query. If triggered, the instructions are loaded into the model's context to guide its behavior.
Because Python is generally not available to on-device LLMs within mobile applications, the AI Edge Gallery uses JavaScript-based scripts housed in HTML files to execute custom logic.
JS skills execute logic by loading an HTML file into a hidden webview. The app
calls your skill's logic through a globally exposed asynchronous function named
`ai_edge_gallery_get_result` that must be attached to the `window` object.
The directory structure for a JS skill is the same as for text-only skills, but
with an extra `scripts` directory that holds your `index.html` and related
JavaScript files.
Step 1: Create the directory structure
Your folder name must be in kebab-case and match your skill name.
```
my-js-skill/
├── SKILL.md
└── scripts/
    └── index.html
```
Step 2: Write the `SKILL.md` file

You must explicitly instruct the LLM to call the `run_js` tool and define the
exact JSON schema it should pass as `data`.
```markdown
---
name: my-js-skill
description: Calculate the hash of a given text.
---

# Calculate hash

## Instructions

Call the `run_js` tool with the following exact parameters:

- script name: index.html
- data: A JSON string with the following field:
  - text: String. The text to calculate hash for.
```
> [!TIP]
> If your main entry point is named `index.html`, the `script name` line in the
> instructions above is optional. The LLM will look for `index.html` within the
> `scripts/` directory by default if no other file is specified.
Step 3: Create the `index.html` entry point

Embed your JavaScript logic inside `scripts/index.html`. You must define an
asynchronous function `ai_edge_gallery_get_result` and expose it on `window`.
This function receives a single argument, `data`, which is a JSON string
passed from the app containing the parameters from the LLM, as described in
the `SKILL.md` instructions. Inside this function, you must parse this data,
execute your logic, and return a stringified JSON object. This returned object
must contain either a `result` field on success or an `error` field on
failure.
```html
<!DOCTYPE html>
<html lang="en">
  <head></head>
  <body>
    <script>
      window['ai_edge_gallery_get_result'] = async (data) => {
        try {
          const jsonData = JSON.parse(data);
          const processedData = await yourImplementation(jsonData.text);
          return JSON.stringify({
            result: processedData
          });
        } catch (e) {
          console.error(e);
          return JSON.stringify({
            error: `Failed: ${e.message}`
          });
        }
      };

      async function yourImplementation(text) {
        return text + " processed!";
      }
    </script>
  </body>
</html>
```
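Because the entry point is plain HTML and JavaScript, you can sanity-check it
outside the app. A minimal sketch, assuming you open `scripts/index.html` in a
desktop browser: call the function from the devtools console with the same
stringified JSON the app would pass, and inspect the resolved value.

```js
// Run in the browser devtools console with scripts/index.html open.
// The argument mirrors the JSON string the app passes on behalf of the LLM.
const out = await window.ai_edge_gallery_get_result(
  JSON.stringify({ text: "hello" })
);
console.log(out); // expected: {"result":"hello processed!"}
```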
> [!TIP]
> Think of `index.html` as a "headless" execution environment that leverages
> the full power of the web ecosystem within a standard mobile webview. This
> setup allows you to move beyond basic scripts by making `fetch()` calls to
> third-party APIs, integrating external libraries via CDN or relative paths in
> the `<script>` tag, and utilizing advanced Web APIs like WebAssembly. For
> more complex projects, you can maintain a clean architecture by splitting
> your logic into separate `.js` files within the `scripts/` directory and
> importing them directly into your main `index.html` entry point.
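For example, a skill that calls a third-party HTTP API might look like the
sketch below. The endpoint `https://api.example.com/lookup` and the `query`
field are placeholders for illustration, not a real service:

```js
window['ai_edge_gallery_get_result'] = async (data) => {
  try {
    const { query } = JSON.parse(data);
    // Hypothetical endpoint -- substitute the API your skill actually uses.
    const response = await fetch(
      `https://api.example.com/lookup?q=${encodeURIComponent(query)}`
    );
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}`);
    }
    const body = await response.json();
    return JSON.stringify({ result: JSON.stringify(body) });
  } catch (e) {
    return JSON.stringify({ error: e.message });
  }
};
```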
To return an image to the chat, assign a base64-encoded string to the
`image.base64` field in your returned JSON.
Example:
```js
window['ai_edge_gallery_get_result'] = async (data) => {
  try {
    return JSON.stringify({
      result: "Image generated.",
      image: {
        base64: "imageBase64String"
      }
    });
  } catch (e) {
    return JSON.stringify({
      error: e.message
    });
  }
};
```
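As a concrete way to produce that string, the sketch below draws onto an
offscreen `<canvas>` and extracts the base64 payload from its data URL. That
the app expects the raw base64 body without the `data:image/png;base64,`
prefix is an assumption here; adjust if your returned image fails to render.

```js
window['ai_edge_gallery_get_result'] = async (data) => {
  try {
    const { label } = JSON.parse(data); // hypothetical input field
    const canvas = document.createElement('canvas');
    canvas.width = 256;
    canvas.height = 256;
    const ctx = canvas.getContext('2d');
    ctx.fillStyle = '#336699';
    ctx.fillRect(0, 0, 256, 256);
    ctx.fillStyle = '#ffffff';
    ctx.font = '24px sans-serif';
    ctx.fillText(label ?? 'hello', 20, 128);
    // toDataURL() returns "data:image/png;base64,<payload>"; keep the payload.
    const base64 = canvas.toDataURL('image/png').split(',')[1];
    return JSON.stringify({
      result: 'Image generated.',
      image: { base64 }
    });
  } catch (e) {
    return JSON.stringify({ error: e.message });
  }
};
```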
You can return an inline webview that the app will render in the chat. You can
specify a `url` (either absolute or relative to an `assets` folder) and an
`aspectRatio` (which defaults to 1.333 if omitted).
Example:
```js
window['ai_edge_gallery_get_result'] = async (data) => {
  try {
    return JSON.stringify({
      result: "Here is the interactive view.",
      webview: {
        url: "webview.html",
        aspectRatio: 1.0
      }
    });
  } catch (e) {
    return JSON.stringify({
      error: e.message
    });
  }
};
```
Here is how files should be organized:
```
my-interactive-skill/
├── SKILL.md
├── scripts/
│   └── index.html    <-- The hidden logic runner
└── assets/
    └── webview.html  <-- The HTML rendered in the chat UI
```
> [!TIP]
> You can pass dynamic data from your background logic (`index.html`) to your
> interactive UI (`webview.html`) by appending URL query parameters to the
> webview URL. In your script, construct the URL string to include key-value
> pairs, such as `webview.html?data=value`. Your interactive page can then
> retrieve this information using the `URLSearchParams` API to customize the
> user interface based on the LLM's output.
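A minimal sketch of the two halves, assuming a hypothetical `city` parameter
passed from the LLM: `scripts/index.html` encodes it into the webview URL, and
`assets/webview.html` reads it back.

```js
// In scripts/index.html -- build the webview URL from the LLM's data.
window['ai_edge_gallery_get_result'] = async (data) => {
  const { city } = JSON.parse(data); // hypothetical input field
  const params = new URLSearchParams({ city });
  return JSON.stringify({
    result: `Showing view for ${city}.`,
    webview: { url: `webview.html?${params.toString()}` }
  });
};

// In assets/webview.html -- read the parameter back to customize the UI.
const city = new URLSearchParams(window.location.search).get('city');
document.body.textContent = `Hello from ${city}!`;
```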
If your JS script requires an API key or token, do not pass it through the LLM prompt. Instead, the AI Edge Gallery app provides a secure mechanism: it will display a native dialog to the user to input the required secret when the JS skill is called, which is then passed directly to your script.
To enable this:

- Add `require-secret: true` to your `SKILL.md` metadata.
- Add `require-secret-description: <some description>` to your `SKILL.md`
  metadata. This will be shown in the prompt dialog.

Example `SKILL.md` snippet:
```markdown
---
name: some-api-skill
description: Fetches secure data.
metadata:
  require-secret: true
  require-secret-description: Go to GitHub settings page to copy your token.
---
```

Example `index.html` snippet:
```js
window['ai_edge_gallery_get_result'] = async (data, secret) => {
  try {
    const jsonData = JSON.parse(data);
    // Use the secret variable to authenticate your API call.
    const response = await fetch("https://api.example.com/data", {
      headers: {
        "Authorization": `Bearer ${secret}`
      }
    });
    const resultText = await response.text();
    return JSON.stringify({
      result: resultText
    });
  } catch (e) {
    return JSON.stringify({
      error: e.message
    });
  }
};
```
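Since the user can dismiss the dialog or paste an invalid token, it is worth
failing loudly rather than sending an unauthenticated request. A small
defensive sketch (that a skipped dialog surfaces as an empty or missing
`secret` is an assumption):

```js
window['ai_edge_gallery_get_result'] = async (data, secret) => {
  // Assumption: a skipped or empty dialog yields a blank secret.
  if (!secret || secret.trim() === '') {
    return JSON.stringify({ error: 'No API token provided.' });
  }
  // ...proceed with the authenticated fetch shown above...
  return JSON.stringify({ result: 'ok' });
};
```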
Native skills map instructions to predefined tools in the Gallery app, such as
the `run_intent` tool. This allows the LLM to interact with the Android device
natively to perform actions like sending emails or text messages.
To use the `run_intent` tool, you must instruct the LLM to call it with two
exact parameters:

- `intent`: The native action to run.
- `parameters`: A JSON string containing the required parameter values for the
  intent.

Example `SKILL.md` for native intents (email and text message):
```markdown
---
name: send-email
description: Send an email.
---

# Send email

## Instructions

Call the `run_intent` tool with the following exact parameters:

- intent: send_email
- parameters: A JSON string with the following fields:
  - extra_email: the email address to send the email to. String.
  - extra_subject: the subject of the email. String.
  - extra_text: the body of the email. String.
```
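The example above covers email; a text-message skill would follow the same
pattern. The intent and field names below (`send_text_message`,
`extra_phone_number`) are assumptions for illustration; confirm the exact
identifiers against the app's intent handling code before relying on them.

```markdown
---
name: send-text
description: Send a text message.
---

# Send text message

## Instructions

Call the `run_intent` tool with the following exact parameters:

- intent: send_text_message  <!-- assumed identifier; verify in the app -->
- parameters: A JSON string with the following fields:
  - extra_phone_number: the phone number to text. String.
  - extra_text: the body of the message. String.
```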
> [!IMPORTANT]
> While the app currently supports sending email and sending text out of the
> box, supporting additional native intent-based skills requires updating the
> app's source code. To add new capabilities, such as opening the camera or
> setting alarms, you must define the logic within the app's codebase.
> Developers can refer to `IntentHandler.kt` to see how existing intents are
> mapped and to learn how to register new custom intents for the LLM to invoke.
There are three ways to add a skill to the app: from the curated featured list, from a URL, or from your device's local storage.
We curate a list of skills contributed by our community. To try out a skill from this list, follow the steps below:

1. Enter the Agent Skills use case with your selected model, and navigate to the Skill Manager by tapping the "Skills" chip.
2. Tap the (+) button and select the Add skill from featured list option.
3. From there, simply tap a skill from the list to automatically add it to the system.
For easier sharing, you can host your skill on a web server and add it to the app using the skill URL.

Steps:

1. Enter the Agent Skills use case with your selected model, and navigate to the Skill Manager by tapping the "Skills" chip.
2. Tap the (+) button and select the Load skill from URL option.
3. Enter the skill URL in the popup dialog. The URL should point to the skill folder itself.
4. Verify your URL: ensure the URL is correct by loading the `SKILL.md` file in your browser (e.g., `https://your/url/SKILL.md`). If the raw content of the file displays correctly, your URL is ready to use (excluding the `SKILL.md` suffix).
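You can also check the response headers from the command line; a quick sketch using `curl` (the URL is a placeholder):

```
curl -I https://your-host.example/my-js-skill/SKILL.md
```

A `200` status with a sensible `Content-Type` header is a good sign; this matters for the hosting caveat below.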
> [!IMPORTANT]
> To avoid webview loading failures, you must host your JS skill assets on a
> true web hosting service like GitHub Pages, Cloudflare, etc. Standard GitHub
> repository URLs and `raw.githubusercontent.com` serve files as `text/plain`,
> which lacks the proper MIME types required for execution. Always use the
> deployment URL provided by your web host.
> [!TIP]
> If you want to use GitHub Pages to serve your skills: by default, GitHub
> Pages uses Jekyll to process files, which can automatically convert `.md`
> files into `.html`. Because the AI Edge Gallery app requires access to the
> raw `SKILL.md` file to parse instructions, you must disable this behavior:
>
> 1. Create an empty file named `.nojekyll` in the root of your repository.
> 2. Commit and push this file to your main branch.
>
> This ensures GitHub Pages serves your Markdown files as-is rather than
> attempting to render them as static webpages.
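For instance, from the repository root (assuming `main` is your default branch):

```
touch .nojekyll
git add .nojekyll
git commit -m "Disable Jekyll processing on GitHub Pages"
git push origin main
```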
You can load skills directly from your Android device's file system.
Steps:
1. Connect your Android device to your computer and push your entire skill
   folder (e.g., `my-js-skill/`) onto the device (e.g., to the `Download`
   folder):

   ```
   adb push my-js-skill/ /sdcard/Download/
   ```

2. Enter the Agent Skills use case with your selected model, and navigate to the Skill Manager by tapping the "Skills" chip.
3. Tap the (+) button and select the Import local skill option.
4. Use the Android file picker to select the directory containing your
   `SKILL.md` file. The app will copy the directory into its internal storage
   and make the skill available.
We've created a dedicated GitHub Discussions category for users to showcase their skills. Follow these steps to share your custom skills with the global AI Edge Gallery community:
Click "New discussion" button.
Follow the instructions and fill in the form to share your skill.
You can make your skill name clickable within the Skill Manager UI by adding a
`homepage` field to the metadata in your `SKILL.md` file. This is a great way
to link users to your GitHub repository, documentation, or personal website.
Example:
```markdown
---
name: fitness-coach
description: A cheerful, high-energy fitness coach.
metadata:
  homepage: https://github.com/your-username/fitness-coach-skill
---
```
When running a JavaScript skill, you can expand the execution panel to inspect the call details and the specific data passed to your script. This panel also provides access to real-time console logs.
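The simplest way to surface your own diagnostics there is ordinary `console.log` output, which the panel's console view captures. For example:

```js
window['ai_edge_gallery_get_result'] = async (data) => {
  // Log the raw payload so it appears in the execution panel's console logs.
  console.log('run_js called with:', data);
  return JSON.stringify({ result: 'ok' });
};
```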
Example skills include:

- Act as a dungeon master for a text-based adventure set in a world where everyone is a sentient kitchen appliance.
- Calculate the hash of a given text.
- Query summary from Wikipedia for a given topic.
- Generate QR code for a given url.
- Show an interactive map view for the given location.
- A simple mood tracking skill that stores and visualizes your daily mood and comments.
- Show a virtual piano to play music.
- Spin the given text on my head.
- Suggest or play music based on the user's mood, including analyzing images or audio.
- Show a roulette wheel to allow the user to randomly select a restaurant based on location and cuisine.
- Send an email.
Check out more examples from our community-contributed skills.