Daytona gives AI agents isolated cloud sandboxes — full Linux runtimes with their own filesystem, network, and CPU/RAM. This recipe adds a server tool to CopilotKit's Built-in Agent that runs code inside one of those sandboxes and returns the result to the chat, so the agent can execute code it writes without touching your host.
This guide assumes you already have a running Built-in Agent app. If you don't, follow the Quickstart first — it takes a few minutes.
Add your Daytona API key to your .env:
DAYTONA_API_KEY=your_daytona_api_key
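Since the client in this recipe picks the key up from the environment, it can help to check for it explicitly at startup so a missing value produces a clear message. The following is a minimal sketch; `requireEnv` is a hypothetical helper, not part of CopilotKit or the Daytona SDK.

```typescript
// Hypothetical helper, not part of CopilotKit or the Daytona SDK.
type Env = Record<string, string | undefined>;

// Returns the trimmed value of `name`, or throws if it is unset or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage in a Node.js runtime route (assumed placement):
//   requireEnv(process.env, "DAYTONA_API_KEY");
```

Taking the environment as a parameter keeps the helper easy to test without touching the real process environment.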
npm install @daytonaio/sdk
Add a runCode server tool and register it on your BuiltInAgent. The
tool creates a fresh sandbox, runs the code, returns stdout, and cleans the sandbox up — so every call is
isolated.
import {
  CopilotRuntime,
  copilotRuntimeNextJSAppRouterEndpoint,
} from "@copilotkit/runtime";
import { BuiltInAgent, defineTool } from "@copilotkit/runtime/v2"; // [!code highlight]
import { Daytona } from "@daytonaio/sdk"; // [!code highlight]
import { NextRequest } from "next/server";
import { z } from "zod";

const daytona = new Daytona(); // reads DAYTONA_API_KEY

const runCode = defineTool({ // [!code highlight:20]
  name: "runCode",
  description: "Execute code in an isolated Daytona sandbox and return its output.",
  parameters: z.object({
    code: z.string().describe("The code to run in the sandbox"),
    language: z
      .enum(["python", "typescript", "javascript"])
      .default("python")
      .describe("Language runtime for the code"),
  }),
  execute: async ({ code, language }) => {
    const sandbox = await daytona.create({ language });
    try {
      const res = await sandbox.process.codeRun(code);
      return { stdout: res.result, exitCode: res.exitCode };
    } finally {
      await sandbox.delete();
    }
  },
});

const builtInAgent = new BuiltInAgent({
  model: "openai:gpt-5.4-mini",
  tools: [runCode], // [!code highlight]
  maxSteps: 2, // required so the agent can call the tool and then respond // [!code highlight]
});

const runtime = new CopilotRuntime({
  agents: { default: builtInAgent },
});

export const POST = async (req: NextRequest) => {
  const { handleRequest } = copilotRuntimeNextJSAppRouterEndpoint({
    runtime,
    endpoint: "/api/copilotkit",
  });
  return handleRequest(req);
};
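The cleanup guarantee in `execute` comes from the `try`/`finally` around the run. The sketch below isolates that pattern so it can be exercised without a Daytona account. `SandboxLike` is a structural stand-in for the SDK's sandbox object and `withSandbox` is a hypothetical helper; neither is an export of `@daytonaio/sdk`.

```typescript
// Structural stand-in for the parts of a sandbox this sketch touches.
// The real object comes from @daytonaio/sdk; this shape is an assumption.
interface SandboxLike {
  process: {
    codeRun(code: string): Promise<{ result: string; exitCode: number }>;
  };
  delete(): Promise<void>;
}

// Hypothetical helper: create a sandbox, use it, and always delete it.
async function withSandbox<T>(
  create: () => Promise<SandboxLike>,
  use: (sandbox: SandboxLike) => Promise<T>,
): Promise<T> {
  const sandbox = await create();
  try {
    return await use(sandbox);
  } finally {
    // Runs whether `use` resolved or threw, so no sandbox is left behind.
    await sandbox.delete();
  }
}
```

In the route above, this could be called as `withSandbox(() => daytona.create({ language }), (s) => s.process.codeRun(code))`, assuming the SDK's sandbox type is compatible with `SandboxLike`. An error thrown by the run still propagates to the caller after the sandbox is deleted.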
Start your app and ask the agent to run something. The tool runs Python by default, and the agent can
pass language to run TypeScript or JavaScript instead:
Run a Python snippet that prints the first 10 Fibonacci numbers.
Run JavaScript that logs the current timestamp.
The agent writes the code, calls runCode, and reports back the sandbox's output.
- Call sandbox.stop() when a sandbox is idle.
- Use sandbox.getPreviewLink(port) to hand back a live URL for an app the agent builds.
- codeRun covers Python, TypeScript, and JavaScript. For anything else (Go, Rust, a shell pipeline), use sandbox.process.executeCommand(...) instead, optionally on a custom image with the toolchain installed.

See the Daytona docs for the full SDK.
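For the `executeCommand` route mentioned above, a tool can return the same `{ stdout, exitCode }` shape as `runCode`. This is a minimal sketch, assuming `executeCommand` resolves to a response with `result` and `exitCode` fields like `codeRun` does. `CommandSandbox` and `runShell` are hypothetical names introduced here for illustration.

```typescript
// Structural stand-in for the part of the sandbox this sketch uses.
// Assumption: executeCommand resolves to { result, exitCode } like codeRun.
interface CommandSandbox {
  process: {
    executeCommand(command: string): Promise<{ result: string; exitCode: number }>;
  };
}

// Hypothetical helper: run a shell command and mirror runCode's return shape.
async function runShell(sandbox: CommandSandbox, command: string) {
  const res = await sandbox.process.executeCommand(command);
  return { stdout: res.result, exitCode: res.exitCode };
}

// Usage inside a tool's execute (sandbox created as in runCode):
//   const out = await runShell(sandbox, "ls -la | wc -l");
```

Keeping the return shape identical means the agent sees the same fields whichever tool produced the output.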
Paste this prompt into your coding agent (Cursor, Claude Code, etc.) to add the integration to an existing Built-in Agent app:
In my existing CopilotKit Built-in Agent app, add a Daytona-backed code execution tool:
1. Install the Daytona SDK: `npm install @daytonaio/sdk`.
2. Add DAYTONA_API_KEY to my .env (I'll supply the value).
3. In my CopilotKit runtime route, create a Daytona client with `new Daytona()` and define a
server tool with `defineTool` from "@copilotkit/runtime/v2":
- name: "runCode"
- description: "Execute code in an isolated Daytona sandbox and return its output."
- parameters: a Zod object with a `code` string field and a `language` enum
("python" | "typescript" | "javascript", default "python").
- execute: create a sandbox with `daytona.create({ language })`, run the code with
`const res = await sandbox.process.codeRun(code)` — note that `codeRun` returns an
`ExecuteResponse` where stdout lives on `res.result` (not `res.stdout`), so return
`{ stdout: res.result, exitCode: res.exitCode }`. Call `sandbox.delete()` in a finally
block so the sandbox is cleaned up even on error.
4. Register the tool on my BuiltInAgent (`tools: [runCode]`) and set `maxSteps: 2` so it can call the
tool and then respond.
Keep my existing model and runtime setup unchanged.