DSPy's RLM (Recursive Language Model) module gives an LLM a Python REPL. The LLM writes code, executes it, sees the output, and repeats — building up state across iterations until it calls SUBMIT() with a final answer. Within that code, the LLM can also call llm_query() to invoke sub-LLM reasoning (the "recursive" part), mixing procedural computation with natural-language understanding.
DaytonaInterpreter is a CodeInterpreter backend that runs all of this inside a Daytona cloud sandbox, so LLM-generated code never executes on the host.
- `llm_query()` and `llm_query_batched()` are bridged into the sandbox, letting generated code invoke nested LLM reasoning
- `DaytonaInterpreter` supports `with` for automatic cleanup

Required environment variables:

- `DAYTONA_API_KEY`: required for access to Daytona sandboxes. Get it from the Daytona Dashboard.
- `OPENROUTER_API_KEY`, `OPENAI_API_KEY`, or `ANTHROPIC_API_KEY`: required for your LLM provider, depending on which model you use.

Create and activate a virtual environment, then install:

```shell
python3.10 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e .
```
To also run demo.py (which plots results with matplotlib), install with the demo extra:
```shell
pip install -e ".[demo]"
```
Create a `.env` file (copy from `.env.example`):

```shell
cp .env.example .env
# Edit .env with your DAYTONA_API_KEY and LLM provider key
```

Then run the demo:

```shell
python demo.py
```
Each RLM call runs an iterative loop: the LLM writes Python code, the sandbox executes it, the output is appended to the history, and the cycle repeats until `SUBMIT()` is called or the iteration limit is reached.

State persists across iterations: variables, imports, and function definitions all carry over. This lets the LLM explore data incrementally, inspect intermediate results with `print()`, and refine its approach before committing a final answer.
```
┌──────────────────────────────────┐
│            DSPy RLM              │
│                                  │
│  Prompt LLM (inputs + history)   │
│               │                  │
│               ▼                  │
│    LLM writes Python code        │
│               │                  │
│               ▼                  │
│  Execute in sandbox ─────────────┼──▶ Daytona Sandbox
│               │                  │    (persistent REPL)
│               ▼                  │
│  Append output to history        │
│               │                  │
│               ▼                  │
│  SUBMIT() called? ──no──▶ loop   │
│       │                          │
│      yes                         │
│       ▼                          │
│   Return final answer            │
└──────────────────────────────────┘
```
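The loop above can be sketched in miniature (purely illustrative: canned code strings stand in for what the LLM would generate, and `SUBMIT` is a stub rather than the real sandbox built-in):

```python
import contextlib
import io

# Canned code the "LLM" would produce on each iteration (hypothetical)
iterations = [
    "data = [2, 3, 5, 7]\nprint(sum(data))",
    "SUBMIT(answer=f'The sum is {sum(data)}')",
]

final = {}

def SUBMIT(**outputs):
    final.update(outputs)

namespace = {"SUBMIT": SUBMIT}  # persistent REPL state
history = []
for code in iterations:
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, namespace)  # variables carry over between iterations
    history.append((code, buf.getvalue()))
    if final:
        break  # SUBMIT() ends the loop

print(final["answer"])  # → The sum is 17
```

Note how the second iteration reads `data` without redefining it: that is the persistent-state property the real interpreter provides.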
The generated code has access to two built-in functions for invoking an LLM from within the REPL:
- `llm_query(prompt)` sends a single prompt and returns a string
- `llm_query_batched(prompts)` sends multiple prompts concurrently

This is what makes RLM "recursive": the LLM can write code that delegates semantic work to another LLM call, then processes the result with Python. For example:
```python
texts = [page1, page2, page3]
summaries = llm_query_batched([f"Summarize: {t}" for t in texts])
combined = "\n".join(summaries)
SUBMIT(answer=combined)
```
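The batched helper's semantics can be emulated outside the sandbox with a thread pool over a single-prompt function (a sketch with a stubbed `llm_query`; the real implementation may differ):

```python
from concurrent.futures import ThreadPoolExecutor

def llm_query(prompt: str) -> str:
    # Stub standing in for a real LLM call
    return f"[answer to: {prompt}]"

def llm_query_batched(prompts: list[str]) -> list[str]:
    # executor.map runs prompts concurrently but preserves input order
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(llm_query, prompts))

print(llm_query_batched(["Summarize: page one", "Summarize: page two"]))
```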
These functions execute on the host (they need LLM API access). DaytonaInterpreter bridges them into the sandbox through the broker, the same mechanism used for custom tools.
You can pass host-side functions into the sandbox via the `tools` dict. The interpreter bridges them using a broker server that runs inside the sandbox. From the LLM's perspective, these look like regular Python functions.
```python
import dspy
from dotenv import load_dotenv

from daytona_interpreter import DaytonaInterpreter

load_dotenv()

# Configure the LLM
lm = dspy.LM("openrouter/anthropic/claude-sonnet-4.6")
dspy.configure(lm=lm)

# Create an RLM with the Daytona interpreter
interpreter = DaytonaInterpreter()
rlm = dspy.RLM(
    signature="question -> answer: str",
    interpreter=interpreter,
    verbose=True,
)

result = rlm(question="What is the sum of the first 10 prime numbers?")
print(result.answer)

interpreter.shutdown()
```
Pass host-side functions into the sandbox so the LLM's generated code can call them:
```python
import json

import dspy
from dotenv import load_dotenv

from daytona_interpreter import DaytonaInterpreter

load_dotenv()

lm = dspy.LM("openrouter/anthropic/claude-sonnet-4.6")
dspy.configure(lm=lm)

# Define tools that run on the host
def search_knowledge_base(query: str) -> str:
    """Search a knowledge base and return relevant results."""
    # Replace with your actual search logic
    return json.dumps({"results": [f"Result for: {query}"]})

# Pass tools to the interpreter
interpreter = DaytonaInterpreter(tools={"search_knowledge_base": search_knowledge_base})

rlm = dspy.RLM(
    signature="question -> answer: str",
    interpreter=interpreter,
    verbose=True,
)

result = rlm(question="Search for information about Python generators and summarize it.")
print(result.answer)

interpreter.shutdown()
```
Inside the sandbox, the LLM can call `search_knowledge_base(...)` like a regular function. The call is routed to the host through the broker. See Custom tools for how this works.
SUBMIT() ends the REPL loop and returns a final answer. Its arguments match the output fields of the DSPy signature:
```python
SUBMIT(answer="The sum is 129")
```
If the signature has typed output fields, SUBMIT gets a typed signature in the sandbox so the LLM knows the expected schema:
```python
# Automatically generated:
def SUBMIT(answer: str):
    ...
```
If the LLM never calls SUBMIT() within the iteration limit, RLM falls back to extracting an answer from the REPL history.
The broker is a small Flask server inside the sandbox that bridges function calls between the sandbox and the host. It handles both RLM's built-in llm_query / llm_query_batched and any custom tools you provide. It starts automatically when tools are present.
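The idea can be sketched in miniature (an in-process toy, not Daytona's actual broker: the real one runs inside the sandbox and uses Flask, while this sketch uses a stdlib HTTP server to dispatch named calls to hypothetical host-side tools):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

# Host-side functions exposed by name (hypothetical examples)
TOOLS = {
    "echo": lambda text: f"echo: {text}",
    "add": lambda a, b: a + b,
}

class BrokerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Request body: {"tool": <name>, "kwargs": {...}}
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        result = TOOLS[body["tool"]](**body["kwargs"])
        payload = json.dumps({"result": result}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # silence per-request logging

# Port 0 lets the OS pick a free port; the daemon thread exits with the process
server = HTTPServer(("127.0.0.1", 0), BrokerHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

def call_tool(tool: str, **kwargs):
    """What a sandbox-side stub would do: POST the call to the broker."""
    req = Request(
        f"http://127.0.0.1:{port}/call",
        data=json.dumps({"tool": tool, "kwargs": kwargs}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.loads(resp.read())["result"]

print(call_tool("echo", text="hi"))  # → echo: hi
print(call_tool("add", a=2, b=3))    # → 5
```

The sandbox side only needs a thin stub per tool name; everything else (serialization, dispatch) lives in the broker, which is why custom tools and the built-in `llm_query` helpers can share the same mechanism.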
See the main project LICENSE file for details.