examples/sandbox/tutorials/repo_code_review/README.md
Review a small public git repository, run its tests, leave line-level review comments in the structured output, and write a patch-oriented review artifact.
This demo shows a coding-agent workflow where the sandbox can inspect a real
git worktree, run tests, reason over a diff, and produce review artifacts that a
developer can act on. The manifest mounts pypa/sampleproject at a pinned ref
with GitRepo(...).
The review contract is intentionally narrow: one finding should target the CI
workflow, and one should target the missing type hints in src/sample/simple.py.
Run the Unix-local example from the repository root:
uv run python examples/sandbox/tutorials/repo_code_review/main.py
uv run python examples/sandbox/tutorials/repo_code_review/evals.py
This demo exits after the scripted review so the generated artifacts and eval contract stay deterministic.
To run the same review in Docker, build the shared tutorial image once and pass
--docker:
docker build -t sandbox-tutorials:latest -f examples/sandbox/tutorials/Dockerfile .
uv run python examples/sandbox/tutorials/repo_code_review/main.py --docker
uv run python examples/sandbox/tutorials/repo_code_review/evals.py
output/review.mdoutput/findings.jsonloutput/fix.patchpypa/sampleproject at a pinned git ref, mounted into the workspace
as repo/.RepoReviewResult final output.scratchpad/ for notes or draft diffs,
then return the final review object for the wrapper to persist.evals.py checks that the two findings stay focused on uv in the
test workflow and type hints in src/sample/simple.py, and that the patch
only edits simple.py.