# Use Flow to Write Software Better
This is a practical, opinionated guide for using Flow as the control plane for software delivery, optimized for Claude Code and Codex.
The goal is simple: tighter feedback loops, fewer regressions, less context loss, and consistent quality gates.
Do not treat Flow as just a task runner. Use it as the enforced loop: `f commit` with quality, testing, and skill gates. If you do this consistently, team behavior becomes predictable and AI sessions become reliable.
Run these first:

```shell
f doctor
f auth login
f latest
```
What this gives you: a verified install, authentication, and an up-to-date CLI before any work starts.
If you use fish integration heavily:

```shell
f shell-init
```
From the repository root:

```shell
f info
f tasks list
f setup
```
If the project is not Flow-managed yet:

```shell
f init
```
Then immediately add these foundations to `flow.toml`:

- `[skills]` and `[skills.codex]`
- `[commit.testing]`
- `[commit.quality]`
- `[commit.skill_gate]`
- Core tasks (`test`, `test-related`, `build`, `dev`, `deploy`/`ship`)

## flow.toml pattern (AI-first, quality-enforced)

Use this as a starting profile and adjust per repo:
```toml
version = 1

[project]
name = "your-project"

[skills]
sync_tasks = true
install = ["quality-feature-delivery"]

[skills.codex]
generate_openai_yaml = true
force_reload_after_sync = true
task_skill_allow_implicit_invocation = false

[[tasks]]
name = "test"
command = "<your test command>"
description = "Run project tests"

[[tasks]]
name = "test-related"
command = "<script that runs likely related tests>"
description = "Run smallest related tests for changed files"

[commit]
review_instructions_file = ".ai/commit-review-instructions.md"

[commit.testing]
mode = "block" # off | warn | block
runner = "bun" # Bun-first local gate
require_related_tests = true
ai_scratch_test_dir = ".ai/test" # optional gitignored AI scratch tests
run_ai_scratch_tests = true # run scratch tests when no related tracked tests
allow_ai_scratch_to_satisfy_gate = false
max_local_gate_seconds = 30

[commit.quality]
mode = "block"
require_docs = true
require_tests = true
auto_generate_docs = true
doc_level = "basic"

[commit.skill_gate]
mode = "block"
required = ["quality-feature-delivery"]
```
Why this matters:

- `sync_tasks` + Codex skill generation makes tasks visible as skills.

Start each session from the repository root:

```shell
cd <repo>
f tasks list
```
Then choose one clear objective and one validation command before coding.
Prefer:

```shell
f dev
f test-related
f logs <task>
```
Avoid direct, inconsistent commands when equivalent Flow tasks exist.
Your prompt should include:

- `f commit` without skip flags

Example prompt frame:
```
Implement X in Y files.
Run f test-related-main first, then broader tests if needed.
Update .ai/features for user-visible changes.
Commit using f commit with no skip flags.
```
Order of validation:

1. Related tests (`f test-related` / branch-based variant)
2. Gated commit (`f commit`)
This centralizes quality, testing, and skill gates in one command.
Do not bypass with `--skip-quality` or `--skip-tests` unless explicitly intentional.
## Feature docs (`.ai/features`)

Treat `.ai/features/*.md` as the source of truth for what exists.
Each user-visible feature should map to a doc under `.ai/features/`.
This is high leverage because it prevents context loss: AI sessions can read the capability map instead of rediscovering behavior from code.
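As an illustration (the file name and contents are hypothetical), a feature doc can stay small:

```markdown
# CSV export

What: users can export any report as CSV from the report toolbar.
Where: src/reports/export.ts, tested in src/reports/export.test.ts.
Limits: exports cap at 50k rows; larger reports are truncated with a warning.
```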
Use local skills for repo-specific “how we build here”.
Recommended minimum skill set:

- `quality-feature-delivery`
- environment and secrets handling (`f env` only)

Then enforce with:
```toml
[commit.skill_gate]
mode = "block"
required = ["quality-feature-delivery"]
```
This is how you convert good intentions into default behavior.
Use a script (like `.ai/scripts/test-related.ts`) that:

- diffs changed files against a base (`--base origin/main --head HEAD`)
- supports a `--list` mode

If your runner fails due to environment prerequisites (toolchain/vendor issues), add a preflight task:
- `f <runner>-ready`
- `f <runner>-fix`

This avoids burning minutes before obvious infra failures.
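The selection logic for such a script can be sketched as follows. The convention that `src/foo.ts` maps to `src/foo.test.ts` is an assumption — adapt the mapping to your layout. A real script would feed in the output of `git diff --name-only <base>...<head>` and honor `--list`:

```typescript
// Candidate test files for one changed file.
function candidateTests(file: string): string[] {
  if (file.endsWith(".test.ts")) return [file]; // a test itself changed: rerun it
  if (!file.endsWith(".ts")) return [];         // non-code change: nothing to run
  // Assumed convention: tests live next to sources as <name>.test.ts.
  return [file.replace(/\.ts$/, ".test.ts")];
}

// Deduplicated related-test list for a whole diff.
function relatedTests(changed: string[]): string[] {
  return [...new Set(changed.flatMap(candidateTests))];
}

relatedTests(["src/app.ts", "src/app.test.ts", "README.md"]);
// → ["src/app.test.ts"]
```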
When behavior is unclear, switch from “guess and patch” to “observe and patch”:
- `f logs <task>`
- tracing (`f trace` / project-specific trace tasks)

The best pattern is “capture once, reason once, patch once.”
Use `f env` as the single path for secrets and runtime env management:

```shell
f env setup
f env set KEY=value
f env pull
f env run <command>
```
Avoid ad hoc `.env` drift across machines.
For deployment or mobile shipping flows, define one confidence task that runs before release. Then make the ship task depend on that confidence task.
Example:

```
f mobile-confidence -> f mobile-ship
```
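Flow's task-dependency syntax is not shown in this guide, so the `depends_on` key below is an assumption — verify the key name against your Flow version. The shape would be:

```toml
# Sketch only: task names from the example above; 'depends_on' is an
# assumed key — check whether your Flow version supports task dependencies.
[[tasks]]
name = "mobile-confidence"
command = "<lint + tests + build validation>"
description = "Release confidence gate"

[[tasks]]
name = "mobile-ship"
command = "<store upload / release command>"
description = "Ship mobile release"
depends_on = ["mobile-confidence"] # assumption: verify key name
```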
Result: broken pipelines fail before expensive release steps.
When adding Flow to an existing repo, use this order:
1. `flow.toml` with core tasks
2. `[storage]` + `f env` flow
3. `.ai/features` for top capabilities

This avoids destabilizing the team while still moving to enforcement.
Feature work:

```
Implement <feature> in <scope>.
Use Flow tasks only (no ad hoc commands when task exists).
Run related tests first, then broaden if risk warrants.
Update .ai/features for user-visible behavior changes.
Commit with f commit (no skip flags).
```
Debugging:

```
Do not patch yet.
Collect logs/traces via Flow tasks and summarize likely root causes.
Propose smallest validating experiment.
After confirmation, implement fix + related tests + feature doc update.
Commit via f commit.
```
Refactoring:

```
Refactor <module> without behavior changes.
Keep public API stable.
Run focused tests proving no regression.
Document any non-obvious migration risks.
Commit via f commit.
```
Habits at increasing maturity:

- `f commit` for every change
- `f latest` (if Flow changed frequently)
- `f tasks list` before coding
- `f ai` / `f codex` / `f claude` resume context (`.ai/features`)
- fully gated `f commit`
- `.ai/features` as a living capability map

Aim to reach Level 3 quickly, then Level 4 where release speed and reliability both improve.
Use these defaults unless you have a reason not to:
- `commit.testing.mode = "block"`
- `commit.quality.mode = "block"`
- `commit.skill_gate.mode = "block"`
- `skills.sync_tasks = true`
- `skills.codex.generate_openai_yaml = true`
- `skills.codex.force_reload_after_sync = true`
- related-test script driven by `--base origin/main --head HEAD`

This gives the highest consistency with the least manual memory burden.
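The same defaults, collected as a `flow.toml` fragment (section names match the schema used earlier in this guide):

```toml
[skills]
sync_tasks = true

[skills.codex]
generate_openai_yaml = true
force_reload_after_sync = true

[commit.testing]
mode = "block"

[commit.quality]
mode = "block"

[commit.skill_gate]
mode = "block"
```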
Flow works best when it is the enforced operating system for development, not an optional helper.
If you route implementation, testing, docs, commit review, and shipping through Flow, you get tighter feedback loops, fewer regressions, less context loss, and consistent quality gates.
That is the path to writing software better, repeatedly.