# AI Task Fast Path Guide

This guide is the practical playbook for running Flow MoonBit AI tasks at the lowest possible invocation latency.
It covers the `f` vs `fai` invocation paths and how to minimize their overhead.

## Runtime Paths

Flow supports multiple runtime paths for `.ai/tasks/*.mbt`:
1. **moon run path**

   ```sh
   FLOW_AI_TASK_RUNTIME=moon-run f ai:flow/dev-check
   ```

   Highest flexibility, highest overhead.

2. **Cached binary path through `f`**

   ```sh
   FLOW_AI_TASK_RUNTIME=cached f ai:flow/dev-check
   ```

   Uses the build cache, but still pays full `f` process startup.

3. **Daemon path through `f`**

   ```sh
   f tasks run-ai --daemon ai:flow/dev-check
   ```

   Uses `ai-taskd` over a Unix socket, but still pays full `f` process startup.

4. **Fast daemon client (`fai`)**

   ```sh
   fai ai:flow/dev-check
   ```

   Lowest invocation overhead for hot loops.
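The daemon paths above talk to `ai-taskd` over a Unix domain socket. The sketch below shows the shape of that request/response cycle against a stand-in daemon; the JSON framing and the field names (`selector`, `args`, `status`) are purely illustrative, since the real `ai-taskd` wire format is internal (msgpack by default).

```python
# Sketch of a request/response over a Unix domain socket, the transport
# the daemon paths use. JSON framing and field names are assumptions for
# illustration; the real ai-taskd protocol is internal.
import json
import os
import socket
import tempfile
import threading

sock_path = os.path.join(tempfile.mkdtemp(), "ai-taskd.sock")

# Stand-in daemon: bind and listen up front so the client cannot race setup.
srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(sock_path)
srv.listen(1)

def fake_daemon():
    conn, _ = srv.accept()
    req = json.loads(conn.recv(65536).decode())
    conn.sendall(json.dumps({"status": "ok", "selector": req["selector"]}).encode())
    conn.close()

threading.Thread(target=fake_daemon, daemon=True).start()

# Client side: one compact request, one reply, no per-request process spawn.
cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
cli.connect(sock_path)
cli.sendall(json.dumps({"selector": "ai:flow/noop", "args": []}).encode())
resp = json.loads(cli.recv(65536).decode())
cli.close()
print(resp["status"], resp["selector"])  # -> ok ai:flow/noop
```

The point of this shape is that the client does no build-system work at all; everything expensive stays warm inside the daemon.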
## Setup

From `~/code/flow`:

```sh
f install-ai-fast-client
f tasks daemon start
```
What this gives you:

- `~/.local/bin/fai` installed (low-overhead client)
- `ai-taskd` running and warm (`~/.flow/run/ai-taskd.sock`)

Verify:

```sh
which fai
fai --help
f tasks daemon status
fai ai:flow/noop
```
For an always-on daemon across login sessions (recommended for stable latency):

```sh
f ai-taskd-launchd-install
f ai-taskd-launchd-status
```
## fai Path

- `fai` sends a compact request to `~/.flow/run/ai-taskd.sock`
- `ai-taskd` resolves the task selector (fast exact path first)
- `ai-taskd` reuses the cached binary artifact when available

Daemon tuning knobs:

- `FLOW_AI_TASK_PROJECT_ROOT`
- `FLOW_AI_TASKD_DISCOVERY_TTL_MS` (default 750)
- `FLOW_AI_TASKD_ARTIFACT_TTL_MS` (default 1500)

Moon version knobs:

- `FLOW_AI_TASK_MOON_VERSION` (explicit override)
- `FLOW_AI_TASK_MOON_VERSION_TTL_SECS` (default 43200)

Wire protocol knobs:

- `fai --protocol msgpack` (default)
- `fai --protocol json` (compat / debugging)

## f with Fast Client Preference

`f` can optionally route AI task dispatch through the fast client when daemon mode is enabled.
Required:

```sh
export FLOW_AI_TASK_DAEMON=1
export FLOW_AI_TASK_FAST_CLIENT=1
```

Optional selector control:

```sh
export FLOW_AI_TASK_FAST_SELECTORS='ai:flow/noop,ai:flow/bench-cli,ai:project/*'
```

Optional client binary override:

```sh
export FLOW_AI_TASK_FAST_CLIENT_BIN="$HOME/.local/bin/fai"
```
Without `FLOW_AI_TASK_FAST_CLIENT=1`, `f` keeps its normal daemon behavior.
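The routing decision described above can be sketched as a small predicate. The gating environment variables come from this guide; the glob semantics of `FLOW_AI_TASK_FAST_SELECTORS` (shell-style patterns, unset meaning "no selectors") are assumptions for illustration and may differ in `f` itself.

```python
# Sketch of f's fast-client routing decision. Glob matching via fnmatch
# is an assumption; the real pattern semantics may differ.
from fnmatch import fnmatchcase

def use_fast_client(selector: str, env: dict) -> bool:
    """Route through fai only when daemon mode and the fast client are
    both enabled and the selector matches a configured pattern."""
    if env.get("FLOW_AI_TASK_DAEMON") != "1":
        return False
    if env.get("FLOW_AI_TASK_FAST_CLIENT") != "1":
        return False
    patterns = env.get("FLOW_AI_TASK_FAST_SELECTORS", "").split(",")
    return any(fnmatchcase(selector, p.strip()) for p in patterns if p.strip())

env = {
    "FLOW_AI_TASK_DAEMON": "1",
    "FLOW_AI_TASK_FAST_CLIENT": "1",
    "FLOW_AI_TASK_FAST_SELECTORS": "ai:flow/*,ai:project/*",
}
print(use_fast_client("ai:flow/noop", env))   # True: matches ai:flow/*
print(use_fast_client("ai:other/task", env))  # False: no pattern matches
```

Note the conservative fallback: if either flag is missing, everything goes through the normal daemon path.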
## fai CLI Usage

```sh
fai [--root PATH] [--socket PATH] [--protocol json|msgpack] [--no-cache] [--capture-output] [--timings] <selector> [-- <args...>]
fai [--root PATH] [--socket PATH] [--protocol json|msgpack] [--no-cache] [--capture-output] [--timings] --batch-stdin
```
Examples:

```sh
fai ai:flow/noop
fai --root ~/code/flow ai:flow/bench-cli -- --iterations 50
fai --no-cache ai:flow/dev-check
fai --timings ai:flow/noop
printf 'ai:flow/noop\nai:flow/noop\n' | fai --batch-stdin
```
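The `--batch-stdin` mode exists because spawning one client process per request re-pays process startup every time. A sketch of the pooled-client loop shape follows; `dispatch` is a stub standing in for the real socket round-trip, not the `fai` implementation.

```python
# Sketch of the --batch-stdin pooling pattern: one client process reads
# many selectors from stdin and dispatches each one, amortizing startup.
# `dispatch` is a stub for the real ai-taskd round-trip.
import io

def dispatch(selector: str) -> str:
    # Stand-in for a request/response round-trip to ai-taskd.
    return f"ok {selector}"

def run_batch(stream) -> list:
    results = []
    for line in stream:
        selector = line.strip()
        if selector:  # skip blank lines
            results.append(dispatch(selector))
    return results

# Equivalent of: printf 'ai:flow/noop\nai:flow/noop\n' | fai --batch-stdin
fake_stdin = io.StringIO("ai:flow/noop\nai:flow/noop\n")
for r in run_batch(fake_stdin):
    print(r)
```

One process, one connection, N requests: the per-request cost collapses to the socket round-trip.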
Notes:

- `--capture-output` if you need command output returned through the client response
- `--timings` to print server-side phase timings (`resolve_us`, `run_us`, `total_us`)
- `--batch-stdin` for pooled client bursts (single client process, multiple requests)

## Benchmarking

Run the baseline runtime benchmark:
```sh
f bench-ai-runtime --iterations 80 --warmup 10 --json-out /tmp/flow_ai_runtime.json
```
This includes:

- `moon_run_noop`
- `cached_noop`
- `daemon_cached_noop`
- `cached_binary_direct`
- `daemon_client_noop` (if the `ai-taskd-client` binary is present)

For focused hot-loop comparisons:
```sh
python3 - <<'PY'
import subprocess, time, statistics
from pathlib import Path

root = Path('~/code/flow').expanduser()
cases = [
    ('f_daemon', ['./target/debug/f', 'tasks', 'run-ai', '--daemon', 'ai:flow/noop']),
    ('fai', ['fai', 'ai:flow/noop']),
    ('f_cached', ['./target/debug/f', 'ai:flow/noop']),
]
for name, cmd in cases:
    xs = []
    for i in range(60):
        t0 = time.perf_counter()
        p = subprocess.run(cmd, cwd=root, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        dt = (time.perf_counter() - t0) * 1000
        if p.returncode != 0:
            raise SystemExit((name, p.returncode))
        if i >= 10:  # discard the first 10 iterations as warmup
            xs.append(dt)
    xs = sorted(xs)
    pct = lambda q: xs[int((len(xs) - 1) * q)]
    print(name, 'p50', round(pct(0.5), 2), 'p95', round(pct(0.95), 2), 'mean', round(statistics.mean(xs), 2))
PY
```
## Recommended Low-Latency Profile

Use this default profile for the lowest latency:

```sh
export FLOW_AI_TASK_DAEMON=1
export FLOW_AI_TASK_FAST_CLIENT=1
export FLOW_AI_TASK_FAST_SELECTORS='ai:flow/*'
f tasks daemon start
```

Then:
- `fai ai:...`
- `f ai:...` (auto fast-client when selectors match)

## Troubleshooting

### fai says "cannot connect to socket"

Start the daemon:
```sh
f tasks daemon start
```

Check:

```sh
f tasks daemon status
ls -l ~/.flow/run/ai-taskd.sock
```

Or install the persistent daemon:

```sh
f ai-taskd-launchd-install
```
### fai cannot resolve a selector

Use the full selector:

```sh
f tasks list | rg '^ai:'
fai ai:flow/dev-check
```
### Latency is unstable

Check system load and daemon health:

```sh
ps -Ao pcpu,pmem,comm | sort -k1 -nr | head -n 20
f tasks daemon status
```

Then rerun the benchmark with warmup.
### Output differs between f and fai

Use `--capture-output` on `fai` for output-capture parity.

### Inspecting phase timings

```sh
fai --timings ai:flow/noop
FLOW_AI_TASKD_TIMINGS_LOG=1 f tasks daemon serve
```
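To aggregate timings across many runs, a small parser can help. The phase names (`resolve_us`, `run_us`, `total_us`) come from this guide, but the `key=value` line format below is purely an assumption; adjust the regex to whatever the daemon actually logs.

```python
# Sketch: collect phase timings out of timing log lines. The key=value
# line format is an assumption; only the phase names are from the guide.
import re
from statistics import mean

TIMING_RE = re.compile(r"(resolve_us|run_us|total_us)=(\d+)")

def parse_timings(lines):
    """Collect each phase's samples across many timing lines."""
    samples = {"resolve_us": [], "run_us": [], "total_us": []}
    for line in lines:
        for key, val in TIMING_RE.findall(line):
            samples[key].append(int(val))
    return samples

log = [
    "timings resolve_us=120 run_us=840 total_us=1010",
    "timings resolve_us=90 run_us=800 total_us=930",
]
s = parse_timings(log)
print({k: mean(v) for k, v in s.items() if v})
```

From here it is easy to spot whether regressions live in selector resolution or in task execution.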
## Status

Implemented in this iteration:

- launchd integration (`f ai-taskd-launchd-*`)
- wire protocol support (`msgpack`) in `fai` + `ai-taskd`
- `fai --batch-stdin`
- `fai --timings` and daemon timing logs

Potential next frontier: