pkg/terminals/mcp/SKILL.md
Miller (mlr) is a command-line data processor for CSV, TSV, JSON, JSON
Lines, and other tabular/record formats, with SQL-like verbs (cut, sort,
join, stats1, ...) and an awk-like DSL (put, filter).
Work this loop. Each step exists to prevent a specific, common failure.
Everything valid is in the catalog; anything not in the catalog does not exist. Hallucinated flag/function names are the top failure mode.
which with e.g. "join two files on a key" → ranked
candidates. confident: true means a name matched; trust the top hit.list_capabilities with index: true → every
verb/function/flag/keyword with one-line summaries.list_capabilities with kind: "verb", names: ["join"] → the
full entry. Prefer the structured options list (flag, arg, type, enum
values) when present; usage_text is the prose fallback.(mlr_version, catalog_schema_version) — re-fetch only when either changes.Call describe_data on the input first. It returns, per field: name, types
seen with counts, occurrence count, null count, cardinality, min/max, and —
for low-cardinality fields — every distinct value.
describe_data; never guess casing or
spelling.-g (group-by) and DSL comparisons, use values from the
values array, not values you expect to exist.count is less than other fields' are absent in some records:
guard DSL with is_present($field).Before any run that includes put or filter, call validate_dsl with the
expression. Cost: parse-only, no data read. On valid: false, the error
document has kind, hint, and did_you_mean — apply the hint, don't
re-guess syntax.
Call run with argv as a list, one element per shell word (no shell quoting):
{"args": ["--icsv", "--ojson", "cat", "data.csv"]}
Command-line shape rules that prevent most argv errors:
mlr --icsv sort -f name f.csv.--icsv --ojson (separate in/out), --csv/--c2j etc. (combined).then: ["--icsv", "sort", "-f", "k", "then", "head", "-n", "3", "f.csv"].filter might collide with a verb flag,
end verb flags with -- before filenames.stdin_text; files go at the end of args.On failure, exit_code is nonzero and error (when present) carries kind,
hint, and did_you_mean — hint is often a corrected command line; prefer
executing it over reasoning from the message. stdout_truncated: true means
the output exceeded the server's cap: narrow the query (e.g. head, cut)
rather than re-running the same command.
run cannot execute external commands (DSL system/exec, piped
redirects, --prepipe) unless the server was started with --allow-shell;
such calls fail cleanly. It can write files via tee, split, and DSL
output redirects — treat it as a write-capable tool.describe_data + targeted verbs over dumping whole
files through run.