docs/src/content/docs/reference/cli/index.md
| Subcommand | Purpose |
|---|---|
| `mistralrs serve` | Start HTTP/MCP server and (optionally) the UI at `/ui` |
| `mistralrs run` | Run model in interactive mode, or one-shot mode with `-i` |
| `mistralrs completions` | Generate shell completions |
| `mistralrs quantize` | Generate UQFF quantized model file |
| `mistralrs doctor` | Run system diagnostics and environment checks |
| `mistralrs tune` | Recommend quantization + device mapping for a model. Rejects `--quant auto`; pass `--quant <level>` or `--isq <level>` to bias the recommendation toward a specific quantization target |
| `mistralrs login` | Authenticate with HuggingFace Hub |
| `mistralrs cache` | Manage the HuggingFace model cache |
| `mistralrs bench` | Run performance benchmarks for plain model generation |
| `mistralrs from-config` | Run from a full TOML configuration file |
| `mistralrs update` | Update a prebuilt install to the latest release |
| `mistralrs uninstall` | Remove a prebuilt install |

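
As a rough illustration of how a few of these subcommands might be combined in a first session, the sketch below runs diagnostics, asks for a quantization recommendation, and sends a one-shot prompt. Several details are assumptions that this page does not state: the `-m` model flag, the model ID, the quantization level `4`, and the idea that `-i` takes the prompt text as its argument. Check `mistralrs <subcommand> --help` for the actual arguments before relying on any of them.

```shell
# Placeholder values; both are assumptions, not defaults of the CLI.
MODEL="Qwen/Qwen3-4B"
QUANT_LEVEL="4"

# Only invoke the CLI when it is installed, so the script is safe to run anywhere.
if command -v mistralrs >/dev/null 2>&1; then
  # Check the environment first.
  mistralrs doctor

  # Ask for a recommendation biased toward one quantization level
  # (tune rejects `--quant auto`). `-m` is an assumed model flag.
  mistralrs tune -m "$MODEL" --isq "$QUANT_LEVEL"

  # One-shot generation with -i; assumed here to take the prompt text.
  mistralrs run -m "$MODEL" -i "Summarize what UQFF is in one sentence."
else
  echo "mistralrs not found on PATH" >&2
fi
```

Running `doctor` before `tune` is only a suggested order; nothing in the table requires it.
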
| Option | Default | Description |
|---|---|---|
| `--seed <SEED>` | | Random seed for reproducibility |
| `-l, --log <LOG>` | | Log all requests and responses to this file |
| `--token-source <TOKEN_SOURCE>` | `cache` | Token source for HuggingFace authentication. Formats: `literal:<token>`, `env:<var>`, `path:<file>`, `cache`, `none` |
| `-v, --verbose` | `0` | Increase logging verbosity. Use `-v` for debug and `-vv` for trace-level internals |
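
The five `--token-source` formats listed above can be told apart by their prefix. The sketch below is a hypothetical shell helper, `token_source_kind`, which is not part of `mistralrs`; it reports which documented format a value uses, so a wrapper script can catch typos before invoking the CLI. Treating an empty payload such as `env:` as invalid is an assumption of this sketch, and the CLI itself may behave differently.

```shell
# Hypothetical helper (not part of mistralrs): print which documented
# --token-source format a value uses, or fail if it matches none.
token_source_kind() {
  case "$1" in
    literal:?*) echo "literal" ;;   # literal:<token>
    env:?*)     echo "env" ;;       # env:<var>
    path:?*)    echo "path" ;;      # path:<file>
    cache|none) echo "$1" ;;        # bare keywords
    *)          echo "invalid" >&2; return 1 ;;
  esac
}

token_source_kind "env:HF_TOKEN"          # prints: env
token_source_kind "path:$HOME/.hf_token"  # prints: path
token_source_kind "cache"                 # prints: cache
```

A wrapper could call the helper first and pass the value on to `--token-source` only when it succeeds. The variable name `HF_TOKEN` and the file path are placeholders.
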