docs/core/gemma-setup.md
gemini gemma — Automated Local Model Routing SetupLocal model routing uses a local Gemma 3 1B model running on your machine to classify and route user requests. It routes simple requests (like file reads) to Gemini Flash and complex requests (like architecture discussions) to Gemini Pro.
<!-- prettier-ignore -->[!NOTE] This is an experimental feature currently under active development.
This feature saves cloud API costs by using local inference for task classification instead of a cloud-based classifier. It adds a few milliseconds of local latency but can significantly reduce the overall token usage for hosted models.
# One command does everything: downloads runtime, pulls model, configures settings, starts server
gemini gemma setup
You'll be prompted to accept the Gemma Terms of Use. The model is ~1 GB.
After setup, just use the CLI normally — routing happens automatically on every request.
| Command | What it does |
|---|---|
gemini gemma setup | Full install (binary + model + settings + server start) |
gemini gemma status | Health check — shows what's installed and running |
gemini gemma start | Start the LiteRT server (auto-starts on CLI launch by default) |
gemini gemma stop | Stop the LiteRT server |
gemini gemma logs | Tail the server logs to see routing requests live |
/gemma | In-session status check (type it inside the CLI) |
gemini gemma status — all checks should show greengemini gemma logs (watch for incoming requests)/gemma slash command inside a session shows a quick status panelgemini gemma setup --port 8080 # custom port
gemini gemma setup --no-start # don't start server after install
gemini gemma setup --force # re-download everything
gemini gemma setup --skip-model # binary only, skip the 1GB model download
Set enabled: false in settings or just run gemini gemma stop to turn off the
server:
{ "experimental": { "gemmaModelRouter": { "enabled": false } } }
If you are in an environment where the gemini gemma setup command cannot
automatically download binaries (for example, behind a strict corporate
firewall), you can perform the setup manually.
For more information, see the Manual Local Model Routing Setup guide.