Lemonade Server provides optimized local LLM inference with support for GPU and NPU hardware acceleration. It offers an OpenAI-compatible API that seamlessly integrates with Continue and other open-source platforms.
Download and install Lemonade Server from [lemonade-server.ai](https://lemonade-server.ai).
Lemonade Server is available directly in the Continue UI as a provider. You can select it from the model provider dropdown without manual configuration.
If you need custom settings, you can manually configure Lemonade:
<Tabs>
<Tab title="YAML">
```yaml title="config.yaml"
name: My Config
version: 0.0.1
schema: v1
models:
  - name: Lemonade
    provider: lemonade
    model: <MODEL_NAME>
    apiBase: http://localhost:8000/api/v1/
```
</Tab>
<Tab title="JSON (Deprecated)">
```json title="config.json"
{
  "models": [
    {
      "title": "Lemonade",
      "provider": "lemonade",
      "model": "<MODEL_NAME>",
      "apiBase": "http://localhost:8000/api/v1/"
    }
  ]
}
```
</Tab>
</Tabs>

Lemonade Server serves its API at `http://localhost:8000/api/v1/` by default. It automatically detects and optimizes for available hardware:
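Because the API is OpenAI-compatible, a chat request to Lemonade looks like a standard OpenAI chat-completions call against the local `apiBase`. A minimal sketch of how such a request is assembled (the `chat/completions` path follows the OpenAI API convention; `<MODEL_NAME>` is a placeholder, and actually sending the request assumes a running server):

```python
import json
from urllib.parse import urljoin

# Lemonade's default apiBase, as configured above.
API_BASE = "http://localhost:8000/api/v1/"

# OpenAI-style chat-completions endpoint relative to the apiBase.
url = urljoin(API_BASE, "chat/completions")

# A standard OpenAI chat payload; <MODEL_NAME> is a placeholder.
payload = {
    "model": "<MODEL_NAME>",
    "messages": [{"role": "user", "content": "Hello"}],
}
body = json.dumps(payload)

print(url)   # http://localhost:8000/api/v1/chat/completions
print(body)
```

You could POST this body with any HTTP client (or point an OpenAI SDK at the same `base_url`) to verify the server is responding before wiring it into Continue.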