docs/customize/model-providers/top-level/huggingfaceinference.mdx
Hugging Face is the main platform for sharing open AI models. It provides inference in two ways: Inference Providers and Inference Endpoints.

Inference Providers is a serverless service powered by external inference providers, routed through Hugging Face, and billed per token.
<Info>You can get your access token from Hugging Face and prioritize your providers in settings.
</Info>

<Tabs>
<Tab title="YAML">

```yaml title="config.yaml"
name: My Config
version: 0.0.1
schema: v1
models:
  - name: deepseek
    provider: huggingface-inference-providers
    model: deepseek-ai/DeepSeek-V3.2-Exp
    apiKey: <YOUR_HF_TOKEN>
    apiBase: https://router.huggingface.co/v1
```

</Tab>
<Tab title="JSON (Deprecated)">
```json title="config.json"
{
"models": [
{
"title": "deepseek",
"provider": "huggingface-inference-providers",
"model": "deepseek-ai/DeepSeek-V3.2-Exp",
"apiKey": "<YOUR_HF_TOKEN>",
"apiBase": "https://router.huggingface.co/v1"
}
]
}
```

</Tab>
</Tabs>
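Under the hood, the Inference Providers router exposes an OpenAI-compatible API, so the config above boils down to requests like the following. This is a hedged sketch with placeholder values, not an exact trace of what Continue sends:

```python
import json

# Sketch (placeholder values): the config above maps onto an
# OpenAI-compatible chat-completions request against the router.
API_BASE = "https://router.huggingface.co/v1"  # same apiBase as in the config
HF_TOKEN = "<YOUR_HF_TOKEN>"  # placeholder, not a real token

url = f"{API_BASE}/chat/completions"
headers = {
    "Authorization": f"Bearer {HF_TOKEN}",
    "Content-Type": "application/json",
}
payload = {
    "model": "deepseek-ai/DeepSeek-V3.2-Exp",
    "messages": [{"role": "user", "content": "Say hello."}],
}

# The body that would be POSTed to the router:
print(url)
print(json.dumps(payload, indent=2))
```

Per-token billing means each such request is charged by the routed provider through your Hugging Face account.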
Inference Endpoints is a dedicated service that lets you run open models on dedicated hardware. It is a more advanced way to get inference from Hugging Face models, giving you more control over the whole process.
<Info>Before you can use Inference Endpoints, you need to create an endpoint. You can do this by going to Inference Endpoints and clicking on "Create Endpoint".
</Info>

<Tabs>
<Tab title="YAML">

```yaml title="config.yaml"
name: My Config
version: 0.0.1
schema: v1
models:
  - name: deepseek
    provider: huggingface-inference-endpoints
    model: <ENDPOINT_ID>
    apiKey: <YOUR_HF_TOKEN>
    apiBase: https://<YOUR_ENDPOINT_ID>.aws.endpoints.huggingface.cloud
```

</Tab>
<Tab title="JSON (Deprecated)">
```json title="config.json"
{
"models": [
{
"title": "deepseek",
"provider": "huggingface-inference-endpoints",
"model": "<ENDPOINT_ID>",
"apiKey": "<YOUR_HF_TOKEN>",
"apiBase": "https://<YOUR_ENDPOINT_ID>.aws.endpoints.huggingface.cloud"
}
]
}
```

</Tab>
</Tabs>
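The main difference from Inference Providers is the `apiBase`: a dedicated endpoint gets its own hostname derived from the endpoint ID, instead of the shared router. A small sketch with hypothetical values (the region segment `aws` and the ID format depend on how you created the endpoint):

```python
# Sketch with placeholder values: the apiBase for a dedicated endpoint
# is built from the endpoint's own hostname, not a shared router URL.
endpoint_id = "my-endpoint-id"  # hypothetical; yours comes from the HF console
api_base = f"https://{endpoint_id}.aws.endpoints.huggingface.cloud"
headers = {"Authorization": "Bearer <YOUR_HF_TOKEN>"}  # same HF token as above

print(api_base)
```

Because the hardware is dedicated, you pay for endpoint uptime rather than per token.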