# Hugging Face Inference Providers
Hugging Face Inference Providers offers access to frontier open models from multiple providers through a unified API.
You'll need a Hugging Face account and an access token.
Hugging Face Inference Providers exposes an OpenAI-compatible chat API interface. Here we use the MiniMaxAI/MiniMax-M2 model as an example.
```toml
[model.chat.http]
kind = "openai/chat"
model_name = "MiniMaxAI/MiniMax-M2" # specify the model you want to use
api_endpoint = "https://router.huggingface.co/v1"
api_key = "your-hf-token"
```
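To sanity-check the endpoint and token outside Tabby, you can send the same endpoint a plain OpenAI-style chat request. The sketch below uses only the Python standard library and a placeholder token; it only builds the request, and the commented-out lines show how you would actually send it.

```python
import json
import urllib.request

# Values mirror the Tabby config above; the token is a placeholder.
API_ENDPOINT = "https://router.huggingface.co/v1"
MODEL = "MiniMaxAI/MiniMax-M2"
HF_TOKEN = "your-hf-token"  # replace with your Hugging Face access token

# Standard OpenAI-compatible chat completion payload.
payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "Write hello world in Rust."}],
}

req = urllib.request.Request(
    f"{API_ENDPOINT}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {HF_TOKEN}",
        "Content-Type": "application/json",
    },
)

# With a valid token, send the request and print the model's reply:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

A `401` response here usually means the token is wrong or missing the required scope, which is worth ruling out before debugging the Tabby configuration itself.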
You can find a complete list of models supported by at least one provider on the Hub. You can also access these programmatically; see this guide for more details.
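Since the router is OpenAI-compatible, one way to list models programmatically is the standard `GET /v1/models` route. This is a stdlib-only sketch with a placeholder token (the request is built but only sent in the commented-out lines), assuming the router exposes the usual OpenAI model-listing shape with a `data` array of objects carrying an `id`:

```python
import json
import urllib.request

API_ENDPOINT = "https://router.huggingface.co/v1"  # same endpoint as the Tabby config
HF_TOKEN = "your-hf-token"  # placeholder; replace with your access token

# OpenAI-compatible APIs list available models at GET /v1/models.
req = urllib.request.Request(
    f"{API_ENDPOINT}/models",
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
)

# With a valid token, fetch the list and print the model ids:
# with urllib.request.urlopen(req) as resp:
#     print([m["id"] for m in json.load(resp)["data"]])
```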
Hugging Face Inference Providers does not offer completion (FIM) models through its OpenAI-compatible API. For code completion, use a local model with Tabby.
While Hugging Face Inference Providers supports embedding models, Tabby does not currently support its embeddings API interface.