docs/source/examples/overview.rst
This section provides tutorials for a curated list of example projects that show how BentoML can be used in different scenarios. The lists below collect the BentoML example projects by category. Browse the categories to find the example that best suits your needs.
Deploy an OpenAI-compatible LLM API service with BentoML and vLLM:

- `DeepSeek R1 Distill of Llama 3.3 70B <https://github.com/bentoml/BentoVLLM/tree/main/deepseek-r1-llama3.3-70b>`_
- `Llama 4 Scout <https://github.com/bentoml/BentoVLLM/tree/main/llama4-17b-scout-instruct>`_
- `Mistral Small 24B <https://github.com/bentoml/BentoVLLM/tree/main/mistral-small-3.1-24b-instruct-2503>`_
- Visit the `BentoVLLM project <https://github.com/bentoml/BentoVLLM/#featured-models>`_ to see more supported models.

Customize your LLM inference runtime:

- `vLLM <https://github.com/bentoml/BentoVLLM>`_
- `TensorRT-LLM <https://github.com/bentoml/BentoTRTLLM>`_
- `LMDeploy <https://github.com/bentoml/BentoLMDeploy>`_
- `MLC-LLM <https://github.com/bentoml/BentoMLCLLM>`_
- `SGLang <https://github.com/bentoml/BentoSGLang>`_
- `Hugging Face TGI <https://github.com/bentoml/BentoTGI>`_
- `Triton Inference Server <https://github.com/bentoml/BentoTriton>`_

Build and scale compound AI systems with BentoML:

- `Agent: Function calling <https://github.com/bentoml/BentoFunctionCalling>`_
- `Agent: LangGraph <https://github.com/bentoml/BentoLangGraph>`_
- `Multi-agent: CrewAI <https://github.com/bentoml/BentoCrewAI>`_
- `LLM safety: ShieldGemma <https://github.com/bentoml/BentoShield/>`_
- `RAG: LlamaIndex <https://github.com/bentoml/rag-tutorials>`_
- `Voice assistants with Pipecat <https://github.com/bentoml/BentoVoiceAgent>`_
- `Phone agent with Twilio <https://github.com/bentoml/BentoTwilioConversationRelay>`_
- `Multi-LLM routing <https://github.com/bentoml/llm-router>`_

Serve text-to-image and image-to-image models with BentoML:

- `ComfyUI workflows as APIs <https://github.com/bentoml/comfy-pack>`_
- `Stable Diffusion 3.5 Large Turbo <https://github.com/bentoml/BentoDiffusion/tree/main/sd3.5-large-turbo>`_
- `Stable Diffusion 3 Medium <https://github.com/bentoml/BentoDiffusion/tree/main/sd3-medium>`_
- `Stable Diffusion XL Turbo <https://github.com/bentoml/BentoDiffusion/tree/main/sdxl-turbo>`_
- `ControlNet <https://github.com/bentoml/BentoDiffusion/tree/main/controlnet>`_
- Visit the `BentoDiffusion project <https://github.com/bentoml/BentoDiffusion>`_ to see more examples.

Serve text-to-speech and speech-to-text models with BentoML:

- `ChatTTS <https://github.com/bentoml/BentoChatTTS>`_
- `XTTS <https://github.com/bentoml/BentoXTTS>`_
- `XTTS with a streaming endpoint <https://github.com/bentoml/BentoXTTSStreaming>`_
- `WhisperX <https://github.com/bentoml/BentoWhisperX>`_
- `Bark <https://github.com/bentoml/BentoBark>`_
- `Moshi <https://github.com/bentoml/BentoMoshi>`_

Serve computer vision models with BentoML:

- `YOLO: Object detection <https://github.com/bentoml/BentoYolo>`_
- `ResNet: Image classification <https://github.com/bentoml/BentoResnet>`_
- `EasyOCR: Optical character recognition <https://github.com/bentoml/BentoOCR>`_

Build embedding inference APIs with BentoML:

- `SentenceTransformers <https://github.com/bentoml/BentoSentenceTransformers>`_
- `CLIP <https://github.com/bentoml/BentoClip>`_
- `ColPali <https://github.com/bentoml/BentoColPali>`_

Serve custom models with BentoML:

- `MLflow <https://github.com/bentoml/BentoMLflow>`_
- `XGBoost <https://github.com/bentoml/BentoXGBoost>`_
- `BLIP inference API for image captioning and VQA (Visual Question Answering) <https://github.com/bentoml/BentoBlip>`_
- `Time-series forecasting with Moirai <https://github.com/bentoml/BentoMoirai/>`_
- `Time-series forecasting with Facebook Prophet <https://github.com/bentoml/BentoProphet>`_