# AnythingLLM
AnythingLLM is a full-stack application that enables you to turn any document, resource, or piece of content into context that any LLM can use as references during chatting.
It allows you to deploy a large language model (LLM) server with vLLM as the backend, which exposes OpenAI-compatible endpoints.
Set up the vLLM environment:

```console
pip install vllm
```
Start the vLLM server with a supported chat-completion model, for example:

```console
vllm serve Qwen/Qwen1.5-32B-Chat-AWQ --max-model-len 4096
```
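To confirm the server is exposing the OpenAI-compatible API before wiring up AnythingLLM, you can send it a chat-completion request directly. This is a minimal stdlib-only sketch; it assumes the server from the command above is reachable at `localhost:8000` (vLLM's default port), and the actual network call is left commented out so you can run it once the server is up:

```python
import json
import urllib.request

# Assumption: the vLLM server started above listens on localhost:8000.
BASE_URL = "http://localhost:8000/v1"

# An OpenAI-style chat-completion payload for the model served above.
payload = {
    "model": "Qwen/Qwen1.5-32B-Chat-AWQ",
    "messages": [{"role": "user", "content": "Hello, who are you?"}],
    "max_tokens": 64,
}

request = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the server is running:
# with urllib.request.urlopen(request) as response:
#     reply = json.load(response)
#     print(reply["choices"][0]["message"]["content"])
```

The same request shape is what AnythingLLM sends on your behalf once the provider is configured.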
Download and install AnythingLLM Desktop.
Configure the AI provider in AnythingLLM's settings:

- Base URL: `http://{vllm server host}:{vllm server port}/v1`
- Chat Model Name: `Qwen/Qwen1.5-32B-Chat-AWQ`

Create a workspace (e.g. `vllm`) and start chatting.

Add a document to the workspace.
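A common misconfiguration is a Base URL that points at the server root instead of the OpenAI-compatible API root. The sketch below, using assumed example values of `localhost` and port `8000`, shows the shape the Base URL should take: it must end in `/v1`, since the client appends route suffixes such as `/chat/completions` to it:

```python
from urllib.parse import urlparse

# Assumed example values; substitute your actual vLLM host and port.
host, port = "localhost", 8000
base_url = f"http://{host}:{port}/v1"

parsed = urlparse(base_url)
# The path must be exactly /v1 -- route suffixes like /chat/completions
# are appended to this base by the client, not included in it.
assert parsed.path == "/v1"
print(base_url)
```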
Chat using your document as context.