docs/architecture/WORKING.md
This is a high-level overview of how Vane answers a question.
For a component-level overview, see README.md.
For implementation details, see CONTRIBUTING.md.
When you send a message in the UI, the app calls POST /api/chat.
At a high level, we do three things: classify the request, gather context (widgets plus research), and generate the final answer.
Before searching or answering, we run a classification step.
This step decides things like:

- whether the message needs research, or can be answered directly
- whether a widget should run alongside the answer
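The classification output can be sketched as a small structure. The field names below, and the keyword matching, are illustrative assumptions; in Vane this step is a model call, not string matching.

```typescript
// Hypothetical shape of the classification result; the real field
// names in Vane's codebase may differ.
type Classification = {
  widget: "weather" | "stocks" | "calculator" | null; // widget to run, if any
  needsResearch: boolean; // whether web / file lookups are needed
};

// Toy keyword-based classifier, purely illustrative of the decisions
// this step makes.
function classifyMessage(message: string): Classification {
  const m = message.toLowerCase();
  let widget: Classification["widget"] = null;
  if (m.includes("weather")) widget = "weather";
  else if (m.includes("stock")) widget = "stocks";
  else if (/\d\s*[+\-*\/]\s*\d/.test(m)) widget = "calculator";
  // Greetings and small talk usually need no research.
  const needsResearch = !/^(hi|hello|thanks|thank you)\b/.test(m);
  return { widget, needsResearch };
}
```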
Widgets are small, structured helpers that can run alongside research.
Examples include weather, stocks, and simple calculations.
If a widget is relevant, we show it in the UI while the answer is still being generated.
Widgets are helpful context for the answer, but they are not part of what the model should cite.
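One way to honor that split is to keep widget output in the prompt as uncited context while only retrieved sources get citation numbers. A minimal sketch, assuming a helper like the following exists (the names and prompt wording are not Vane's actual code):

```typescript
type Source = { title: string; url: string };

// Hypothetical prompt-assembly helper: widgets feed the model as plain
// context, while only search sources are numbered for citation.
function buildContext(sources: Source[], widgetSummary: string | null): string {
  const cited = sources
    .map((s, i) => `[${i + 1}] ${s.title} (${s.url})`)
    .join("\n");
  const widget = widgetSummary ? `Context (do not cite): ${widgetSummary}\n` : "";
  return widget + cited;
}
```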
If research is needed, we gather information in the background while widgets run.
Depending on configuration, research may include web lookup and searching user-uploaded files.
Once we have enough context, the chat model generates the final response.
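The concurrency described above can be sketched with Promise.all. The stub functions here are placeholders for Vane's real gathering steps, not its actual APIs:

```typescript
// Stubs standing in for the real gathering steps (assumptions).
async function webSearch(q: string): Promise<string[]> {
  return [`web result for: ${q}`];
}
async function fileSearch(q: string): Promise<string[]> {
  return [`file chunk matching: ${q}`];
}
async function runWidget(q: string): Promise<string> {
  return `widget output for: ${q}`;
}

// Widgets and research run concurrently; answer generation waits for both.
async function gatherContext(
  query: string,
): Promise<{ widget: string; evidence: string[] }> {
  const [widget, web, files] = await Promise.all([
    runWidget(query),
    webSearch(query),
    fileSearch(query),
  ]);
  return { widget, evidence: [...web, ...files] };
}
```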
You can control the tradeoff between speed and quality using optimizationMode:
- speed
- balanced
- quality

We prompt the model to cite the references it used. The UI then renders those citations alongside the supporting links.
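Assuming `optimizationMode` is passed in the POST /api/chat request body (the exact payload shape is an assumption based on this doc), a request might be built like this:

```typescript
type OptimizationMode = "speed" | "balanced" | "quality";

// Hypothetical request-body builder for POST /api/chat; the real
// payload in Vane may include more fields.
function chatRequestBody(
  message: string,
  optimizationMode: OptimizationMode = "balanced",
): string {
  return JSON.stringify({ message, optimizationMode });
}
```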
If you are integrating Vane into another product, you can call POST /api/search.
It returns:
- `message`: the generated answer
- `sources`: supporting references used for the answer

You can also enable streaming by setting `stream: true`.
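Putting that together, a call to POST /api/search might be constructed like this. The base URL and the `query` field name are assumptions; only `stream` and the response fields come from this doc:

```typescript
type SearchResponse = {
  message: string; // the generated answer
  sources: { title: string; url: string }[]; // supporting references
};

// Builds the arguments for fetch(url, init); field names beyond
// `stream` are illustrative.
function buildSearchRequest(baseUrl: string, query: string, stream = false) {
  return {
    url: `${baseUrl}/api/search`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ query, stream }),
    },
  };
}
```

With `stream: false` you can await a single `SearchResponse` JSON object; with `stream: true` you would read the response body incrementally instead.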
Image and video search use separate endpoints (POST /api/images and POST /api/videos). We generate a focused query using the chat model, then fetch matching results from a search backend.
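The two media endpoints follow the same pattern, so the call site can be sketched generically. Here `rewriteQuery` stands in for the chat-model call that produces the focused query; the payload field name is an assumption:

```typescript
type MediaKind = "images" | "videos";

// Sketch of a media-search request: the chat model first rewrites the
// conversation into a focused query, then we POST it to the matching
// endpoint. rewriteQuery is a placeholder for that model call.
function mediaRequest(
  kind: MediaKind,
  rewriteQuery: (history: string[]) => string,
  history: string[],
) {
  return {
    url: `/api/${kind}`,
    body: JSON.stringify({ query: rewriteQuery(history) }),
  };
}
```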