docs/development/basic/chat-api.mdx
This document explains the implementation logic of the LobeHub Chat API in client-server interactions, including the event sequences and core components involved.
```mermaid
sequenceDiagram
    participant Client as Frontend Client
    participant AgentLoop as Agent Runtime Loop
    participant ChatService as ChatService
    participant ChatAPI as Backend Chat API
    participant ModelRuntime as Model Runtime
    participant ModelProvider as Model Provider API
    participant ToolExecution as Tool Execution Layer

    Client->>AgentLoop: sendMessage()
    Note over AgentLoop: Create GeneralChatAgent + AgentRuntime

    loop Agent Plan-Execute Loop
        AgentLoop->>AgentLoop: Agent decides next instruction

        alt call_llm instruction
            AgentLoop->>ChatService: getChatCompletion
            Note over ChatService: Context engineering
            ChatService->>ChatAPI: POST /webapi/chat/[provider]
            ChatAPI->>ModelRuntime: Initialize ModelRuntime
            ModelRuntime->>ModelProvider: Chat completion request
            ModelProvider-->>ChatService: Stream back SSE response
            ChatService-->>Client: onMessageHandle callback
        else call_tool instruction
            AgentLoop->>ToolExecution: Execute tool
            Note over ToolExecution: Builtin / MCP / Plugin
            ToolExecution-->>AgentLoop: Return tool result
        else request_human_* instruction
            AgentLoop-->>Client: Request user intervention
            Client->>AgentLoop: User feedback
        else finish instruction
            AgentLoop-->>Client: onFinish callback
        end
    end

    Note over Client,ModelProvider: Preset task scenario (bypasses Agent loop)
    Client->>ChatService: fetchPresetTaskResult
    ChatService->>ChatAPI: Send preset task request
    ChatAPI-->>ChatService: Return task result
    ChatService-->>Client: Return result via callback
```
After the user sends a message, `sendMessage()` (`src/store/chat/slices/aiChat/actions/conversationLifecycle.ts`) creates the user message and the assistant message placeholder, then calls `internal_execAgentRuntime()`.
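A minimal sketch of this sequencing follows; the helper signatures are assumptions for illustration, not the real store actions:

```ts
// Hypothetical helpers standing in for the actual store actions.
declare function createMessage(msg: {
  content: string;
  role: 'assistant' | 'user';
}): Promise<string>;
declare function internal_execAgentRuntime(ctx: {
  assistantMessageId: string;
  userMessageId: string;
}): Promise<void>;

const sendMessage = async (content: string) => {
  // 1. Persist the user message.
  const userMessageId = await createMessage({ content, role: 'user' });
  // 2. Create an empty assistant message as the streaming placeholder.
  const assistantMessageId = await createMessage({ content: '', role: 'assistant' });
  // 3. Hand off to the Agent Runtime loop.
  await internal_execAgentRuntime({ assistantMessageId, userMessageId });
};
```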
Agent Runtime is the core execution engine of the entire chat flow. Every chat interaction, from simple Q&A to complex multi-step tool calling, is driven by the `AgentRuntime.step()` loop.
**Initialization** (`src/store/chat/slices/aiChat/actions/streamingExecutor.ts`):

- `createAgentToolsEngine()` builds the tools engine available to the agent
- Creates `GeneralChatAgent` (the "brain" that decides what to do next) and `AgentRuntime` (the "engine" that executes instructions)
- `createAgentExecutors()` builds the executors that carry out each instruction

**Execution Loop:**
```ts
while (state.status !== 'done' && state.status !== 'error') {
  result = await runtime.step(state, nextContext);
  // GeneralChatAgent decides: call_llm → call_tool → call_llm → finish
}
```
At each step, `GeneralChatAgent` returns an `AgentInstruction` based on the current state, and `AgentRuntime` executes it via the corresponding executor:

- `call_llm`: Call the LLM (see steps 3-5 below)
- `call_tool`: Execute tool calls (see step 6 below)
- `finish`: End the loop
- `compress_context`: Context compression
- `request_human_approve` / `request_human_prompt` / `request_human_select`: Request user intervention
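For orientation, here is a hypothetical sketch of this instruction set as a TypeScript discriminated union; the payload fields are illustrative assumptions, and the actual types live in `packages/agent-runtime`:

```ts
// Hypothetical sketch; payload fields are illustrative assumptions, not
// the actual packages/agent-runtime type definitions.
type AgentInstruction =
  | { type: 'call_llm' }
  | { toolCalls: unknown[]; type: 'call_tool' }
  | { type: 'compress_context' }
  | { type: 'finish' }
  | { question: string; type: 'request_human_prompt' }
  | { reason: string; type: 'request_human_approve' }
  | { options: string[]; type: 'request_human_select' };
```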
When the Agent issues a `call_llm` instruction, the executor calls ChatService:

- `src/services/chat/index.ts` preprocesses messages, tools, and parameters
- The modules under `src/services/chat/mecha/` perform context engineering, including agent config resolution, model parameter resolution, MCP context injection, etc.
- `getChatCompletion` prepares the request parameters
- `fetchSSE` from the `@lobechat/fetch-sse` package sends the request to the backend API
- `src/app/(backend)/webapi/chat/[provider]/route.ts` receives the request
- `initModelRuntimeFromDB` reads the user's provider config from the database and initializes ModelRuntime
- `src/server/routers/lambda/aiChat.ts` also exists for server-side message sending and structured output scenarios
- ModelRuntime (`packages/model-runtime/src/core/ModelRuntime.ts`) calls the respective model provider's API and returns a streaming response, which the client consumes via `fetchSSE` and `fetchEventSource`
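To make the transport concrete, here is a minimal sketch of consuming such a streaming response using only standard `fetch`/`ReadableStream` web APIs; the real client uses `fetchSSE` from `@lobechat/fetch-sse`, whose API differs:

```ts
// A minimal sketch of consuming the chat SSE stream with standard web
// APIs. The real client uses fetchSSE from @lobechat/fetch-sse; this
// snippet only illustrates the shape of the interaction.
async function streamChat(
  provider: string,
  payload: unknown,
  onMessageHandle: (text: string) => void,
): Promise<void> {
  const res = await fetch(`/webapi/chat/${provider}`, {
    body: JSON.stringify(payload),
    headers: { 'Content-Type': 'application/json' },
    method: 'POST',
  });
  if (!res.ok || !res.body) throw new Error(`chat request failed: ${res.status}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // Each decoded chunk is a raw SSE frame; fetchSSE parses these into
    // typed events before invoking callbacks.
    onMessageHandle(decoder.decode(value, { stream: true }));
  }
}
```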
When the AI model returns a `tool_calls` field in its response, the Agent issues a `call_tool` instruction. LobeHub supports three types of tools:

- **Builtin Tools**: Tools built into the application, executed directly via local executors through the `invokeBuiltinTool` method
- **MCP Tools**: External tools connected via the Model Context Protocol, executed through `MCPService` (`src/services/mcp.ts`) via the `invokeMCPTypePlugin` method
- **Plugin Tools**: The legacy plugin system, invoked via the API gateway; it is expected to be gradually deprecated in favor of the MCP tool system

After tool execution completes, results are written to the message and returned to the Agent loop. The Agent then calls the LLM again to generate the final response based on the tool results. The tool dispatch logic lives in `src/store/chat/slices/plugin/actions/pluginTypes.ts`.
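A hypothetical sketch of the per-type dispatch described above; the payload shape and the gateway helper are illustrative assumptions, not the actual store code:

```ts
// Hypothetical sketch of per-type tool dispatch. Field names and the
// invokePluginViaGateway helper are illustrative; the real logic lives
// in src/store/chat/slices/plugin/actions/pluginTypes.ts.
type ToolType = 'builtin' | 'mcp' | 'plugin';

interface ToolCallPayload {
  apiName: string;
  arguments: string; // JSON-encoded arguments from the model
  type: ToolType;
}

declare function invokeBuiltinTool(payload: ToolCallPayload): Promise<string>;
declare function invokeMCPTypePlugin(payload: ToolCallPayload): Promise<string>;
declare function invokePluginViaGateway(payload: ToolCallPayload): Promise<string>; // hypothetical

async function dispatchToolCall(payload: ToolCallPayload): Promise<string> {
  switch (payload.type) {
    case 'builtin':
      return invokeBuiltinTool(payload);
    case 'mcp':
      return invokeMCPTypePlugin(payload);
    case 'plugin':
      return invokePluginViaGateway(payload);
  }
}
```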
Preset tasks are predefined functions typically triggered when users perform specific actions (bypassing the Agent Runtime loop and calling the LLM directly). These tasks use the `fetchPresetTaskResult` method, which is similar to the normal chat flow but uses specially designed prompt chains.

**Execution Timing:**

- Agent avatar auto-generation (`autoPickEmoji` method)
- Agent description autocomplete (`autocompleteAgentDescription` method)
- Agent tag autocomplete (`autocompleteAgentTags` method)
- Agent title autocomplete (`autocompleteAgentTitle` method)
- Message translation (`translateMessage` method)

Each of these calls `fetchPresetTaskResult` with its own prompt chain.

**Code Examples:**
Agent avatar auto-generation implementation:

```ts
// src/features/AgentSetting/store/action.ts
autoPickEmoji: async () => {
  const { config, meta, dispatchMeta } = get();

  const systemRole = config.systemRole;

  chatService.fetchPresetTaskResult({
    onFinish: async (emoji) => {
      dispatchMeta({ type: 'update', value: { avatar: emoji } });
    },
    onLoadingChange: (loading) => {
      get().updateLoadingState('avatar', loading);
    },
    params: merge(
      get().internal_getSystemAgentForMeta(),
      chainPickEmoji([meta.title, meta.description, systemRole].filter(Boolean).join(',')),
    ),
    trace: get().getCurrentTracePayload({ traceName: TraceNameMap.EmojiPicker }),
  });
},
```
Translation feature implementation:

```ts
// src/store/chat/slices/translate/action.ts
translateMessage: async (id, targetLang) => {
  // ...omitted code...

  // Detect language
  chatService.fetchPresetTaskResult({
    onFinish: async (data) => {
      if (data && supportLocales.includes(data)) from = data;

      await updateMessageTranslate(id, { content, from, to: targetLang });
    },
    params: merge(translationSetting, chainLangDetect(message.content)),
    trace: get().getCurrentTracePayload({ traceName: TraceNameMap.LanguageDetect }),
  });

  // Perform translation
  chatService.fetchPresetTaskResult({
    onMessageHandle: (chunk) => {
      if (chunk.type === 'text') {
        content = chunk.text;
        internal_dispatchMessage({
          id,
          type: 'updateMessageTranslate',
          value: { content, from, to: targetLang },
        });
      }
    },
    onFinish: async () => {
      await updateMessageTranslate(id, { content, from, to: targetLang });

      internal_toggleChatLoading(false, id, n('translateMessage(end)', { id }) as string);
    },
    params: merge(translationSetting, chainTranslate(message.content, targetLang)),
    trace: get().getCurrentTracePayload({ traceName: TraceNameMap.Translation }),
  });
},
```
When the Agent issues a `finish` instruction, the loop ends, and the `onFinish` callback is called with the complete response result.
The Agent Runtime loop execution location depends on the scenario:

- Client-side: `internal_execAgentRuntime()` (`src/store/chat/slices/aiChat/actions/streamingExecutor.ts`)
- Server-side: `AgentRuntimeService.executeStep()` (`src/server/services/agentRuntime/AgentRuntimeService.ts`), with the tRPC route at `src/server/routers/lambda/aiAgent.ts`

Model Runtime (`packages/model-runtime/`) is the core abstraction layer in LobeHub for interacting with LLM model providers, adapting different provider APIs into a unified interface.
**Core Responsibilities:**

- Unified interface: every provider implements the `LobeRuntimeAI` interface (`packages/model-runtime/src/core/BaseAI.ts`)
- Provider registry: the provider-to-runtime mapping is defined in `packages/model-runtime/src/runtimeMap.ts`
- Unified capability methods: `chat` (streaming chat), `models` (model listing), `embeddings` (text embeddings), `createImage` (image generation), `textToSpeech` (speech synthesis), `generateObject` (structured output)

**Core Interface:**
```ts
// packages/model-runtime/src/core/BaseAI.ts
export interface LobeRuntimeAI {
  baseURL?: string;

  chat?(payload: ChatStreamPayload, options?: ChatMethodOptions): Promise<Response>;

  generateObject?(payload: GenerateObjectPayload, options?: GenerateObjectOptions): Promise<any>;

  embeddings?(payload: EmbeddingsPayload, options?: EmbeddingsOptions): Promise<Embeddings[]>;

  models?(): Promise<any>;

  createImage?: (payload: CreateImagePayload) => Promise<CreateImageResponse>;

  textToSpeech?: (payload: TextToSpeechPayload, options?: TextToSpeechOptions) => Promise<ArrayBuffer>;
}
```
**Adapter Architecture**: Through two factory functions, `openaiCompatibleFactory` and `anthropicCompatibleFactory`, most providers can be integrated with minimal configuration. Over 40 model providers are currently supported (OpenAI, Anthropic, Google, Azure, Bedrock, Ollama, etc.), with implementations in `packages/model-runtime/src/providers/`.
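As an illustration of the adapter idea (not the actual factory code), a provider can satisfy the `chat` contract of the interface above by forwarding the unified payload to an OpenAI-compatible endpoint. `LobeExampleAI`, its endpoint, and the simplified payload type below are all hypothetical:

```ts
// Illustrative sketch only: the payload type is a simplified stand-in,
// and LobeExampleAI plus its endpoint are hypothetical.
interface ChatStreamPayload {
  messages: { content: string; role: 'assistant' | 'system' | 'user' }[];
  model: string;
  temperature?: number;
}

class LobeExampleAI {
  baseURL = 'https://api.example.com/v1'; // hypothetical endpoint

  constructor(private apiKey: string) {}

  // Forward the unified payload to an OpenAI-compatible endpoint and
  // return the streaming Response, as the interface contract requires.
  async chat(payload: ChatStreamPayload): Promise<Response> {
    return fetch(`${this.baseURL}/chat/completions`, {
      body: JSON.stringify({ ...payload, stream: true }),
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        'Content-Type': 'application/json',
      },
      method: 'POST',
    });
  }
}
```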
Agent Runtime (`packages/agent-runtime/`) is LobeHub's agent orchestration engine. As described above, it is the core execution engine that drives the entire chat flow.

**Core Components:**

- `AgentRuntime` (`packages/agent-runtime/src/core/runtime.ts`): The "engine" that executes the Agent instruction loop, supporting `call_llm`, `call_tool`, `finish`, `compress_context`, `request_human_*`, etc.
- `GeneralChatAgent` (`packages/agent-runtime/src/agents/GeneralChatAgent.ts`): The "brain" that decides which instruction to execute next based on the current state
- `GroupOrchestrationRuntime` (`packages/agent-runtime/src/groupOrchestration/`): Multi-agent orchestration supporting `speak` / `broadcast` / `delegate` / `executeTask` collaboration modes
- `UsageCounter`: Token usage and cost tracking
- `InterventionChecker`: Security blacklist for managing agent behavior boundaries
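To illustrate how the "brain" and "engine" fit together, here is a hypothetical sketch of the split; all names are illustrative, and the real definitions live in `packages/agent-runtime/src`:

```ts
// Hypothetical sketch of the brain/engine split; names are illustrative,
// not the actual packages/agent-runtime definitions.
interface AgentState {
  messages: unknown[];
  status: 'done' | 'error' | 'running';
}

interface AgentInstruction {
  type: string;
}

// The "brain": pure decision-making based on the current state.
interface Agent {
  decide(state: AgentState): Promise<AgentInstruction>;
}

// The "engine": maps each instruction type to an executor and runs it.
class Runtime {
  constructor(
    private agent: Agent,
    private executors: Record<string, (state: AgentState) => Promise<AgentState>>,
  ) {}

  async step(state: AgentState): Promise<AgentState> {
    const instruction = await this.agent.decide(state);
    const execute = this.executors[instruction.type];
    return execute ? execute(state) : { ...state, status: 'error' };
  }
}
```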