# ZhiPu AI
ZhiPu AI is a platform providing model services, including text generation, text embedding, image generation, and more. You can refer to the ZhiPu AI Open Platform for details. LangChain4j integrates with ZhiPu AI via its HTTP endpoints. We are considering migrating from the HTTP endpoints to the official SDK and would appreciate any help!
You can use ZhiPu AI with LangChain4j in plain Java or Spring Boot applications.
:::note
Since `1.0.0-alpha1`, `langchain4j-zhipu-ai` has migrated to `langchain4j-community`
and is renamed to `langchain4j-community-zhipu-ai`.
:::
Before `1.0.0-alpha1`:

```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-zhipu-ai</artifactId>
    <version>${previous version here}</version>
</dependency>
```
`1.0.0-alpha1` and later:

```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-community-zhipu-ai</artifactId>
    <version>${latest version here}</version>
</dependency>
```
Or, you can use the BOM to manage dependencies consistently:

```xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-community-bom</artifactId>
            <version>${latest version here}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
```
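For Spring Boot applications, a starter may also be available in the community repository. The sketch below is an assumption, not something this page confirms: both the artifact id `langchain4j-community-zhipu-ai-spring-boot-starter` and the `langchain4j.community.zhipu-ai.*` property prefix are hypothetical; verify them against the LangChain4j Spring Boot integration documentation.

```properties
# Hypothetical property names; verify against the starter's documentation
langchain4j.community.zhipu-ai.chat-model.api-key=${ZHIPU_API_KEY}
langchain4j.community.zhipu-ai.chat-model.model=glm-4-flash
```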
## ZhipuAiChatModel

`ZhipuAiChatModel` has the following parameters to configure when you initialize it:
| Property | Description | Default Value |
|---|---|---|
| baseUrl | The URL to connect to | https://open.bigmodel.cn/ |
| apiKey | The API key | |
| model | The model to use | glm-4-flash |
| topP | The probability threshold for nucleus sampling, which controls the diversity of the generated text. The higher the `topP`, the more diverse the generated text, and vice versa. Value range: (0, 1.0]. We generally recommend altering this or `temperature`, but not both. | |
| maxRetries | The maximum number of retries for a request | 3 |
| temperature | Sampling temperature that controls the diversity of the generated text. The higher the temperature, the more diverse the generated text, and vice versa. Value range: [0, 2) | 0.7 |
| stops | The model automatically stops generating text when the output is about to contain one of the specified strings or token_ids | |
| maxToken | The maximum number of tokens returned by this request | 512 |
| listeners | Listeners that listen for requests, responses, and errors | |
| callTimeout | OkHttp call timeout for a request | |
| connectTimeout | OkHttp connect timeout for a request | |
| writeTimeout | OkHttp write timeout for a request | |
| readTimeout | OkHttp read timeout for a request | |
| logRequests | Whether to log requests | false |
| logResponses | Whether to log responses | false |
| doSample | Whether to use sampling. When set to false, the model uses greedy decoding | |
| toolStream | Whether to enable partial tool-call streaming. When set to true, tool calls can be streamed incrementally | false |
## ZhipuAiChatRequestParameters

`ZhipuAiChatRequestParameters` can be used to configure additional parameters when sending a chat request:
| Property | Description | Default Value |
|---|---|---|
| doSample | Whether to use sampling. When set to false, the model will use greedy decoding | |
| toolStream | Whether to enable partial tool streaming. When set to true, tool calls can be streamed incrementally | false |
| thinking | Configuration for reasoning mode. `type` specifies the reasoning type; `clearThinking` controls whether to show the internal thinking process in the response | |
## ZhipuAiStreamingChatModel

`ZhipuAiStreamingChatModel` has the same parameters as `ZhipuAiChatModel`, except `maxRetries`.
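A minimal streaming sketch may look like the following; it assumes the generic `StreamingChatResponseHandler` callback interface from the LangChain4j core API (`onPartialResponse`, `onCompleteResponse`, `onError`), which the community model is expected to drive:

```java
ZhipuAiStreamingChatModel model = ZhipuAiStreamingChatModel.builder()
        .apiKey("Your API key here")
        .build();

model.chat("Tell me a joke", new StreamingChatResponseHandler() {

    @Override
    public void onPartialResponse(String partialResponse) {
        // called for each partial token as it arrives
        System.out.print(partialResponse);
    }

    @Override
    public void onCompleteResponse(ChatResponse completeResponse) {
        System.out.println("\nFinished: " + completeResponse.aiMessage().text());
    }

    @Override
    public void onError(Throwable error) {
        error.printStackTrace();
    }
});
```

Running this requires a valid API key and network access, so treat it as a usage sketch rather than a copy-paste-ready snippet.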
You can initialize `ZhipuAiChatModel` with the following code:
```java
ChatModel model = ZhipuAiChatModel.builder()
        .apiKey("Your API key here")
        .callTimeout(Duration.ofSeconds(60))
        .connectTimeout(Duration.ofSeconds(60))
        .writeTimeout(Duration.ofSeconds(60))
        .readTimeout(Duration.ofSeconds(60))
        .build();
```
Or, customize other parameters:
```java
ChatModel model = ZhipuAiChatModel.builder()
        .apiKey("Your API key here")
        .model("glm-4")
        .temperature(0.6)
        .maxToken(1024)
        .maxRetries(2)
        .callTimeout(Duration.ofSeconds(60))
        .connectTimeout(Duration.ofSeconds(60))
        .writeTimeout(Duration.ofSeconds(60))
        .readTimeout(Duration.ofSeconds(60))
        .build();
```
You can enable reasoning mode to get the model's internal thinking process:
```java
ChatModel model = ZhipuAiChatModel.builder()
        .apiKey("Your API key here")
        .model(ChatCompletionModel.GLM_4_7) // use GLM-4-5 or a later model for reasoning support
        .build();

ChatResponse response = model.chat(
        ChatRequest.builder()
                .messages(UserMessage.from("What is the capital of Germany?"))
                .parameters(ZhipuAiChatRequestParameters.builder()
                        .thinking(Thinking.builder()
                                .type("reasoning")
                                .clearThinking(true)
                                .build())
                        .build())
                .build());

AiMessage aiMessage = response.aiMessage();
System.out.println("Answer: " + aiMessage.text());
System.out.println("Thinking: " + aiMessage.thinking());
```
You can stream partial tool calls incrementally using `toolStream`:
```java
ZhipuAiStreamingChatModel model = ZhipuAiStreamingChatModel.builder()
        .apiKey("Your API key here")
        .model(ChatCompletionModel.GLM_4_7)
        .build();

ToolSpecification calculator = ToolSpecification.builder()
        .name("calculator")
        .description("returns a sum of two numbers")
        .parameters(JsonObjectSchema.builder()
                .addIntegerProperty("first")
                .addIntegerProperty("second")
                .build())
        .build();

TestStreamingChatResponseHandler handler = new TestStreamingChatResponseHandler() {

    @Override
    public void onPartialToolCall(ToolExecutionRequest partialToolCall) {
        System.out.println("Partial tool call: " + partialToolCall.name() + " - " + partialToolCall.arguments());
    }
};

model.chat(
        ChatRequest.builder()
                .messages(UserMessage.from("2+2=?"))
                .parameters(ZhipuAiChatRequestParameters.builder()
                        .toolSpecifications(calculator)
                        .toolStream(true)
                        .build())
                .build(),
        handler);
```
You can find more examples in: