docs/docs/integrations/language-models/mistral-ai.md
To add LangChain4j to your project, add the following dependencies:
For a Maven project, in pom.xml:
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j</artifactId>
    <version>1.15.1</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-mistral-ai</artifactId>
    <version>1.15.1</version>
</dependency>
For a Gradle project, in build.gradle:
implementation 'dev.langchain4j:langchain4j:1.15.1'
implementation 'dev.langchain4j:langchain4j-mistral-ai:1.15.1'
Add your MistralAI API key to your project. For example, you can create a class ApiKeys.java with the following code:
public class ApiKeys {
    public static final String MISTRALAI_API_KEY = System.getenv("MISTRAL_AI_API_KEY");
}
Don't forget to set your API key as an environment variable.
# Unix-based OS
export MISTRAL_AI_API_KEY=your-api-key
REM Windows
SET MISTRAL_AI_API_KEY=your-api-key
More details on how to get your MistralAI API key can be found here
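System.getenv returns null when the variable is not set, so it can help to fail fast with a clear message. The following is a minimal sketch of such a check; ApiKeyResolver and its resolve method are hypothetical helpers and not part of LangChain4j.

```java
import java.util.Map;

public class ApiKeyResolver {

    // Hypothetical helper: returns the key, or throws if it is missing or blank.
    public static String resolve(Map<String, String> env) {
        String key = env.get("MISTRAL_AI_API_KEY");
        if (key == null || key.isBlank()) {
            throw new IllegalStateException(
                    "MISTRAL_AI_API_KEY is not set; export it before running the examples");
        }
        return key.trim();
    }

    public static void main(String[] args) {
        // Uses an in-memory map here so the sketch runs without a real key.
        System.out.println(resolve(Map.of("MISTRAL_AI_API_KEY", "your-api-key")));
    }
}
```

In an application you would pass System.getenv() instead of the in-memory map.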
You can use the MistralAiChatModelName and MistralAiFimModelName Java enums to find appropriate model names for your use case.
MistralAI has introduced a new selection and classification of models according to performance and cost trade-offs.
| Model name | Deployment or available from | Description |
|---|---|---|
| open-mistral-7b | Mistral AI La Plateforme. | Max tokens: 32K. Java enum: MistralAiChatModelName.OPEN_MISTRAL_7B |
| open-mixtral-8x7b | Mistral AI La Plateforme. | Max tokens: 32K. Java enum: MistralAiChatModelName.OPEN_MIXTRAL_8x7B |
| open-mixtral-8x22b | Mistral AI La Plateforme. | Max tokens: 64K. Java enum: MistralAiChatModelName.OPEN_MIXTRAL_8X22B |
| open-mistral-nemo | Mistral AI La Plateforme. | Max tokens: 128K. Java enum: MistralAiChatModelName.OPEN_MISTRAL_NEMO |
| open-codestral-mamba | Mistral AI La Plateforme. | Max tokens: 256K. Java enum: MistralAiFimModelName.OPEN_CODESTRAL_MAMBA |
| mistral-small-latest | Mistral AI La Plateforme. | Max tokens: 32K. Java enum: MistralAiChatModelName.MISTRAL_SMALL_LATEST |
| mistral-medium-latest | Mistral AI La Plateforme. | Max tokens: 32K. Java enum: MistralAiChatModelName.MISTRAL_MEDIUM_LATEST |
| mistral-large-latest | Mistral AI La Plateforme. | Max tokens: 128K. Java enum: MistralAiChatModelName.MISTRAL_LARGE_LATEST |
| mistral-embed | Mistral AI La Plateforme. | Max tokens: 8K. Java enum: MistralAiEmbeddingModelName.MISTRAL_EMBED |
| codestral-latest | Mistral AI La Plateforme. | Max tokens: 32K. Java enum: MistralAiFimModelName.CODESTRAL_LATEST |
Some models are marked @Deprecated.
You can find more details and the types of use cases for each Mistral model here
The chat models allow you to generate human-like responses with a model fine-tuned on conversational data.
Create a class and add the following code.
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.mistralai.MistralAiChatModel;
import dev.langchain4j.model.mistralai.MistralAiChatModelName;

public class HelloWorld {

    public static void main(String[] args) {
        ChatModel model = MistralAiChatModel.builder()
                .apiKey(ApiKeys.MISTRALAI_API_KEY)
                .modelName(MistralAiChatModelName.MISTRAL_SMALL_LATEST)
                .build();

        String response = model.chat("Say 'Hello World'");
        System.out.println(response);
    }
}
Running the program will generate a variant of the following output
Hello World! How can I assist you today?
Create a class and add the following code.
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
import dev.langchain4j.model.mistralai.MistralAiChatModelName;
import dev.langchain4j.model.mistralai.MistralAiStreamingChatModel;

import java.util.concurrent.CompletableFuture;

public class HelloWorld {

    public static void main(String[] args) {
        MistralAiStreamingChatModel model = MistralAiStreamingChatModel.builder()
                .apiKey(ApiKeys.MISTRALAI_API_KEY)
                .modelName(MistralAiChatModelName.MISTRAL_SMALL_LATEST)
                .build();

        // Completed when the stream ends, so main can wait for the full response
        CompletableFuture<ChatResponse> futureResponse = new CompletableFuture<>();

        model.chat("Tell me a joke about Java", new StreamingChatResponseHandler() {
            @Override
            public void onPartialResponse(String partialResponse) {
                System.out.print(partialResponse);
            }

            @Override
            public void onCompleteResponse(ChatResponse completeResponse) {
                futureResponse.complete(completeResponse);
            }

            @Override
            public void onError(Throwable error) {
                futureResponse.completeExceptionally(error);
            }
        });

        futureResponse.join();
    }
}
You will receive each chunk of text (token) in the onPartialResponse method as the LLM generates it.
The output below is streamed in real time.
"Why do Java developers wear glasses? Because they can't C#"
Of course, you can combine MistralAI chat completion with other features like Set Model Parameters and Chat Memory to get more accurate responses.
In Chat Memory you will learn how to pass along your chat history, so the LLM knows what has been said before. If you don't pass the chat history, like in this simple example, the LLM will not know what has been said before, so it won't be able to correctly answer the second question ('What did I just ask?').
A lot of parameters are set behind the scenes, such as timeout, model type and model parameters. In Set Model Parameters you will learn how to set these parameters explicitly.
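To make the idea of passing chat history concrete, here is a library-free sketch of a sliding message window that keeps only the most recent messages. It is not LangChain4j's implementation; MessageWindowSketch and its methods are hypothetical names used for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class MessageWindowSketch {

    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    public MessageWindowSketch(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    // Adds a message and evicts the oldest ones once the window is full.
    public void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst();
        }
    }

    // The history that would be sent along with the next request.
    public List<String> history() {
        return List.copyOf(messages);
    }

    public static void main(String[] args) {
        MessageWindowSketch memory = new MessageWindowSketch(4);
        memory.add("user: What is the capital of France?");
        memory.add("ai: Paris.");
        memory.add("user: What did I just ask?");
        // All three messages fit in the window, so the model would see the first question.
        memory.history().forEach(System.out::println);
    }
}
```

With a window that is too small, the first question would be evicted and the follow-up could not be answered, which is the trade-off a message window makes.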
Function calling allows Mistral chat models (synchronous and streaming) to connect to external tools. For example, you can call a Tool to get the payment transaction status as shown in the Mistral AI function calling tutorial.
:::note
Currently, function calling is available for the following models:
- MistralAiChatModelName.MISTRAL_SMALL_LATEST
- MistralAiChatModelName.MISTRAL_LARGE_LATEST
- MistralAiChatModelName.OPEN_MIXTRAL_8X22B
- MistralAiChatModelName.OPEN_MISTRAL_NEMO
:::
Define a Tool class and how to get the payment data
Let's assume you have a dataset of payment transactions like this. In real applications, you should inject a database source or REST API client to get the data.
import java.util.*;

public class PaymentTransactionTool {

    private final Map<String, List<String>> paymentData = Map.of(
            "transaction_id", List.of("T1001", "T1002", "T1003", "T1004", "T1005"),
            "customer_id", List.of("C001", "C002", "C003", "C002", "C001"),
            "payment_amount", List.of("125.50", "89.99", "120.00", "54.30", "210.20"),
            "payment_date", List.of("2021-10-05", "2021-10-06", "2021-10-07", "2021-10-05", "2021-10-08"),
            "payment_status", List.of("Paid", "Unpaid", "Paid", "Paid", "Pending"));

    ...
}
Next, let's define two methods retrievePaymentStatus and retrievePaymentDate to get the payment status and payment date from the Tool class.
// Tool to be executed to get payment status
@Tool("Get payment status of a transaction") // function description
String retrievePaymentStatus(@P("Transaction id to search payment data") String transactionId) {
    return getPaymentData(transactionId, "payment_status");
}

// Tool to be executed to get payment date
@Tool("Get payment date of a transaction") // function description
String retrievePaymentDate(@P("Transaction id to search payment data") String transactionId) {
    return getPaymentData(transactionId, "payment_date");
}

// Looks up the requested column value for the row matching the transaction id
private String getPaymentData(String transactionId, String data) {
    List<String> transactionIds = paymentData.get("transaction_id");
    List<String> values = paymentData.get(data); // column holding the requested attribute

    int index = transactionIds.indexOf(transactionId);
    if (index != -1) {
        return values.get(index);
    } else {
        return "Transaction ID not found";
    }
}
The @Tool annotation defines the function description and the @P annotation defines the parameter description. Both come from the package dev.langchain4j.agent.tool. More info here
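To illustrate how annotation-based tool descriptions can be discovered and invoked at runtime, the following sketch uses plain Java reflection with a stand-in annotation. SketchTool, PaymentTools, describe and dispatch are hypothetical names defined here for illustration. LangChain4j's own @Tool handling is more involved and is not shown.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class ToolDiscoverySketch {

    // Stand-in for a tool annotation; NOT dev.langchain4j.agent.tool.Tool.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface SketchTool {
        String value();
    }

    public static class PaymentTools {
        @SketchTool("Get payment status of a transaction")
        public String retrievePaymentStatus(String transactionId) {
            return "T1005".equals(transactionId) ? "Pending" : "Transaction ID not found";
        }
    }

    // Collects method name -> description for every annotated method.
    public static Map<String, String> describe(Class<?> toolClass) {
        Map<String, String> descriptions = new TreeMap<>();
        for (Method method : toolClass.getDeclaredMethods()) {
            SketchTool tool = method.getAnnotation(SketchTool.class);
            if (tool != null) {
                descriptions.put(method.getName(), tool.value());
            }
        }
        return descriptions;
    }

    // Invokes the annotated method whose name matches the requested tool.
    public static String dispatch(Object tools, String toolName, String argument) {
        for (Method method : tools.getClass().getDeclaredMethods()) {
            if (method.isAnnotationPresent(SketchTool.class) && method.getName().equals(toolName)) {
                try {
                    return (String) method.invoke(tools, argument);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Tool invocation failed: " + toolName, e);
                }
            }
        }
        throw new IllegalArgumentException("Unknown tool: " + toolName);
    }

    public static void main(String[] args) {
        System.out.println(describe(PaymentTools.class));
        System.out.println(dispatch(new PaymentTools(), "retrievePaymentStatus", "T1005"));
    }
}
```

The descriptions are what a model would see when deciding which tool to call; the dispatch step mirrors what happens once the model has named a tool and supplied its argument.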
Define an interface as an agent to send chat messages.
Create an interface PaymentTransactionAgent.
import dev.langchain4j.service.SystemMessage;

interface PaymentTransactionAgent {

    @SystemMessage({
            "You are a payment transaction support agent.",
            "You MUST use the payment transaction tool to search the payment transaction data.",
            "If there is a date, convert it to a human-readable format."
    })
    String chat(String userMessage);
}
Define a main application class to chat with the MistralAI chat model
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.mistralai.MistralAiChatModel;
import dev.langchain4j.model.mistralai.MistralAiChatModelName;
import dev.langchain4j.service.AiServices;

public class PaymentDataAssistantApp {

    public static void main(String[] args) {
        // The model is created inside main so it can be used from this static method
        ChatModel mistralAiModel = MistralAiChatModel.builder()
                .apiKey(System.getenv("MISTRAL_AI_API_KEY")) // Please use your own Mistral AI API key
                .modelName(MistralAiChatModelName.MISTRAL_LARGE_LATEST) // You can also use MistralAiChatModelName.OPEN_MIXTRAL_8X22B as an open source model
                .logRequests(true)
                .logResponses(true)
                .build();

        // STEP 1: The user specifies the tools and the query
        PaymentTransactionTool paymentTool = new PaymentTransactionTool();
        String userMessage = "What is the status and the payment date of transaction T1005?";

        // STEP 2: The user asks the agent, and AiServices calls the functions
        PaymentTransactionAgent agent = AiServices.builder(PaymentTransactionAgent.class)
                .chatModel(mistralAiModel)
                .tools(paymentTool)
                .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
                .build();

        // STEP 3: The user gets the final response from the agent
        String answer = agent.chat(userMessage);
        System.out.println(answer);
    }
}
Run the application and expect an answer like this:
The status of transaction T1005 is Pending. The payment date is October 8, 2021.
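In the answer above, the human-readable date is produced by the model following the system message. If you would rather do the conversion deterministically inside the tool, a sketch with java.time could look like this, assuming the stored dates use the ISO yyyy-MM-dd form. PaymentDateFormatter is a hypothetical helper.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class PaymentDateFormatter {

    private static final DateTimeFormatter HUMAN_READABLE =
            DateTimeFormatter.ofPattern("MMMM d, yyyy", Locale.ENGLISH);

    // Parses an ISO date such as 2021-10-08 and renders it in long English form.
    public static String toHumanReadable(String isoDate) {
        return LocalDate.parse(isoDate).format(HUMAN_READABLE);
    }

    public static void main(String[] args) {
        System.out.println(toHumanReadable("2021-10-08")); // October 8, 2021
    }
}
```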
You can also use JSON mode to get the response in JSON format. To do this, set the responseFormat parameter to ResponseFormat.JSON in the MistralAiChatModel builder or the MistralAiStreamingChatModel builder.
Synchronous example:
ChatModel model = MistralAiChatModel.builder()
.apiKey(System.getenv("MISTRAL_AI_API_KEY")) // Please use your own Mistral AI API key
.responseFormat(ResponseFormat.JSON)
.build();
String userMessage = "Return JSON with two fields: transactionId and status with the values T123 and paid.";
String json = model.chat(userMessage);
System.out.println(json); // {"transactionId":"T123","status":"paid"}
Streaming example:
StreamingChatModel streamingModel = MistralAiStreamingChatModel.builder()
.apiKey(System.getenv("MISTRAL_AI_API_KEY")) // Please use your own Mistral AI API key
.responseFormat(ResponseFormat.JSON)
.build();
String userMessage = "Return JSON with two fields: transactionId and status with the values T123 and paid.";
CompletableFuture<ChatResponse> futureResponse = new CompletableFuture<>();
streamingModel.chat(userMessage, new StreamingChatResponseHandler() {
@Override
public void onPartialResponse(String partialResponse) {
System.out.print(partialResponse);
}
@Override
public void onCompleteResponse(ChatResponse completeResponse) {
futureResponse.complete(completeResponse);
}
@Override
public void onError(Throwable error) {
futureResponse.completeExceptionally(error);
}
});
String json = futureResponse.join().aiMessage().text();
System.out.println(json); // {"transactionId":"T123","status":"paid"}
Structured Outputs ensure that a model's responses adhere to a JSON schema.
:::note The documentation for using Structured Outputs in LangChain4j is available here, and in the section below you will find MistralAI-specific information. :::
If desired, the model may be configured with a default JSON Schema that will be used as a fallback if no schema is provided in the request.
ChatModel model = MistralAiChatModel.builder()
        .apiKey(System.getenv("MISTRAL_AI_API_KEY"))
        .modelName(MistralAiChatModelName.MISTRAL_SMALL_LATEST)
        .supportedCapabilities(Set.of(Capability.RESPONSE_FORMAT_JSON_SCHEMA)) // Enable structured outputs
        .responseFormat(ResponseFormat.builder() // Set the fallback JSON Schema (optional)
                .type(ResponseFormatType.JSON)
                .jsonSchema(JsonSchema.builder().rootElement(JsonObjectSchema.builder()
                                .addProperty("name", JsonStringSchema.builder().build())
                                .addProperty("capital", JsonStringSchema.builder().build())
                                .addProperty(
                                        "languages",
                                        JsonArraySchema.builder()
                                                .items(JsonStringSchema.builder().build())
                                                .build())
                                .required("name", "capital", "languages")
                                .build())
                        .build())
                .build())
        .strictJsonSchema(true)
        .build();
Guardrails are a way to limit the behavior of the model to prevent it from generating harmful or unwanted content. You can optionally set the safePrompt parameter in the MistralAiChatModel builder or the MistralAiStreamingChatModel builder.
Synchronous example:
ChatModel model = MistralAiChatModel.builder()
.apiKey(System.getenv("MISTRAL_AI_API_KEY"))
.safePrompt(true)
.build();
String userMessage = "What is the best French cheese?";
String response = model.chat(userMessage);
Streaming example:
StreamingChatModel streamingModel = MistralAiStreamingChatModel.builder()
.apiKey(System.getenv("MISTRAL_AI_API_KEY"))
.safePrompt(true)
.build();
String userMessage = "What is the best French cheese?";
CompletableFuture<ChatResponse> futureResponse = new CompletableFuture<>();
streamingModel.chat(userMessage, new StreamingChatResponseHandler() {
@Override
public void onPartialResponse(String partialResponse) {
System.out.print(partialResponse);
}
@Override
public void onCompleteResponse(ChatResponse completeResponse) {
futureResponse.complete(completeResponse);
}
@Override
public void onError(Throwable error) {
futureResponse.completeExceptionally(error);
}
});
futureResponse.join();
Enabling the safe prompt prepends the following @SystemMessage to your messages:
Always assist with care, respect, and truth. Respond with utmost utility yet securely. Avoid harmful, unethical, prejudiced, or negative content. Ensure replies promote fairness and positivity.
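Conceptually, the effect is the same as putting a system message in front of the conversation. The following library-free sketch shows that idea only. In practice the flag is sent with the request and the prompt is added on the Mistral side; Message and withSafePrompt are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class SafePromptSketch {

    public record Message(String role, String content) {}

    static final String SAFE_PROMPT =
            "Always assist with care, respect, and truth. Respond with utmost utility yet securely. "
            + "Avoid harmful, unethical, prejudiced, or negative content. "
            + "Ensure replies promote fairness and positivity.";

    // Returns a new list with the safety system message placed first.
    public static List<Message> withSafePrompt(List<Message> conversation) {
        List<Message> result = new ArrayList<>();
        result.add(new Message("system", SAFE_PROMPT));
        result.addAll(conversation);
        return result;
    }

    public static void main(String[] args) {
        List<Message> sent = withSafePrompt(
                List.of(new Message("user", "What is the best French cheese?")));
        sent.forEach(m -> System.out.println(m.role() + ": " + m.content()));
    }
}
```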
Both MistralAiChatModel and MistralAiStreamingChatModel support reasoning with Magistral reasoning models.
Reasoning is configured with the following parameters:
- returnThinking: when enabled, reasoning text produced by the model will be parsed from the API response and stored in AiMessage.thinking(). For streaming, StreamingChatResponseHandler.onPartialThinking() and TokenStream.onPartialThinking() callbacks will also be invoked. Disabled by default.
- sendThinking: when enabled, reasoning text from previous responses (stored in AiMessage.thinking()) will be included in follow-up requests to the LLM. Disabled by default.
Here is an example of how to configure reasoning:
ChatModel model = MistralAiChatModel.builder()
.apiKey(System.getenv("MISTRAL_AI_API_KEY"))
.modelName(MistralAiChatModelName.MAGISTRAL_MEDIUM_LATEST)
.returnThinking(true)
.sendThinking(true)
.build();
The moderation model is a classifier that can be used to detect harmful content in text.
Moderation example:
ModerationModel model = new MistralAiModerationModel.Builder()
.apiKey(System.getenv("MISTRAL_AI_API_KEY"))
.modelName(MistralAiModerationModelName.MISTRAL_MODERATION_LATEST)
.logRequests(true)
.logResponses(false)
.build();
// I want to check if the text contains harmful content
Moderation moderation = model.moderate("I want to kill them.").content();
The Fill-in-the-Middle (FIM) models allow you to generate code completions. You define the starting point of the code with a prompt, and the ending point with an optional suffix and an optional stop.
The FIM endpoint is used in much the same way as chat completions. You can test it by adding the following code.
import dev.langchain4j.model.mistralai.MistralAiFimModel;
import dev.langchain4j.model.mistralai.MistralAiFimModelName;
import dev.langchain4j.model.output.Response;

import java.util.List;

public class HelloWorld {

    public static void main(String[] args) {
        MistralAiFimModel codestral = MistralAiFimModel.builder()
                .apiKey(System.getenv("MISTRAL_AI_API_KEY"))
                .modelName(MistralAiFimModelName.CODESTRAL_LATEST)
                .stop(List.of("}")) // must stop at the first occurrence of "}"
                .build();

        // I want to generate a code completion for a simple hello world program using MistralAI of LangChain4j framework.
        String prompt = """
                public static void main(String[] args) {
                    // Create a function to multiply two numbers
                """;
        String suffix = """
                    System.out.println(result);
                }
                """;

        // Ask the Codestral model to complete the code between the given prompt and suffix
        Response<String> response = codestral.generate(prompt, suffix);

        System.out.println(
                String.format(
                        "%s%s%s",
                        prompt, // print code prompt (prefix)
                        response.content(), // print code filled-in-the-middle
                        suffix)); // print code suffix
    }
}
Running the program will print output similar to the following:
public static void main(String[] args) {
// Create a function to multiply two numbers
int result = multiply(5, 3);
System.out.println(result);
}
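To make the roles of prompt, suffix and stop concrete, here is a library-free sketch that truncates a raw completion at the first stop sequence and then assembles the final snippet. The truncation normally happens on the server side. Whether the stop string itself is kept is an assumption here (this sketch drops it), and FimAssemblySketch with its methods is hypothetical.

```java
import java.util.List;

public class FimAssemblySketch {

    // Cuts the completion at the earliest occurrence of any stop sequence.
    public static String truncateAtStop(String completion, List<String> stops) {
        int cut = completion.length();
        for (String stop : stops) {
            int index = completion.indexOf(stop);
            if (index >= 0 && index < cut) {
                cut = index;
            }
        }
        return completion.substring(0, cut);
    }

    // Prefix, then the generated middle, then the suffix.
    public static String assemble(String prompt, String middle, String suffix) {
        return prompt + middle + suffix;
    }

    public static void main(String[] args) {
        String prompt = "int result = ";
        String suffix = "\nSystem.out.println(result);";
        String raw = "multiply(5, 3); } // anything after the stop is discarded";
        String middle = truncateAtStop(raw, List.of("}"));
        System.out.println(assemble(prompt, middle, suffix));
    }
}
```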
Create a class and add the following code.
import dev.langchain4j.model.StreamingResponseHandler;
import dev.langchain4j.model.language.StreamingLanguageModel;
import dev.langchain4j.model.mistralai.MistralAiFimModelName;
import dev.langchain4j.model.mistralai.MistralAiStreamingFimModel;
import dev.langchain4j.model.output.Response;

import java.util.concurrent.CompletableFuture;

public class HelloWorld {

    public static void main(String[] args) {
        StreamingLanguageModel codestralStream = MistralAiStreamingFimModel.builder()
                .apiKey(ApiKeys.MISTRALAI_API_KEY)
                .modelName(MistralAiFimModelName.CODESTRAL_LATEST)
                .build();

        // I want to generate a code completion for a simple hello world program.
        String prompt = "public static void main(String[] args) {";

        // Completed when the stream ends, so main can wait for the full response
        CompletableFuture<Response<String>> futureResponse = new CompletableFuture<>();

        codestralStream.generate(prompt, new StreamingResponseHandler<String>() {
            @Override
            public void onNext(String token) {
                System.out.print(token);
            }

            @Override
            public void onComplete(Response<String> response) {
                futureResponse.complete(response);
            }

            @Override
            public void onError(Throwable error) {
                futureResponse.completeExceptionally(error);
            }
        });

        futureResponse.join();
    }
}
You will receive each chunk of text (token) in the onNext method as the LLM generates it.
The output below is streamed in real time.
public static void main(String[] args) {
int[] arr = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
int sum = 0;
for (int i = 0; i < arr.length; i++) {
sum += arr[i];
}
System.out.println("Sum of all elements in the array: " + sum);
}
}
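The streaming examples on this page share one pattern: partial results are printed as they arrive, and a CompletableFuture is completed when the stream ends so that the caller can wait for it. The sketch below reproduces that pattern without any network call. TokenHandler and fakeStream are hypothetical stand-ins for the LangChain4j handler and model, not real library types.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class StreamingBridgeSketch {

    // Stand-in for a streaming handler; not a LangChain4j type.
    public interface TokenHandler {
        void onNext(String token);
        void onComplete(String fullText);
        void onError(Throwable error);
    }

    // Emits the given tokens on a background thread, then signals completion.
    public static void fakeStream(List<String> tokens, TokenHandler handler) {
        new Thread(() -> {
            try {
                StringBuilder full = new StringBuilder();
                for (String token : tokens) {
                    handler.onNext(token);
                    full.append(token);
                }
                handler.onComplete(full.toString());
            } catch (RuntimeException e) {
                handler.onError(e);
            }
        }).start();
    }

    // Bridges the callbacks to a future and blocks until the stream finishes.
    public static String collect(List<String> tokens) {
        CompletableFuture<String> future = new CompletableFuture<>();
        fakeStream(tokens, new TokenHandler() {
            @Override
            public void onNext(String token) {
                System.out.print(token);
            }

            @Override
            public void onComplete(String fullText) {
                future.complete(fullText);
            }

            @Override
            public void onError(Throwable error) {
                future.completeExceptionally(error);
            }
        });
        return future.join();
    }

    public static void main(String[] args) {
        String full = collect(List.of("public ", "static ", "void ", "main"));
        System.out.println();
        System.out.println("Complete text: " + full);
    }
}
```

Without the join() call, main could return before the background thread has delivered any tokens, which is why the examples wait on the future.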
When using MistralAiChatModel, you can access the raw HTTP response:
SuccessfulHttpResponse rawHttpResponse = ((MistralAiChatResponseMetadata) chatResponse.metadata()).rawHttpResponse();
System.out.println(rawHttpResponse.body());
System.out.println(rawHttpResponse.headers());
System.out.println(rawHttpResponse.statusCode());
When using MistralAiStreamingChatModel, you can access the raw HTTP response (see above) and raw Server-Sent Events:
List<ServerSentEvent> rawServerSentEvents = ((MistralAiChatResponseMetadata) chatResponse.metadata()).rawServerSentEvents();
System.out.println(rawServerSentEvents.get(0).data());
System.out.println(rawServerSentEvents.get(0).event());
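For orientation, the data() and event() accessors correspond to the data: and event: fields of the Server-Sent Events wire format, where a blank line ends each event. A minimal, library-free parsing sketch follows. SseEvent and parse are hypothetical names, the input is assumed to use "\n" line endings, and id:, retry: and comment lines are ignored.

```java
import java.util.ArrayList;
import java.util.List;

public class SseParsingSketch {

    // event is null when the stream did not send an event: field.
    public record SseEvent(String event, String data) {}

    // Splits a raw SSE stream into events; a blank line terminates each event.
    public static List<SseEvent> parse(String raw) {
        List<SseEvent> events = new ArrayList<>();
        String event = null;
        StringBuilder data = new StringBuilder();
        boolean hasData = false;
        for (String line : raw.split("\n", -1)) {
            if (line.isEmpty()) {
                if (hasData) {
                    events.add(new SseEvent(event, data.toString()));
                }
                event = null;
                data.setLength(0);
                hasData = false;
            } else if (line.startsWith("event:")) {
                event = stripLeadingSpace(line.substring("event:".length()));
            } else if (line.startsWith("data:")) {
                if (hasData) {
                    data.append('\n'); // multiple data lines are joined with newlines
                }
                data.append(stripLeadingSpace(line.substring("data:".length())));
                hasData = true;
            }
        }
        return events;
    }

    // The format allows one optional space after the colon.
    private static String stripLeadingSpace(String value) {
        return value.startsWith(" ") ? value.substring(1) : value;
    }

    public static void main(String[] args) {
        String raw = "event: message\ndata: first chunk\n\ndata: second chunk\n\n";
        for (SseEvent e : parse(raw)) {
            System.out.println(e.event() + " -> " + e.data());
        }
    }
}
```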