import useBaseUrl from '@docusaurus/useBaseUrl'; import ThemedImage from '@theme/ThemedImage'; import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem';
:::note
Guardrails is an experimental feature. Its API and behavior might change in future versions.
:::
Guardrails are mechanisms that let you validate the input and output of the LLM to ensure it meets your expectations. You can do some of the following things with guardrails:
Those are just examples. You can do many other things with guardrails.
:::note
Guardrails are only available when using AI Services. They are a higher-level construct that cannot be applied to a ChatModel or StreamingChatModel.
:::
<ThemedImage alt="Guardrails" sources={{ light: useBaseUrl('/img/guardrails-light-bg.png'), dark: useBaseUrl('/img/guardrails-dark-bg.png'), }} />
The implementation was originally done in the Quarkus LangChain4j extension and was backported here.
Ideally, guardrail implementations should follow the single responsibility principle, meaning that each guardrail class should validate one thing. Then, chain guardrails together to guard against multiple things.
The order of guardrails in the chain is important. The first guardrail in the chain to fail will trigger the overall failure. Ensure guardrails that catch the most failures are early in the chain, whereas more specific guardrails that may fail very infrequently are towards the end of the chain.
Also keep in mind that guardrails can themselves call other services or even invoke other LLM interactions. If these kinds of guardrails have an execution penalty or monetary cost associated with them, make sure you take that into account. You might want to put more expensive guardrails towards the end of the chain.
:::note
The term expensive can mean that something takes some time to execute or has a monetary value associated with it.
:::
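The ordering advice above can be illustrated with a small, self-contained sketch. It does not use the LangChain4j API: `Check`, `Outcome`, and the cost numbers are hypothetical stand-ins, chosen only to show how a chain that stops at the first failure avoids paying for later, more expensive checks.

```java
import java.util.List;
import java.util.function.Predicate;

class GuardrailOrderingSketch {

    // Hypothetical stand-in for a guardrail: a name, a relative cost, and a check.
    record Check(String name, int cost, Predicate<String> passes) {}

    // Hypothetical outcome of running a chain: the first check that failed
    // (null if none did) and the total cost that was actually spent.
    record Outcome(String failedCheck, int costSpent) {}

    // Runs the checks in order and stops at the first one that fails,
    // so checks placed later in the chain are never paid for.
    static Outcome run(List<Check> chain, String input) {
        int spent = 0;
        for (Check check : chain) {
            spent += check.cost();
            if (!check.passes().test(input)) {
                return new Outcome(check.name(), spent);
            }
        }
        return new Outcome(null, spent);
    }

    public static void main(String[] args) {
        Check notBlank = new Check("notBlank", 1, s -> !s.isBlank());
        Check maxLength = new Check("maxLength", 1, s -> s.length() <= 200);
        // Pretend this one calls a remote moderation service, hence the high cost.
        Check moderation = new Check("moderation", 100, s -> !s.contains("forbidden"));

        List<Check> cheapFirst = List.of(notBlank, maxLength, moderation);
        List<Check> expensiveFirst = List.of(moderation, notBlank, maxLength);

        // Same blank input, same failing check, very different cost.
        System.out.println(run(cheapFirst, "   "));
        System.out.println(run(expensiveFirst, "   "));
    }
}
```

With the cheap checks first, the blank input is rejected after spending 1 unit; with the expensive check first, the same rejection costs 101 units.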
Input guardrails are functions invoked before the LLM is called. Failing an input guardrail prevents the LLM from being called. Input guardrails are the last step prior to calling the LLM. They are invoked after any RAG operations have happened.
To create an input guardrail, implement the InputGuardrail interface. It has two variants of the validate method, at least one of which must be implemented:

```java
InputGuardrailResult validate(UserMessage userMessage);

InputGuardrailResult validate(InputGuardrailRequest params);
```
The first variant is used for simple guardrails, or when the guardrail only needs access to the UserMessage.
The second variant is for more complex guardrails that need more information, such as the chat memory/history, user message template, augmentation results, or variables that were passed to the template. See InputGuardrailRequest for more information.
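The relationship between the two variants can be sketched with plain Java. The types below (`Message`, `Request`, `Result`, `Guardrail`) are simplified stand-ins, not the LangChain4j classes, and the default delegation from the request-based variant to the message-based one is an assumption made for this illustration.

```java
import java.util.List;

class ValidateVariantsSketch {

    // Simplified stand-ins, NOT the LangChain4j types: the real UserMessage,
    // InputGuardrailRequest and InputGuardrailResult carry more information.
    record Message(String text) {}
    record Request(Message userMessage, List<String> chatHistory) {}
    record Result(boolean ok, String reason) {}

    interface Guardrail {
        // Variant 1: only needs the user message.
        default Result validate(Message userMessage) {
            return new Result(false, "validate(Message) not implemented");
        }

        // Variant 2: receives the full request. ASSUMPTION for this sketch:
        // by default it delegates to variant 1.
        default Result validate(Request request) {
            return validate(request.userMessage());
        }
    }

    // Simple guardrail: overrides variant 1 only.
    static class NotBlankGuardrail implements Guardrail {
        @Override
        public Result validate(Message userMessage) {
            return userMessage.text().isBlank()
                    ? new Result(false, "message is blank")
                    : new Result(true, null);
        }
    }

    // Context-aware guardrail: overrides variant 2 to look at the chat history.
    static class NoRepeatGuardrail implements Guardrail {
        @Override
        public Result validate(Request request) {
            boolean repeated = request.chatHistory().contains(request.userMessage().text());
            return repeated
                    ? new Result(false, "message repeats an earlier one")
                    : new Result(true, null);
        }
    }

    public static void main(String[] args) {
        Request request = new Request(new Message("hello"), List.of("hello", "how are you?"));
        // Both guardrails are called the same way, with the full request.
        System.out.println(new NotBlankGuardrail().validate(request));
        System.out.println(new NoRepeatGuardrail().validate(request));
    }
}
```

Under this design, a caller can always invoke the request-based variant, and a simple guardrail only has to override the message-based one.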
Some examples of things you could do:
Input guardrails can be used whether the operation is synchronous or asynchronous/streaming.
Input guardrails can have the following outcomes. There are helper methods on the InputGuardrail interface that can provide the outcomes:
| Outcome | Helper method on InputGuardrail | Description |
|---|---|---|
| success | success() | The input is valid. |
| success with rewrite | successWith(String) | Similar to success, except the user message is altered before proceeding to the next step (the next guardrail in the chain or the call to the LLM). |
| failure | failure(String) or failure(String, Throwable) | The input is invalid, but the remaining guardrails in the chain are still executed in order to accumulate all possible validation problems. If a Throwable is passed, consumers can catch InputGuardrailException and check its cause, which will be the Throwable passed here. |
| fatal | fatal(String) or fatal(String, Throwable) | The input is invalid and execution is halted with an InputGuardrailException. If a Throwable is passed, consumers can catch InputGuardrailException and check its cause, which will be the Throwable passed here. |

There are several ways to declare input guardrails, listed here in order of precedence:

1. InputGuardrail implementation class names or instances set directly on the AiServices builder.
2. @InputGuardrails annotations placed on an individual AI Service method.
3. @InputGuardrails annotation placed on an AI Service class.

Regardless of how they are declared, input guardrails are always executed in the order they appear in the list.

InputGuardrail implementation class names or instances set directly on the AiServices builder have the highest precedence: if guardrails are also declared in any other way, the ones set on the builder are used.
```java
public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .inputGuardrailClasses(FirstInputGuardrail.class, SecondInputGuardrail.class)
    .build();
```

or

```java
public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .inputGuardrails(new FirstInputGuardrail(), new SecondInputGuardrail())
    .build();
```
:::info
If you want a ready-made experimental input guardrail that rewrites eligible single-text input using prompt repetition, see the community Prompt Repetition module.
:::
In the first scenario, classes that implement InputGuardrail are passed. New instances of these classes are created dynamically using reflection.
:::info
The way classes are converted to instances can be customized. For example, frameworks that use dependency injection (like Quarkus or Spring) can use extension points to provide instances based on how they manage class instances rather than creating new instances via reflection each time.
:::
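As a rough illustration of the difference between reflective instantiation and container-managed instances, here is a minimal sketch in plain Java. Everything in it is hypothetical (`FirstGuardrail`, `SecondGuardrail`, the `instantiate` helper, and the map standing in for a dependency injection container); it is not how LangChain4j implements this internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class InstanceCreationSketch {

    // Hypothetical guardrail-like classes used only for this illustration.
    static class FirstGuardrail {}
    static class SecondGuardrail {}

    // Default strategy: call the no-argument constructor via reflection.
    static Object viaReflection(Class<?> type) {
        try {
            var constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            return constructor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }

    // Turns classes into instances using whichever strategy is supplied.
    static List<Object> instantiate(List<Class<?>> types, Function<Class<?>, Object> factory) {
        List<Object> instances = new ArrayList<>();
        for (Class<?> type : types) {
            instances.add(factory.apply(type));
        }
        return instances;
    }

    public static void main(String[] args) {
        List<Class<?>> types = List.of(FirstGuardrail.class, SecondGuardrail.class);

        // Reflection creates a fresh instance on every call.
        List<Object> fresh = instantiate(types, InstanceCreationSketch::viaReflection);

        // A container-like strategy: look up pre-built, shared instances instead.
        Map<Class<?>, Object> container = Map.of(
                FirstGuardrail.class, new FirstGuardrail(),
                SecondGuardrail.class, new SecondGuardrail());
        List<Object> managed = instantiate(types, container::get);

        System.out.println("fresh instance differs from another reflective one: "
                + (fresh.get(0) != viaReflection(FirstGuardrail.class)));
        System.out.println("managed instance is the shared one: "
                + (managed.get(0) == container.get(FirstGuardrail.class)));
    }
}
```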
@InputGuardrails annotations placed on individual AI Service methods have the next highest precedence.
```java
public interface Assistant {
    @InputGuardrails({ FirstInputGuardrail.class, SecondInputGuardrail.class })
    String chat(String question);

    String doSomethingElse(String question);
}

var assistant = AiServices.create(Assistant.class, chatModel);
```
In this example, only the chat method has guardrails.
- For the chat method, FirstInputGuardrail is invoked first.
- SecondInputGuardrail will only be invoked if FirstInputGuardrail does not produce a fatal result.
- FirstInputGuardrail or SecondInputGuardrail could rewrite the user message.
- If FirstInputGuardrail rewrites the user message, then SecondInputGuardrail will receive the new user message as input.

The doSomethingElse method does not have any guardrails.
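The chaining behavior described above (rewrites flow to the next guardrail, failures accumulate, a fatal result stops the chain) can be modeled in a few lines of plain Java. This is a simplified sketch with hypothetical types (`Outcome`, `Result`, `ChainResult`), not the LangChain4j implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class InputChainSketch {

    enum Outcome { SUCCESS, SUCCESS_WITH, FAILURE, FATAL }

    // Simplified stand-in for a guardrail result: 'text' is the rewritten
    // message for SUCCESS_WITH and the failure message for FAILURE/FATAL.
    record Result(Outcome outcome, String text) {}

    // What the chain produced: the (possibly rewritten) message and any failures.
    record ChainResult(String message, List<String> failures) {
        boolean llmWouldBeCalled() {
            return failures.isEmpty();
        }
    }

    static ChainResult run(List<Function<String, Result>> chain, String userMessage) {
        String current = userMessage;
        List<String> failures = new ArrayList<>();
        for (Function<String, Result> guardrail : chain) {
            Result result = guardrail.apply(current);
            switch (result.outcome()) {
                case SUCCESS -> { }
                // The next guardrail sees the rewritten message.
                case SUCCESS_WITH -> current = result.text();
                // Record the problem and keep going.
                case FAILURE -> failures.add(result.text());
                // Record the problem and stop: later guardrails are not invoked.
                case FATAL -> {
                    failures.add(result.text());
                    return new ChainResult(current, failures);
                }
            }
        }
        return new ChainResult(current, failures);
    }

    public static void main(String[] args) {
        Function<String, Result> trim = msg -> new Result(Outcome.SUCCESS_WITH, msg.trim());
        Function<String, Result> notEmpty = msg -> msg.isEmpty()
                ? new Result(Outcome.FATAL, "empty message")
                : new Result(Outcome.SUCCESS, null);
        Function<String, Result> noShouting = msg -> msg.equals(msg.toUpperCase())
                ? new Result(Outcome.FAILURE, "message is all upper case")
                : new Result(Outcome.SUCCESS, null);

        System.out.println(run(List.of(trim, notEmpty, noShouting), "  hello  "));
        System.out.println(run(List.of(trim, notEmpty, noShouting), "   "));
    }
}
```

In the second call, the trimmed message is empty, so the fatal result from the second guardrail ends the chain before the third guardrail runs.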
@InputGuardrails annotation placed on an AI Service class has the lowest precedence.
```java
@InputGuardrails({ FirstInputGuardrail.class, SecondInputGuardrail.class })
public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.create(Assistant.class, chatModel);
```
In this example, both the chat and doSomethingElse methods have the guardrails.
- FirstInputGuardrail is invoked first.
- SecondInputGuardrail will only be invoked if FirstInputGuardrail does not produce a fatal result.
- FirstInputGuardrail or SecondInputGuardrail could rewrite the user message.
- If FirstInputGuardrail rewrites the user message, then SecondInputGuardrail will receive the new user message as input.

There are some unit testing utilities based on AssertJ in the langchain4j-test module.
Once you have the dependency, you can perform these kinds of validations:
```java
import static dev.langchain4j.test.guardrail.GuardrailAssertions.assertThat;

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.guardrail.GuardrailResult.Result;

class Tests {
    MyInputGuardrail inputGuardrail = new MyInputGuardrail();

    @Test
    void test() {
        var userMessage = UserMessage.from("Some user message");
        var result = inputGuardrail.validate(userMessage);

        // These are just some examples of what you can do
        assertThat(result)
            .isSuccessful()
            .hasResult(Result.FATAL)
            .hasFailures()
            .hasSingleFailureWithMessage("Prompt injection detected")
            .assertSingleFailureSatisfied(failure -> assertThat(failure)...)
            .withFailures().....
    }
}
```
:::info
See the GuardrailAssertions and InputGuardrailResultAssert classes for more details.
:::
LangChain4j provides input guardrail implementations for several common use cases:

| Guardrail class | Description |
|---|---|
| MessageModeratorInputGuardrail | An input guardrail that validates user messages using a ModerationModel to detect potentially harmful, inappropriate, or policy-violating content. |
Output guardrails are functions executed after the LLM has produced its output. Failing an output guardrail allows for more advanced scenarios, such as retrying or reprompting, to help improve the response. They are invoked after all other operations, including function/tool calls, have happened.
Similar to input guardrails, output guardrails are created by implementing the OutputGuardrail interface. It has two variants of the validate method, at least one of which must be implemented:

```java
OutputGuardrailResult validate(AiMessage responseFromLLM);

OutputGuardrailResult validate(OutputGuardrailRequest params);
```
The first variant is used for simple guardrails, or when the guardrail only needs access to the resulting AiMessage.
The second variant is for more complex guardrails that need more information, such as the entire chat response, chat memory/history, user message template, or variables that were passed to the template. See OutputGuardrailRequest for more information.
Some examples of things you could do:
Output guardrails can have the following outcomes. There are helper methods on the OutputGuardrail interface that can provide the outcomes:
| Outcome | Helper method on OutputGuardrail | Description |
|---|---|---|
| success | success() | The output is valid. |
| success with rewrite | successWith(String) or successWith(String, Object) | Similar to success, except the output isn't valid in its original form and has been rewritten to make it valid. |
| failure | failure(String) or failure(String, Throwable) | The output is invalid, but the remaining guardrails in the chain are still executed in order to accumulate all possible validation problems. The accumulated failures are then reported through an OutputGuardrailException. |
| fatal | fatal(String) or fatal(String, Throwable) | The output is invalid and execution is halted with an OutputGuardrailException thrown to the caller. |
| fatal with retry | retry(String) or retry(String, Throwable) | Similar to fatal, except the LLM is called again with the same prompt and chat history as the original call. If the output is still invalid once the retries are exhausted, execution is halted with an OutputGuardrailException thrown to the caller. |
| fatal with reprompt | reprompt(String, String) or reprompt(String, Throwable, String) | Similar to fatal with retry, except the LLM is called again with a new prompt supplied by the guardrail. If the output is still invalid once the retries are exhausted, execution is halted with an OutputGuardrailException thrown to the caller. |

There are several ways to declare output guardrails, listed here in order of precedence:

1. OutputGuardrail implementation class names or instances set directly on the AiServices builder.
2. @OutputGuardrails annotations placed on an individual AI Service method.
3. @OutputGuardrails annotation placed on an AI Service class.

Regardless of how they are declared, output guardrails are always executed in the order they appear in the list.

OutputGuardrail implementation class names or instances set directly on the AiServices builder have the highest precedence: if guardrails are also declared in any other way, the ones set on the builder are used.
```java
public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .outputGuardrailClasses(FirstOutputGuardrail.class, SecondOutputGuardrail.class)
    .build();
```

or

```java
public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .outputGuardrails(new FirstOutputGuardrail(), new SecondOutputGuardrail())
    .build();
```
In the first scenario, classes that implement OutputGuardrail are passed. New instances of these classes are created dynamically using reflection.
:::info
The way classes are converted to instances can be customized. For example, frameworks that use dependency injection (like Quarkus or Spring) can use extension points to provide instances based on how they manage class instances rather than creating new instances via reflection each time.
:::
@OutputGuardrails annotations placed on individual AI Service methods have the next highest precedence.
```java
public interface Assistant {
    @OutputGuardrails({ FirstOutputGuardrail.class, SecondOutputGuardrail.class })
    String chat(String question);

    String doSomethingElse(String question);
}

var assistant = AiServices.create(Assistant.class, chatModel);
```
In this example, only the chat method has guardrails.
- For the chat method, FirstOutputGuardrail is invoked first.
- SecondOutputGuardrail will only be invoked if FirstOutputGuardrail does not produce a fatal, fatal with retry, or fatal with reprompt result.
- SecondOutputGuardrail will receive the output of FirstOutputGuardrail.
- If SecondOutputGuardrail succeeds after a retry or reprompt, then both FirstOutputGuardrail and SecondOutputGuardrail are re-executed.

The doSomethingElse method does not have any guardrails.
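The retry and reprompt flow described above can be approximated with the following plain Java sketch. It is a simplified model, not the LangChain4j implementation: the types are hypothetical, the accumulating failure outcome is left out for brevity, a reprompt simply replaces the prompt, and it assumes the initial call does not count against the retry limit.

```java
import java.util.List;
import java.util.function.Function;

class OutputRetrySketch {

    enum Outcome { SUCCESS, FATAL, RETRY, REPROMPT }

    // 'text' holds the new prompt for REPROMPT and is unused otherwise.
    record Result(Outcome outcome, String text) {
        static Result success() {
            return new Result(Outcome.SUCCESS, null);
        }
    }

    // Calls the model, runs the chain, and on RETRY/REPROMPT calls the model
    // again and re-executes the whole chain from the first guardrail.
    // ASSUMPTION: the initial call does not count against maxRetries.
    static String chat(Function<String, String> model,
                       List<Function<String, Result>> chain,
                       String prompt,
                       int maxRetries) {
        String currentPrompt = prompt;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            String output = model.apply(currentPrompt);
            Result blocking = firstNonSuccess(chain, output);
            if (blocking == null) {
                return output; // every guardrail passed
            }
            if (blocking.outcome() == Outcome.FATAL) {
                throw new IllegalStateException("Output rejected");
            }
            if (blocking.outcome() == Outcome.REPROMPT) {
                currentPrompt = blocking.text();
            }
            // RETRY keeps the same prompt for the next attempt.
        }
        throw new IllegalStateException("Output still invalid after " + maxRetries + " retries");
    }

    // Returns the first result that is not a success, or null if all guardrails pass.
    static Result firstNonSuccess(List<Function<String, Result>> chain, String output) {
        for (Function<String, Result> guardrail : chain) {
            Result result = guardrail.apply(output);
            if (result.outcome() != Outcome.SUCCESS) {
                return result;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Fake model: only gives a usable answer when told to answer strictly.
        Function<String, String> model = p -> p.contains("strictly") ? "yes" : "maybe";
        Function<String, Result> yesOrNo = out -> out.equals("yes") || out.equals("no")
                ? Result.success()
                : new Result(Outcome.REPROMPT, "Answer strictly with yes or no.");

        System.out.println(chat(model, List.of(yesOrNo), "Is water wet?", 2));
    }
}
```

In this model, passing 0 as the retry limit means the first invalid output fails immediately, without calling the model again.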
@OutputGuardrails annotation placed on an AI Service class has the lowest precedence.
```java
@OutputGuardrails({ FirstOutputGuardrail.class, SecondOutputGuardrail.class })
public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.create(Assistant.class, chatModel);
```
In this example, both the chat and doSomethingElse methods have the guardrails.
- FirstOutputGuardrail is invoked first.
- SecondOutputGuardrail will only be invoked if FirstOutputGuardrail does not produce a fatal, fatal with retry, or fatal with reprompt result.
- SecondOutputGuardrail will receive the output of FirstOutputGuardrail.
- If SecondOutputGuardrail succeeds after a retry or reprompt, then both FirstOutputGuardrail and SecondOutputGuardrail are re-executed.

Output guardrails have the following additional configuration that can be supplied:

| Configuration | Description |
|---|---|
| maxRetries | The maximum number of retries for an output guardrail when performing a retry or reprompt. Defaults to 2. Set to 0 to disable retries. |

```java
public interface MethodLevelAssistant {
    @OutputGuardrails(
        value = { FirstOutputGuardrail.class, SecondOutputGuardrail.class },
        maxRetries = 10
    )
    String chat(String question);
}

var assistant = AiServices.create(MethodLevelAssistant.class, chatModel);
```
```java
@OutputGuardrails(
    value = { FirstOutputGuardrail.class, SecondOutputGuardrail.class },
    maxRetries = 10
)
public interface ClassLevelAssistant {
    String chat(String question);
}

var assistant = AiServices.create(ClassLevelAssistant.class, chatModel);
```
Using the AiServices builder:

```java
public interface Assistant {
    String chat(String message);
}

var outputGuardrailsConfig = OutputGuardrailsConfig.builder()
    .maxRetries(10)
    .build();

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .outputGuardrailsConfig(outputGuardrailsConfig)
    .outputGuardrailClasses(FirstOutputGuardrail.class, SecondOutputGuardrail.class)
    .build();
```
Output guardrails can also work for operations with streaming responses:
```java
public interface StreamingAssistant {
    @OutputGuardrails({ FirstOutputGuardrail.class, SecondOutputGuardrail.class })
    TokenStream streamingChat(String message);
}
```
In this scenario, the output guardrails will be executed once the entire stream is complete, or more specifically, when TokenStream.onCompleteResponse is called. Partial responses (onPartialResponse) are buffered and replayed once the guardrails succeed.
If a retry or reprompt in the chain eventually succeeds, the entire chain is re-executed synchronously. Each guardrail is re-executed one after the other in the original order. Once the chain completes, the result is passed to TokenStream.onCompleteResponse.
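The buffer-then-replay idea can be sketched as follows. This is a conceptual model in plain Java with a hypothetical `GuardedStream` class. It does not use TokenStream or any other LangChain4j type, and it leaves out retries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

class BufferedStreamSketch {

    // Buffers partial responses and only forwards them to the caller's handlers
    // once the guardrail has accepted the complete response.
    static class GuardedStream {
        private final List<String> buffer = new ArrayList<>();
        private final Predicate<String> guardrail;
        private final Consumer<String> onPartial;
        private final Consumer<String> onComplete;

        GuardedStream(Predicate<String> guardrail,
                      Consumer<String> onPartial,
                      Consumer<String> onComplete) {
            this.guardrail = guardrail;
            this.onPartial = onPartial;
            this.onComplete = onComplete;
        }

        // Called for every token produced by the model: nothing is forwarded yet.
        void partial(String token) {
            buffer.add(token);
        }

        // Called when the model is done: validate first, then replay the buffer.
        void complete() {
            String full = String.join("", buffer);
            if (!guardrail.test(full)) {
                throw new IllegalStateException("Output guardrail rejected the response");
            }
            buffer.forEach(onPartial);
            onComplete.accept(full);
        }
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        GuardedStream stream = new GuardedStream(
                text -> !text.contains("secret"),
                seen::add,
                full -> System.out.println("complete: " + full));

        stream.partial("Hello, ");
        stream.partial("world");
        System.out.println("forwarded before completion: " + seen.size());
        stream.complete();
        System.out.println("forwarded after completion: " + seen.size());
    }
}
```

The trade-off is visible in the sketch: the caller never sees tokens from a response that is later rejected, but it also sees nothing until the whole response has been produced and validated.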
LangChain4j provides output guardrail implementations for several common use cases:

| Guardrail class | Description |
|---|---|
| JsonExtractorOutputGuardrail | An output guardrail that checks whether a response can be successfully deserialized from JSON to an object of a certain type. It is designed to be extended (it has protected methods that can be overridden to customize behavior). |

There are some unit testing utilities based on AssertJ in the langchain4j-test module.
Once you have the dependency, you can perform these kinds of validations:
```java
import static dev.langchain4j.test.guardrail.GuardrailAssertions.assertThat;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.guardrail.GuardrailResult.Result;

class Tests {
    MyOutputGuardrail outputGuardrail = new MyOutputGuardrail();

    @Test
    void test() {
        var aiMessage = AiMessage.from("Some output");
        var result = outputGuardrail.validate(aiMessage);

        // These are just some examples of what you can do
        assertThat(result)
            .isSuccessful()
            .hasResult(Result.FATAL)
            .hasFailures()
            .hasSingleFailureWithMessage("Hallucination detected!")
            .hasSingleFailureWithMessageAndReprompt("Hallucination detected!", "Please LLM don't hallucinate!")
            .assertSingleFailureSatisfied(failure -> assertThat(failure)...)
            .withFailures().....
    }
}
```
:::info
See the GuardrailAssertions and OutputGuardrailResultAssert classes for more details.
:::
You can mix and match input and output guardrails however you like!
```java
public class MyObjectJsonOutputGuardrail extends JsonExtractorOutputGuardrail<MyObject> {
    public MyObjectJsonOutputGuardrail() {
        super(MyObject.class);
    }
}

@InputGuardrails({ FirstInputGuardrail.class, SecondInputGuardrail.class })
@OutputGuardrails(value = SomeOutputGuardrail.class, maxRetries = 5)
public interface Assistant {
    String chat(String message);

    @InputGuardrails(PromptInjectionGuardrail.class)
    @OutputGuardrails(MyObjectJsonOutputGuardrail.class)
    MyObject chatAndReturnJson(String message);
}

var outputGuardrailsConfig = OutputGuardrailsConfig.builder()
    .maxRetries(10)
    .build();

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .inputGuardrails(new AnotherInputGuardrail())
    .outputGuardrailsConfig(outputGuardrailsConfig)
    .build();
```
In this example, all the methods on the Assistant have a single input guardrail, AnotherInputGuardrail, because it is set on the AiServices builder. Additionally, all the output guardrails have a maxRetries value == 10, because the config is also set on the AiServices builder.
The chat method has a single output guardrail, SomeOutputGuardrail, with a maxRetries value == 10.
The chatAndReturnJson method has a single output guardrail, MyObjectJsonOutputGuardrail, with a maxRetries value == 10.
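The precedence rules that produce this result can be written down as a small resolution function. The sketch below is only a model of the rule "builder, then method, then class": the `resolve` helper is hypothetical and guardrails are represented by their names rather than real classes.

```java
import java.util.List;

class PrecedenceSketch {

    // Returns the guardrails that apply to a method: the builder declaration first,
    // then the method-level annotation, then the class-level annotation.
    // An empty list stands for "not declared at this level".
    static List<String> resolve(List<String> onBuilder, List<String> onMethod, List<String> onClass) {
        if (!onBuilder.isEmpty()) {
            return onBuilder;
        }
        if (!onMethod.isEmpty()) {
            return onMethod;
        }
        return onClass;
    }

    public static void main(String[] args) {
        List<String> none = List.of();

        // Input guardrails of chatAndReturnJson: the builder declaration wins.
        System.out.println(resolve(
                List.of("AnotherInputGuardrail"),
                List.of("PromptInjectionGuardrail"),
                List.of("FirstInputGuardrail", "SecondInputGuardrail")));

        // Output guardrails of chatAndReturnJson: nothing on the builder,
        // so the method-level annotation wins over the class-level one.
        System.out.println(resolve(
                none,
                List.of("MyObjectJsonOutputGuardrail"),
                List.of("SomeOutputGuardrail")));

        // Output guardrails of chat: only the class-level declaration exists.
        System.out.println(resolve(none, none, List.of("SomeOutputGuardrail")));
    }
}
```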
The guardrail system was built in a composable way so it can be extended and reused in other downstream frameworks (such as Quarkus or Spring Boot). This section describes some of the extension points or "hooks" that are provided.
All of these extension points utilize the Java Service Provider Interface (Java SPI).
| Extension point interface | Purpose |
|---|---|
| ClassInstanceFactory | Provides instances of classes. Frameworks can supply their own implementations, for example a CDIClassInstanceFactory (Quarkus) or an ApplicationContextClassInstanceFactory (Spring). |
| ClassMetadataProviderFactory | Provides access to class metadata. It is used to inspect AiService interfaces and to find and process the @InputGuardrails/@OutputGuardrails annotations. ReflectionBasedClassMetadataProviderFactory is the default implementation if no others are found, providing class metadata using reflection. |
| GuardrailServiceBuilderFactory | Provides builder instances for building GuardrailService instances. An application or framework would implement this if it needed to customize the way GuardrailService instances are built. |
| InputGuardrailsConfigBuilderFactory | SPI for overriding and/or extending the default InputGuardrailsConfigBuilder. |
| OutputGuardrailsConfigBuilderFactory | SPI for overriding and/or extending the default OutputGuardrailsConfigBuilder. |
| InputGuardrailExecutorBuilderFactory | SPI for overriding and/or extending the default InputGuardrailExecutorBuilder responsible for building [InputGuardrailExecutor](https://github.com/langchain4j/langchain4j/blob/main/langchain4j-core/src/main/java/dev/langchain4j/guardrail/InputGuardrailExecutor.java) instances. |
| [OutputGuardrailExecutorBuilderFactory](https://github.com/langchain4j/langchain4j/blob/main/langchain4j-core/src/main/java/dev/langchain4j/spi/guardrail/OutputGuardrailExecutorBuilderFactory.java) | SPI for overriding and/or extending the default OutputGuardrailExecutorBuilder responsible for building OutputGuardrailExecutor instances. |
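For readers unfamiliar with the Java SPI, the general lookup pattern behind extension points like these might look as follows. The `InstanceFactory` interface and its reflection-based fallback are hypothetical names invented for this sketch, not the LangChain4j interfaces listed above. Only `java.util.ServiceLoader` is a real API here.

```java
import java.util.ServiceLoader;

class SpiLookupSketch {

    // Hypothetical extension point, loosely modeled on the idea of a class instance factory.
    interface InstanceFactory {
        <T> T create(Class<T> type);
    }

    // Fallback used when no provider is registered: plain reflection.
    static class ReflectionInstanceFactory implements InstanceFactory {
        @Override
        public <T> T create(Class<T> type) {
            try {
                return type.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
            }
        }
    }

    // Looks for a provider registered under META-INF/services/<interface name>
    // and falls back to the reflection-based default when none is found.
    static InstanceFactory loadFactory() {
        return ServiceLoader.load(InstanceFactory.class)
                .findFirst()
                .orElseGet(ReflectionInstanceFactory::new);
    }

    public static void main(String[] args) {
        InstanceFactory factory = loadFactory();
        StringBuilder created = factory.create(StringBuilder.class);
        System.out.println(factory.getClass().getSimpleName()
                + " created " + created.getClass().getSimpleName());
    }
}
```

A framework integration would ship its own implementation together with a provider-configuration file, and the lookup would then return that implementation instead of the fallback.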