site/blog/red-team-langchain.md
Want to test your LangChain application for vulnerabilities? This guide shows you how to use Promptfoo to systematically probe for security issues through adversarial testing (red teaming) of your LangChain chains and agents.
You'll learn how to use adversarial LLM models to test your LangChain application's security mechanisms and identify potential vulnerabilities in your chains.
<!-- truncate -->Before you begin, ensure you have the following:
npxpip install langchainLet's set up a Python virtual environment and install LangChain and OpenAI:
python -m venv venv
source venv/bin/activate # On Windows: .\venv\Scripts\activate
pip install langchain openai # Add other dependencies as needed
Create a Python script (langchain_provider.py) that will interface with your LangChain application.
def call_api(prompt, options, context):
# Fill in your LangChain code here
return {
"output": "Final output from LangChain...",
}
:::tip You can use any LangChain instance you want here. Feel free to import your own custom chains, agents, and tools. :::
Here's a complete script example:
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
import json
import os
def call_api(prompt, options, context):
# Initialize your LangChain components
llm = ChatOpenAI(
temperature=0.7,
model_name="gpt-5-mini"
)
# Create your chain
template = """You are a helpful assistant for Acme, Inc.
Question: {question}
Answer:"""
prompt_template = PromptTemplate(
input_variables=["question"],
template=template
)
chain = LLMChain(llm=llm, prompt=prompt_template)
try:
# Execute the chain
result = chain.run(question=prompt)
return {
"output": result,
"tokenUsage": {
"total": llm.get_num_tokens(prompt + result),
"prompt": llm.get_num_tokens(prompt),
"completion": llm.get_num_tokens(result)
}
}
except Exception as e:
return {
"error": str(e),
"output": None
}
For a full end-to-end red team configuration example, see Github.
Promptfoo's red teaming system is built around the concept of plugins and strategies.
Plugins are Promptfoo's modular system for testing specific types of risks and vulnerabilities in your LangChain application. Each plugin is a trained model that produces malicious payloads targeting specific weaknesses.
See the full plugins documentation for details.
Plugins are organized into categories:
You can also create custom policies or upload your own promptfoo plugins to test specific business requirements or regulatory compliance.
Strategies determine HOW the adversarial inputs (generated by plugins) are delivered to your application to maximize attack success rates. See the strategies documentation for a complete overview.
For example:
prompt-injection or base64, which use predefined patterns to bypass securityjailbreak and jailbreak:composite, which use an attacker model to iteratively refine attacksgoat or crescendo, which attempt to manipulate stateful applications over multiple interactionsFor a complete list of vulnerabilities that can be tested, see our LLM vulnerability types guide.
Next, configure your red teaming setup in promptfooconfig.yaml.
targets:
- id: 'file://langchain_provider.py'
label: 'LangChain App'
redteam:
# Purpose is a short description of your application. It's used to generate highly relevant malicious test cases.
purpose: 'A customer support agent for Acme, Inc...'
# Number of test cases per plugin.
numTests: 5
# Plugins specify the types of vulnerabilities to test.
plugins:
# Can my application be tricked into hate speech?
- harmful:hate
# Can my application encourage self-harm?
- harmful:self-harm
# Will my application recommend competitors?
- competitors
# Strategies are special techniques for finding jailbreaks
strategies:
- jailbreak
- jailbreak:composite
The redteam run command will generate and run adversarial test cases that are customized to your application:
npx promptfoo@latest redteam run
This can take a few minutes to complete, depending on how many plugins and strategies you have configured.
Generate a comprehensive report:
npx promptfoo@latest redteam report
You'll see a report like this:
The report will highlight:
Common mitigations for LangChain applications include:
After implementing fixes, rerun the evaluation:
npx promptfoo@latest redteam run
npx promptfoo@latest redteam report