Dynamic In-Context Learning (DICL) is an inference-time optimization that improves LLM performance by incorporating relevant historical examples into your prompt. Instead of hardcoding static examples into your prompts, DICL selects the most relevant examples at inference time.
Here's how it works: at inference time, DICL embeds the incoming input, retrieves the most similar curated examples from your database, and includes them in the prompt as context before calling the model.
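To make the flow concrete, here is a minimal sketch of the retrieval logic in Python. It is not TensorZero's implementation: the `embed` function, the `(input, output, embedding)` example tuples, and the message format are assumptions for illustration.

```python
import numpy as np

def build_dicl_messages(input_text, examples, embed, k=10):
    """Sketch of the DICL flow: embed the input, retrieve the k nearest
    curated examples by cosine distance, and inject them as context."""
    # `examples` is a list of (input, output, embedding) tuples curated offline;
    # `embed` maps text to a unit-normalized vector.
    query = embed(input_text)
    # For unit vectors, cosine distance = 1 - dot product.
    distances = [1.0 - float(np.dot(query, emb)) for _, _, emb in examples]
    nearest = sorted(range(len(examples)), key=distances.__getitem__)[:k]
    messages = []
    for i in nearest:
        example_input, example_output, _ = examples[i]
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    # The real input comes last, after the retrieved demonstrations.
    messages.append({"role": "user", "content": input_text})
    return messages
```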
DICL is particularly useful if you have limited high-quality data.
| Criterion | Impact | Details |
|---|---|---|
| Complexity | Low | Requires data curation; few parameters |
| Data Efficiency | High | Achieves good results with limited data |
| Optimization Ceiling | Moderate | Plateaus quickly with more data; prompt-only, but dynamic |
| Optimization Cost | Low | Generates embeddings for curated examples |
| Inference Cost | High | Input tokens scale in proportion to k |
| Inference Latency | Moderate | Requires embedding and retrieval before LLM call |
DICL tends to work best when:

- you have limited but high-quality data, and
- inputs are relatively short, since the prompt grows in proportion to k (see below), degrading performance for long inputs.
<Tip>
You can find a complete runnable example of this guide on GitHub.
</Tip>

<Steps>
<Step title="Configure your LLM application">

Define a function with a baseline variant for your application.
```toml
[functions.extract_entities]
type = "json"
output_schema = "functions/extract_entities/output_schema.json"

[functions.extract_entities.variants.baseline]
type = "chat_completion"
model = "openai::gpt-5-mini"
templates.system.path = "functions/extract_entities/initial_prompt/system_template.minijinja"
json_mode = "strict"
```
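As a quick sanity check, you can call the baseline variant through the gateway. This is a minimal sketch assuming the gateway runs at `http://localhost:3000`; the sample text is placeholder input.

```python
from tensorzero import TensorZeroGateway

with TensorZeroGateway.build_http(gateway_url="http://localhost:3000") as t0:
    response = t0.inference(
        function_name="extract_entities",
        input={
            "messages": [
                {"role": "user", "content": "TensorZero opened an office in New York."}
            ]
        },
    )
    print(response.output)
```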
If your prompt has a lot of boilerplate, configure prompt templates. DICL operates on template variables, so separating out the boilerplate improves retrieval (and therefore inference quality) and mitigates the marginal cost and latency of the extra context. Move the boilerplate into system_instructions in your variant configuration instead.
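For example, the DICL variant you'll define in a later step could point at a file holding the shared boilerplate (the path below is hypothetical):

```toml
[functions.extract_entities.variants.dicl]
# ... other settings (see the later steps) ...
system_instructions = "functions/extract_entities/system_instructions.txt"
```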
For reference, the baseline system template looks like this:

```minijinja
You are an assistant that is performing a named entity recognition task.
Your job is to extract entities from a given text.

The entities you are extracting are:

- people
- organizations
- locations
- miscellaneous other entities

Please return the entities in the following JSON format:

{
    "person": ["person1", "person2", ...],
    "organization": ["organization1", "organization2", ...],
    "location": ["location1", "location2", ...],
    "miscellaneous": ["miscellaneous1", "miscellaneous2", ...]
}
```
</Step>
<Step title="Curate a dataset">

After deploying the TensorZero Gateway with Postgres, build a dataset of good examples for the extract_entities function you configured.
You can create datapoints from historical inferences or from external or synthetic datasets.

<Tip>
DICL's performance degrades as the curated examples become noisier with examples of bad behavior. There is a trade-off between dataset size and datapoint quality.
</Tip>
</Step>
<Step title="Configure DICL">

Configure DICL by specifying the name of your function, variant, and embedding model.
```python
from tensorzero import DICLOptimizationConfig

optimization_config = DICLOptimizationConfig(
    function_name="extract_entities",
    variant_name="dicl",
    embedding_model="openai::text-embedding-3-small",
    k=10,  # how many examples are retrieved and injected as context
    model="openai::gpt-5-mini",  # LLM that generates outputs using the retrieved examples
)
```
You can also define a custom embedding model in your configuration.
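For instance, a custom embedding model definition might look like the following sketch, assuming an OpenAI provider (the name `my_embedding_model` is hypothetical); you would then set `embedding_model = "my_embedding_model"`:

```toml
[embedding_models.my_embedding_model]
routing = ["openai"]

[embedding_models.my_embedding_model.providers.openai]
type = "openai"
model_name = "text-embedding-3-small"
```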
<Tip>
You should experiment with different choices of k.
Typical values are 3-10, with smaller values when inputs tend to be larger.

If you see inferences with irrelevant examples, consider setting a max_distance in your variant configuration later. With this setting, the retrieval step can return fewer than k examples if they don't meet a cosine distance threshold. Make sure to tune the value according to your embedding model.
</Tip>
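For example, the variant configuration from the final step could add a threshold like this (the value 0.5 is a hypothetical starting point, not a recommendation):

```toml
[functions.extract_entities.variants.dicl]
# ... other settings ...
max_distance = 0.5  # hypothetical; tune for your embedding model
```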
You can now launch your DICL optimization job using the TensorZero Gateway:
```python
# `t0` is a TensorZero client (see the gateway setup in the first step)
job_handle = t0.experimental_launch_optimization_workflow(
    function_name="extract_entities",
    template_variant_name="baseline",
    dataset_name="extract_entities_dataset",
    optimizer_config=optimization_config,
)

job_info = t0.experimental_poll_optimization(job_handle=job_handle)
```
DICL will embed all your training samples and store them in Postgres.
</Step>
<Step title="Update your configuration">

After optimization completes, add the DICL variant to your configuration:
```toml
[functions.extract_entities.variants.dicl]
type = "experimental_dynamic_in_context_learning"
embedding_model = "openai::text-embedding-3-small"
k = 10
model = "openai::gpt-5-mini"
json_mode = "strict"
```
The embedding_model in the configuration must match the embedding model you used during optimization.
That's it!
At inference time, the DICL variant will retrieve the k most similar examples from your training data and include them as context for in-context learning.
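To try the new variant directly, you can pin it at inference time. A minimal sketch, reusing the `t0` client from earlier (the sample text is placeholder input):

```python
response = t0.inference(
    function_name="extract_entities",
    variant_name="dicl",  # pin the DICL variant for testing
    input={
        "messages": [
            {"role": "user", "content": "Acme Corp. is hiring engineers in Paris."}
        ]
    },
)
print(response.output)
```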
<Tip>
You can run experiments comparing your baseline and DICL variants using adaptive A/B testing.
</Tip>
</Step>
</Steps>

## DICLOptimizationConfig

Configure DICL optimization by creating a `DICLOptimizationConfig` object with the following parameters: