Advanced Prompt Techniques (Variable Mappings, Functions)

In this notebook we show some advanced prompt techniques. These features allow you to define more custom/expressive prompts, re-use existing ones, and also express certain operations in fewer lines of code.

We show the following features:

Partial formatting
Prompt template variable mappings
Prompt function mappings
Dynamic few-shot examples

python

%pip install llama-index-llms-openai

1. Partial Formatting

Partial formatting (partial_format) allows you to partially format a prompt, filling in some variables while leaving others to be filled in later.

This is a nice convenience function so you don't have to maintain all the required prompt variables all the way down to format, you can partially format as they come in.

This will create a copy of the prompt template.

python

from llama_index.core.prompts import RichPromptTemplate

qa_prompt_tmpl_str = """\
Context information is below.
---------------------
{{ context_str }}
---------------------
Given the context information and not prior knowledge, answer the query.
Please write the answer in the style of {{ tone_name }}
Query: {{ query_str }}
Answer: \
"""

prompt_tmpl = RichPromptTemplate(qa_prompt_tmpl_str)

python

partial_prompt_tmpl = prompt_tmpl.partial_format(tone_name="Shakespeare")

python

partial_prompt_tmpl.kwargs

python

fmt_prompt = partial_prompt_tmpl.format(
    context_str="In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters",
    query_str="How many params does llama 2 have",
)
print(fmt_prompt)

We can also use format_messages to format the prompt into ChatMessage objects.

python

fmt_prompt = partial_prompt_tmpl.format_messages(
    context_str="In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters",
    query_str="How many params does llama 2 have",
)
print(fmt_prompt)

2. Prompt Template Variable Mappings

Template var mappings allow you to specify a mapping from the "expected" prompt keys (e.g. context_str and query_str for response synthesis), with the keys actually in your template.

This allows you re-use your existing string templates without having to annoyingly change out the template variables.

python

from llama_index.core.prompts import RichPromptTemplate

# NOTE: here notice we use `my_context` and `my_query` as template variables
qa_prompt_tmpl_str = """\
Context information is below.
---------------------
{{ my_context }}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {{ my_query }}
Answer: \
"""

template_var_mappings = {"context_str": "my_context", "query_str": "my_query"}

prompt_tmpl = RichPromptTemplate(
    qa_prompt_tmpl_str, template_var_mappings=template_var_mappings
)

python

fmt_prompt = prompt_tmpl.format(
    context_str="In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters",
    query_str="How many params does llama 2 have",
)
print(fmt_prompt)

3. Prompt Function Mappings

You can also pass in functions as template variables instead of fixed values.

This allows you to dynamically inject certain values, dependent on other values, during query-time.

Here are some basic examples. We show more advanced examples (e.g. few-shot examples) in our Prompt Engineering for RAG guide.

python

from llama_index.core.prompts import RichPromptTemplate

qa_prompt_tmpl_str = """\
Context information is below.
---------------------
{{ context_str }}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {{ query_str }}
Answer: \
"""


def format_context_fn(**kwargs):
    # format context with bullet points
    context_list = kwargs["context_str"].split("\n\n")
    fmtted_context = "\n\n".join([f"- {c}" for c in context_list])
    return fmtted_context


prompt_tmpl = RichPromptTemplate(
    qa_prompt_tmpl_str, function_mappings={"context_str": format_context_fn}
)

python

context_str = """\
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.

Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases.

Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models.
"""

fmt_prompt = prompt_tmpl.format(
    context_str=context_str, query_str="How many params does llama 2 have"
)
print(fmt_prompt)

4. Dynamic few-shot examples

Using the function mappings, you can also dynamically inject few-shot examples based on other prompt variables.

Here's an example that uses a vector store to dynamically inject few-shot text-to-sql examples based on the query.

First, lets define a text-to-sql prompt template.

python

text_to_sql_prompt_tmpl_str = """\
You are a SQL expert. You are given a natural language query, and your job is to convert it into a SQL query.

Here are some examples of how you should convert natural language to SQL:
<examples>
{{ examples }}
</examples>

Now it's your turn.

Query: {{ query_str }}
SQL: 
"""

Given this prompt template, lets define and index some few-shot text-to-sql examples.

python

import os

os.environ["OPENAI_API_KEY"] = "sk-..."

python

from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.schema import TextNode
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Set global default LLM and embed model
Settings.llm = OpenAI(model="gpt-4o-mini")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Setup few-shot examples
example_nodes = [
    TextNode(
        text="Query: How many params does llama 2 have?\nSQL: SELECT COUNT(*) FROM llama_2_params;"
    ),
    TextNode(
        text="Query: How many layers does llama 2 have?\nSQL: SELECT COUNT(*) FROM llama_2_layers;"
    ),
]

# Create index
index = VectorStoreIndex(nodes=example_nodes)

# Create retriever
retriever = index.as_retriever(similarity_top_k=1)

With our retriever, we can create our prompt template with function mappings to dynamically inject few-shot examples based on the query.

python

from llama_index.core.prompts import RichPromptTemplate


def get_examples_fn(**kwargs):
    query = kwargs["query_str"]
    examples = retriever.retrieve(query)
    return "\n\n".join(node.text for node in examples)


prompt_tmpl = RichPromptTemplate(
    text_to_sql_prompt_tmpl_str,
    function_mappings={"examples": get_examples_fn},
)

python

prompt = prompt_tmpl.format(
    query_str="What are the number of parameters in the llama 2 model?"
)
print(prompt)

python

response = Settings.llm.complete(prompt)
print(response.text)