Query Engine with Pydantic Outputs

Every query engine supports integrated structured responses through the following response_mode values in RetrieverQueryEngine:

refine
compact
tree_summarize
accumulate (beta, requires extra parsing to convert to objects)
compact_accumulate (beta, requires extra parsing to convert to objects)

In this notebook, we walk through a small example demonstrating the usage.

Under the hood, every LLM response will be a pydantic object. If that response needs to be refined or summarized, it is converted into a JSON string for the next response. Then, the final response is returned as a pydantic object.
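
The round trip described above can be sketched with pydantic alone. This is a minimal illustration, not LlamaIndex's internal code: the model mirrors the Biography class defined later in this notebook, and the field values are placeholder data. It uses .json() and parse_raw to match the rest of this notebook; on pydantic v2, model_dump_json and model_validate_json are the newer equivalents.

```python
from typing import List

from pydantic import BaseModel


class Biography(BaseModel):
    """Same shape as the Biography model used later in this notebook."""

    name: str
    best_known_for: List[str]
    extra_info: str


# Stand-in for the structured result of one intermediate LLM call
# (placeholder values, not real model output).
partial = Biography(
    name="Example Person",
    best_known_for=["first item", "second item"],
    extra_info="Illustrative data only.",
)

# Serialize to a JSON string so it can be embedded in the next prompt.
as_json = partial.json()

# Parse the JSON string back into a typed object, as is done for the final answer.
restored = Biography.parse_raw(as_json)

print(as_json)
print(restored == partial)
```

Because the intermediate form is plain JSON text, any refine or summarize step can pass it along like an ordinary string response.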

NOTE: This can technically work with any LLM, but non-OpenAI support is still in development and considered beta.

Setup

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

python

%pip install llama-index-llms-anthropic
%pip install llama-index-llms-openai

python

!pip install llama-index

python

import os
import openai

os.environ["OPENAI_API_KEY"] = "sk-..."
openai.api_key = os.environ["OPENAI_API_KEY"]

Download Data

python

!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

python

from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader("./data/paul_graham").load_data()

Create our Pydantic Output Object

python

from typing import List
from pydantic import BaseModel


class Biography(BaseModel):
    """Data model for a biography."""

    name: str
    best_known_for: List[str]
    extra_info: str

Create the Index + Query Engine (OpenAI)

When using OpenAI, the function calling API will be leveraged for reliable structured outputs.
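
Function calling works from a JSON schema that describes the expected arguments. The exact conversion LlamaIndex performs is not shown in this notebook, so the sketch below only prints the schema that pydantic derives from the Biography model, which is the kind of description a function-calling request needs. It uses .schema(), which is available on pydantic v1 and v2 (v2 also offers model_json_schema).

```python
import json
from typing import List

from pydantic import BaseModel


class Biography(BaseModel):
    """Data model for a biography."""

    name: str
    best_known_for: List[str]
    extra_info: str


# JSON schema generated by pydantic from the class definition.
schema = Biography.schema()

# Field names, types, and the docstring-derived description all appear here.
print(json.dumps(schema, indent=2))
```

The class docstring and field names end up in the schema, so descriptive names and docstrings give the model more to work with.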

python

from llama_index.core import VectorStoreIndex
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)

index = VectorStoreIndex.from_documents(
    documents,
)

python

query_engine = index.as_query_engine(
    output_cls=Biography, response_mode="compact", llm=llm
)

python

response = query_engine.query("Who is Paul Graham?")

python

print(response.name)
print(response.best_known_for)
print(response.extra_info)

python

# the full pydantic object is available as response.response
print(type(response.response))

Create the Index + Query Engine (Non-OpenAI, Beta)

When using an LLM that does not support function calling, we rely on the LLM to write the JSON itself, and we parse the JSON into the proper pydantic object.
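
The parsing step can be sketched as follows. This is an illustration under stated assumptions, not LlamaIndex's actual parser: parse_biography is a hypothetical helper, and the raw strings are hand-written stand-ins for LLM output. It shows why this path is less reliable than function calling: the JSON has to be located in free text, and it may fail validation.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Biography(BaseModel):
    """Data model for a biography."""

    name: str
    best_known_for: List[str]
    extra_info: str


def parse_biography(llm_text: str) -> Biography:
    """Hypothetical helper: extract the outermost {...} span and validate it."""
    start = llm_text.find("{")
    end = llm_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in LLM output")
    return Biography.parse_raw(llm_text[start : end + 1])


# Hand-written stand-in for a reply that wraps the JSON in extra prose.
raw = (
    "Here is the result:\n"
    '{"name": "Example Person", "best_known_for": ["a", "b"], "extra_info": "n/a"}\n'
    "Hope that helps!"
)
bio = parse_biography(raw)
print(bio.name, bio.best_known_for)

# A reply that omits required fields is rejected by pydantic validation.
try:
    parse_biography('{"name": "Example Person"}')
except ValidationError as err:
    print("validation failed for", len(err.errors()), "fields")
```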

python

import os

os.environ["ANTHROPIC_API_KEY"] = "sk-..."

python

from llama_index.core import VectorStoreIndex
from llama_index.llms.anthropic import Anthropic

llm = Anthropic(model="claude-instant-1.2", temperature=0.1)

index = VectorStoreIndex.from_documents(
    documents,
)

python

query_engine = index.as_query_engine(
    output_cls=Biography, response_mode="tree_summarize", llm=llm
)

python

response = query_engine.query("Who is Paul Graham?")

python

print(response.name)
print(response.best_known_for)
print(response.extra_info)

python

# the full pydantic object is available as response.response
print(type(response.response))

Accumulate Examples (Beta)

Using accumulate with pydantic objects requires some extra parsing. This is still a beta feature, but the accumulated responses can be converted into pydantic objects.

python

from typing import List
from pydantic import BaseModel


class Company(BaseModel):
    """Data model for a company mentioned in the text."""

    company_name: str
    context_info: str

python

from llama_index.core import VectorStoreIndex
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)

index = VectorStoreIndex.from_documents(
    documents,
)

python

query_engine = index.as_query_engine(
    output_cls=Company, response_mode="accumulate", llm=llm
)

python

response = query_engine.query("What companies are mentioned in the text?")

In accumulate mode, the individual responses are joined by a default separator, and each one is prepended with a prefix such as Response 1:.

python

companies = []

# split by the default separator
for response_str in str(response).split("\n---------------------\n"):
    # remove the prefix: every response starts like `Response 1: {...}`,
    # so we find the first opening brace and drop everything before it
    response_str = response_str[response_str.find("{") :]
    companies.append(Company.parse_raw(response_str))

python

print(companies)
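
To check the parsing step without calling an LLM, the same logic can be wrapped in a function and run on a hand-built string. This is a sketch: parse_accumulated is a hypothetical helper, and the sample text is made up to follow the separator and Response N: prefix described above.

```python
from typing import List

from pydantic import BaseModel


class Company(BaseModel):
    """Data model for a company mentioned in the text."""

    company_name: str
    context_info: str


# Default separator used in the parsing loop above.
SEPARATOR = "\n---------------------\n"


def parse_accumulated(text: str) -> List[Company]:
    """Hypothetical helper: split accumulated output and parse each JSON chunk."""
    companies = []
    for chunk in text.split(SEPARATOR):
        start = chunk.find("{")
        if start == -1:
            continue  # skip chunks that contain no JSON object
        companies.append(Company.parse_raw(chunk[start:]))
    return companies


# Made-up sample in the accumulate format (not real model output).
sample = SEPARATOR.join(
    [
        'Response 1: {"company_name": "Example Co", "context_info": "first chunk"}',
        'Response 2: {"company_name": "Sample Inc", "context_info": "second chunk"}',
    ]
)

parsed = parse_accumulated(sample)
print([c.company_name for c in parsed])
```

Skipping chunks without a brace keeps one malformed response from aborting the whole loop; chunks with invalid JSON still raise a validation error.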