ChatGPT

If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.

python

%pip install llama-index-llms-openai

python

!pip install llama-index

python

import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from IPython.display import Markdown, display

Download Data

python

!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

Load documents, build the VectorStoreIndex

python

# load documents
documents = SimpleDirectoryReader("./data/paul_graham").load_data()

python

# set global settings config
llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
Settings.llm = llm
Settings.chunk_size = 512

python

index = VectorStoreIndex.from_documents(documents)

Query Index

By default, with the help of langchain's PromptSelector abstraction, we use a modified refine prompt tailored for ChatGPT-use if the ChatGPT model is used.

python

query_engine = index.as_query_engine(
    similarity_top_k=3,
    streaming=True,
)
response = query_engine.query(
    "What did the author do growing up?",
)

python

response.print_response_stream()

python

query_engine = index.as_query_engine(
    similarity_top_k=5,
    streaming=True,
)
response = query_engine.query(
    "What did the author do during his time at RISD?",
)

python

response.print_response_stream()

Refine Prompt: Here is the chat refine prompt

python

from llama_index.core.prompts.chat_prompts import CHAT_REFINE_PROMPT

python

dict(CHAT_REFINE_PROMPT.prompt)

Query Index (Using the standard Refine Prompt)

If we use the "standard" refine prompt (where the prompt is one text template instead of multiple messages), we find that the results over ChatGPT are worse.

python

from llama_index.core.prompts.default_prompts import DEFAULT_REFINE_PROMPT

python

query_engine = index.as_query_engine(
    refine_template=DEFAULT_REFINE_PROMPT,
    similarity_top_k=5,
    streaming=True,
)
response = query_engine.query(
    "What did the author do during his time at RISD?",
)

python

response.print_response_stream()