Back to Llama Index

10K Analysis

docs/examples/usecases/10k_sub_question.ipynb

0.14.212.8 KB
Original Source

<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/usecases/10k_sub_question.ipynb" target="_parent"></a>

10K Analysis

In this demo, we explore answering complex queries by decomposing them into simpler sub-queries.

If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.

python
%pip install llama-index-llms-openai
python
!pip install llama-index
python
import nest_asyncio

nest_asyncio.apply()
python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai import OpenAI

from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine

Configure LLM service

python
import os

os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY"
python
from llama_index.core import Settings

Settings.llm = OpenAI(temperature=0.2, model="gpt-3.5-turbo")

Download Data

python
!mkdir -p 'data/10k/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'

Load data

python
lyft_docs = SimpleDirectoryReader(
    input_files=["./data/10k/lyft_2021.pdf"]
).load_data()
uber_docs = SimpleDirectoryReader(
    input_files=["./data/10k/uber_2021.pdf"]
).load_data()

Build indices

python
lyft_index = VectorStoreIndex.from_documents(lyft_docs)
python
uber_index = VectorStoreIndex.from_documents(uber_docs)

Build query engines

python
lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)
python
uber_engine = uber_index.as_query_engine(similarity_top_k=3)
python
query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_engine,
        metadata=ToolMetadata(
            name="lyft_10k",
            description=(
                "Provides information about Lyft financials for year 2021"
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=uber_engine,
        metadata=ToolMetadata(
            name="uber_10k",
            description=(
                "Provides information about Uber financials for year 2021"
            ),
        ),
    ),
]

s_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools
)

Run queries

python
response = s_engine.query(
    "Compare and contrast the customer segments and geographies that grew the"
    " fastest"
)
python
print(response)
python
response = s_engine.query(
    "Compare revenue growth of Uber and Lyft from 2020 to 2021"
)
python
print(response)