docs/examples/data_connectors/GoogleChatDemo.ipynb
Demonstrates our Google Chat data connector.
If you're opening this notebook in Colab, you will probably need to install LlamaIndex 🦙.
%pip install llama-index llama-index-readers-google
This loader takes in IDs of Google Chat spaces or messages and parses the chat history into Documents. The space/message ID can be found in the URL of the space or message.
Before using this loader, you need to create a Google Cloud Platform (GCP) project with a Google Workspace account. Then, you need to authorize the app with user credentials. Follow the prerequisites and steps 1 and 2 of this guide. After downloading the client secret JSON file, rename it to credentials.json and save it in your project folder.
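Before constructing the reader, it can help to confirm that the renamed file is where you expect it. The sketch below is only an illustration: `find_credentials` is a hypothetical helper, not part of LlamaIndex, and it assumes the notebook is run from the project folder.

```python
from pathlib import Path


def find_credentials(project_dir: str = ".") -> Path:
    """Hypothetical helper: return the path to credentials.json or raise if it is missing."""
    path = Path(project_dir) / "credentials.json"
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected the client secret file at {path}; "
            "download it from your GCP project and rename it to credentials.json."
        )
    return path
```

Calling `find_credentials()` before creating the reader surfaces a missing or misnamed file as a clear error instead of a failure during authorization.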
This example parses a chat between two users. They first discuss math homework, then they plan a trip to San Francisco in a thread. At the end, they discuss finishing an essay. See the full thread here.
The example below loads the entire chat history into a SummaryIndex.
from llama_index.core import SummaryIndex
from llama_index.readers.google import GoogleChatReader
space_ids = [
    "AAAAtTPwdzg"
]  # The Google account you authenticated with must have access to this space
reader = GoogleChatReader()
docs = reader.load_data(space_names=space_ids)
index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query("What was the overall conversation about?")
from IPython.display import Markdown, display
display(Markdown(f"{response}"))
You can retrieve the chat history in ascending or descending chronological order with the order_asc parameter.
docs = reader.load_data(space_names=space_ids, order_asc=False)
index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query(
    "List the things that the users discussed in the order they were discussed in. Make the list short."
)
display(Markdown(f"{response}"))
Even though the messages were retrieved in reverse order, the list is still in the correct order because messages have a timestamp in their metadata.
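If you want to restore chronological order yourself rather than rely on the query engine, you can sort on the timestamp. The following is a minimal sketch using plain dictionaries as stand-ins for loaded documents; the `"timestamp"` key and the sample messages are hypothetical, so inspect the metadata of your own documents for the actual field name.

```python
from datetime import datetime

# Stand-ins for documents retrieved in descending order.
# The "timestamp" key is a hypothetical metadata field name.
messages = [
    {"text": "Let's finish the essay", "timestamp": "2024-06-25T14:55:00-07:00"},
    {"text": "Trip to SF?", "timestamp": "2024-06-25T14:30:00-07:00"},
    {"text": "Can you help with math HW?", "timestamp": "2024-06-25T14:05:00-07:00"},
]

# Parse each ISO 8601 timestamp and sort from oldest to newest.
chronological = sorted(
    messages, key=lambda m: datetime.fromisoformat(m["timestamp"])
)
topics = [m["text"] for m in chronological]
```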
Messages can be limited to a certain number using the num_messages parameter. However, the number of messages that are loaded may not be exactly this number. If order_asc is True, the reader takes the first num_messages messages within the given time frame. If order_asc is False, it takes the last num_messages messages within the time frame.
docs = reader.load_data(
    space_names=space_ids, num_messages=10
)  # in ascending order, only contains messages about math HW
index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query("What was discussed in this conversation?")
display(Markdown(f"{response}"))
Notice that the summary covers only the first 10 messages, which only involve help with the math homework. Below is an example of retrieving the last 16 messages, which only involve the essay. The "cost of a trip" refers to a reply in the SF trip thread that was made during the discussion of the essay.
docs = reader.load_data(
    space_names=space_ids, num_messages=16, order_asc=False
)  # in descending order, only contains messages about essay
index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query("What was discussed in this conversation?")
display(Markdown(f"{response}"))
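The interaction between num_messages and order_asc shown above can be pictured with ordinary list slicing. This is a sketch of the selection rule only, not the reader's implementation: `select_messages` is a hypothetical helper, and it does not model the caveat that the loaded count may differ from num_messages.

```python
def select_messages(messages, num_messages, order_asc=True):
    """Hypothetical helper: pick messages from a list sorted oldest to newest.

    Ascending order keeps the first num_messages items.
    Descending order keeps the last num_messages items, newest first.
    """
    if num_messages <= 0:
        return []
    if order_asc:
        return messages[:num_messages]
    return list(reversed(messages[-num_messages:]))


history = ["math 1", "math 2", "trip 1", "trip 2", "essay 1", "essay 2"]
first_two = select_messages(history, 2)
last_two = select_messages(history, 2, order_asc=False)
```

Here `first_two` holds the two oldest messages and `last_two` holds the two newest, newest first, which mirrors why the ascending query only saw the math homework and the descending query only saw the essay.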
You can also restrict messages to a time frame with the before and after parameters, which take datetime objects.
import datetime
date1 = datetime.datetime.fromisoformat(
    "2024-06-25 14:27:00-07:00"
)  # when they start talking about trip
docs = reader.load_data(space_names=space_ids, before=date1)
index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query(
    "What was discussed in this conversation?"
)  # should only be about math HW
display(Markdown(f"{response}"))
date2 = datetime.datetime.fromisoformat(
    "2024-06-25 14:51:00-07:00"
)  # when they start talking about essay
docs = reader.load_data(space_names=space_ids, after=date2)
index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query(
    "What was discussed in this conversation?"
)  # should only be about essay + cost of trip (in thread)
display(Markdown(f"{response}"))
docs = reader.load_data(space_names=space_ids, after=date1, before=date2)
index = SummaryIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query(
    "What was discussed in this conversation?"
)  # should only be about trip
display(Markdown(f"{response}"))
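The time-frame examples above rely on timezone-aware datetime objects, which compare correctly even when the offsets differ. The sketch below illustrates that comparison with the same two boundaries. `in_window` is a hypothetical helper, the sample message times are made up, and treating both boundaries as exclusive is an assumption: the reader may handle a message that falls exactly on a boundary differently.

```python
import datetime


def in_window(ts, after=None, before=None):
    """Hypothetical helper: True if ts lies strictly between the optional bounds."""
    if after is not None and ts <= after:
        return False
    if before is not None and ts >= before:
        return False
    return True


date1 = datetime.datetime.fromisoformat("2024-06-25 14:27:00-07:00")
date2 = datetime.datetime.fromisoformat("2024-06-25 14:51:00-07:00")

# A made-up message time during the trip discussion, in the same offset.
trip_message = datetime.datetime.fromisoformat("2024-06-25 14:35:00-07:00")

# The same instant expressed in UTC; aware datetimes compare by absolute time.
same_instant_utc = datetime.datetime(
    2024, 6, 25, 21, 35, tzinfo=datetime.timezone.utc
)

in_trip_window = in_window(trip_message, after=date1, before=date2)
utc_in_trip_window = in_window(same_instant_utc, after=date1, before=date2)
```

Both checks evaluate to True, so it does not matter which UTC offset you use for the boundaries as long as every datetime carries timezone information.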