Supabase Vector Store

docs/examples/vector_stores/SupabaseVectorIndexDemo.ipynb

This notebook shows how to use Vecs, Supabase's Python client for pgvector, to perform vector searches in LlamaIndex.
See the Supabase documentation for instructions on hosting a database on Supabase.

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

python
%pip install llama-index-vector-stores-supabase
python
!pip install llama-index
python
import logging
import sys

# Uncomment to see debug logs
# logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
# logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index.core import SimpleDirectoryReader, Document, StorageContext
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.supabase import SupabaseVectorStore
import textwrap

Setup OpenAI

The first step is to configure the OpenAI API key. It is used to create embeddings for the documents loaded into the index.

python
import os

os.environ["OPENAI_API_KEY"] = "[your_openai_api_key]"
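
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch of an alternative, the helper below reuses a key already present in the environment and only prompts when it is missing. The name `ensure_api_key` is hypothetical and not part of LlamaIndex or the OpenAI client.

```python
import os
from getpass import getpass


def ensure_api_key(env_var="OPENAI_API_KEY", prompt=getpass):
    """Return the key stored in `env_var`, prompting once if it is missing.

    `prompt` is any callable that takes a message and returns a string;
    it defaults to getpass so the key is not echoed to the screen.
    """
    key = os.environ.get(env_var)
    if not key:
        key = prompt(f"Enter {env_var}: ")
        # Store the key so libraries that read the environment can find it.
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` once before building the index would have the same effect as the assignment above, without the key appearing in the notebook source.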

Download Data

python
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

Loading documents

Load the documents stored in ./data/paul_graham/ using SimpleDirectoryReader.

python
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
print(
    "Document ID:",
    documents[0].doc_id,
    "Document Hash:",
    documents[0].hash,
)

Create an index backed by Supabase's vector store.

This works with any Postgres provider that supports pgvector. If the collection does not exist, the vector store attempts to create it.

Note: if you are not using OpenAI's text-embedding-ada-002, you need to pass in the embedding dimension, e.g. vector_store = SupabaseVectorStore(..., dimension=...)

python
vector_store = SupabaseVectorStore(
    postgres_connection_string=(
        "postgresql://<user>:<password>@<host>:<port>/<db_name>"
    ),
    collection_name="base_demo",
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)
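
Two details in the cell above are easy to get wrong: a password containing characters such as `@` or `/` breaks the connection URL unless it is percent-encoded, and the `dimension` argument has to match the embedding model. The sketch below shows one way to assemble the constructor arguments. The helper names and the `SUPABASE_DB_*` setting names are hypothetical; the only arguments assumed on `SupabaseVectorStore` are the three that appear in this notebook (`postgres_connection_string`, `collection_name`, `dimension`).

```python
from urllib.parse import quote

# Output vector sizes of a few OpenAI embedding models.
EMBEDDING_DIMENSIONS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def build_connection_string(user, password, host, port, db_name):
    """Assemble a Postgres URL, percent-encoding the user and password."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{db_name}"
    )


def supabase_store_kwargs(collection_name, embed_model_name, settings):
    """Collect keyword arguments for SupabaseVectorStore.

    `settings` is any mapping (for example os.environ) that holds the
    hypothetical SUPABASE_DB_* keys used below.
    """
    return {
        "postgres_connection_string": build_connection_string(
            settings["SUPABASE_DB_USER"],
            settings["SUPABASE_DB_PASSWORD"],
            settings["SUPABASE_DB_HOST"],
            settings.get("SUPABASE_DB_PORT", "5432"),
            settings.get("SUPABASE_DB_NAME", "postgres"),
        ),
        "collection_name": collection_name,
        "dimension": EMBEDDING_DIMENSIONS[embed_model_name],
    }


# Example settings; in practice these would come from os.environ.
example_settings = {
    "SUPABASE_DB_USER": "postgres",
    "SUPABASE_DB_PASSWORD": "p@ss:w/rd",
    "SUPABASE_DB_HOST": "db.example.com",
}
example_kwargs = supabase_store_kwargs(
    "base_demo", "text-embedding-3-large", example_settings
)
print(example_kwargs)
```

The resulting dictionary could then be unpacked as `SupabaseVectorStore(**example_kwargs)`. The `dimension` value only tells the store how wide the vectors are; the embedding model itself still has to be configured separately in LlamaIndex.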

Query the index

We can now ask questions using our index.

python
query_engine = index.as_query_engine()
response = query_engine.query("Who is the author?")
python
print(textwrap.fill(str(response), 100))
python
response = query_engine.query("What did the author do growing up?")
python
print(textwrap.fill(str(response), 100))
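
Besides the answer text, it is often useful to see which chunks the answer was based on. The sketch below formats the retrieved sources of a response. It assumes the response exposes a `source_nodes` list whose items have a `score` and a `node` with `get_content()`, as LlamaIndex query responses do; the helper name `summarize_sources` is hypothetical. The stand-in objects at the bottom exist only so the snippet runs without a database.

```python
import textwrap
from types import SimpleNamespace


def summarize_sources(response, width=100, preview_chars=200):
    """Return a readable listing of the chunks behind a response."""
    lines = []
    for i, source in enumerate(response.source_nodes, start=1):
        # Some retrievers do not report a score.
        score = "n/a" if source.score is None else f"{source.score:.3f}"
        # Keep only the start of the chunk and flatten its line breaks.
        preview = source.node.get_content()[:preview_chars].replace("\n", " ")
        lines.append(f"[{i}] score={score}")
        lines.append(textwrap.fill(preview, width))
    return "\n".join(lines)


# Stand-in response with the same attribute shape, for illustration only.
fake_response = SimpleNamespace(
    source_nodes=[
        SimpleNamespace(
            score=0.8123,
            node=SimpleNamespace(get_content=lambda: "hello\nworld"),
        ),
        SimpleNamespace(
            score=None,
            node=SimpleNamespace(get_content=lambda: "x"),
        ),
    ]
)
demo_output = summarize_sources(fake_response)
print(demo_output)
```

With a real index, `print(summarize_sources(response))` after either query above would list the essay chunks that were retrieved for that question.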

Using metadata filters

python
from llama_index.core.schema import TextNode

nodes = [
    TextNode(
        text="The Shawshank Redemption",
        metadata={
            "author": "Stephen King",
            "theme": "Friendship",
        },
    ),
    TextNode(
        text="The Godfather",
        metadata={
            "director": "Francis Ford Coppola",
            "theme": "Mafia",
        },
    ),
    TextNode(
        text="Inception",
        metadata={
            "director": "Christopher Nolan",
        },
    ),
]
python
vector_store = SupabaseVectorStore(
    postgres_connection_string=(
        "postgresql://<user>:<password>@<host>:<port>/<db_name>"
    ),
    collection_name="metadata_filters_demo",
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex(nodes, storage_context=storage_context)

Define metadata filters

python
from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters

filters = MetadataFilters(
    filters=[ExactMatchFilter(key="theme", value="Mafia")]
)
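
To make the expected effect of this filter concrete, the sketch below emulates exact-match filtering in plain Python over the metadata of the three nodes defined earlier. It only illustrates the semantics (a node passes when the key is present with exactly the given value); it is not how the vector store evaluates filters, and `passes_exact_match` is a hypothetical helper.

```python
# Metadata of the three nodes defined earlier, keyed by node text.
node_metadata = {
    "The Shawshank Redemption": {
        "author": "Stephen King",
        "theme": "Friendship",
    },
    "The Godfather": {
        "director": "Francis Ford Coppola",
        "theme": "Mafia",
    },
    "Inception": {"director": "Christopher Nolan"},
}


def passes_exact_match(metadata, required):
    """True if every required key is present with exactly the required value."""
    return all(metadata.get(key) == value for key, value in required.items())


matching = [
    title
    for title, metadata in node_metadata.items()
    if passes_exact_match(metadata, {"theme": "Mafia"})
]
print(matching)  # ['The Godfather']
```

"Inception" has no `theme` key at all, so it is excluded even when the query text mentions it; only "The Godfather" remains as a candidate.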

Retrieve from vector store with filters

python
retriever = index.as_retriever(filters=filters)
retriever.retrieve("What is inception about?")