docs/examples/vector_stores/ChromaFireworksNomic.ipynb
<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/vector_stores/ChromaIndexDemo.ipynb" target="_parent"></a>
This example is adapted from the ChromaIndexDemo example and shows how to use the Matryoshka embeddings from Nomic on top of Fireworks.ai.
<a href="https://discord.gg/MMeYNTmh3x" target="_blank">Chroma is a AI-native open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0.
</a> <a href="https://github.com/chroma-core/chroma/blob/master/LICENSE" target="_blank">
</a>
Chroma is fully-typed, fully-tested and fully-documented.
Install Chroma with:
pip install chromadb
Chroma runs in various modes. See below for examples of each integrated with LlamaIndex.
- in-memory - in a python script or jupyter notebook
- in-memory with persistence - in a script or notebook and save/load to disk
- in a docker container - as a server running on your local machine or in the cloud

Like any other database, you can:

- .add
- .get
- .update
- .upsert
- .delete
- .peek
- .query runs the similarity search.

View full docs at docs.
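As a minimal sketch of that raw client API (standalone chromadb, outside of LlamaIndex; the collection name and texts are illustrative):
import chromadb
# in-memory client; chromadb.PersistentClient(path=...) is the on-disk variant
client = chromadb.EphemeralClient()
collection = client.create_collection("api_demo")
# .add stores documents (Chroma embeds them with its default embedder here)
collection.add(
    ids=["doc1", "doc2"],
    documents=["Chroma is a vector database.", "Fireworks serves open models."],
)
# .query runs the similarity search over the stored embeddings
results = collection.query(query_texts=["what is chroma?"], n_results=1)
print(results["documents"])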
Nomic published a new embedding model nomic-ai/nomic-embed-text-v1.5 that is capable of returning variable embedding size depending on how cost sensitive you are. For more information please check out their model here and their website here
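Matryoshka embeddings are trained so that a prefix of the full vector is itself a usable embedding: shrinking is just truncating to the first d dimensions and re-normalizing. A minimal numpy sketch of that idea (the vector is a random stand-in, not real model output):
import numpy as np
full = np.random.rand(768)  # stand-in for a full-size nomic-embed-text-v1.5 vector
full /= np.linalg.norm(full)
# keep only the first 128 dimensions, then re-normalize for cosine similarity
small = full[:128]
small /= np.linalg.norm(small)
print(small.shape)  # (128,)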
Fireworks is the leading OSS model inference provider. In this example we will use Fireworks to run the Nomic embedding model as well as the mixtral-8x7b-instruct model as the query engine. For more information about Fireworks, please check out their website here.
In this basic example, we take the Paul Graham essay, split it into chunks, embed it using an open-source embedding model, load it into Chroma, and then query it.
If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
%pip install -q llama-index-vector-stores-chroma llama-index-llms-fireworks llama-index-embeddings-fireworks==0.1.2
%pip install -q llama-index
!pip install -q chromadb
!pip install -q pydantic==1.10.11
# import
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext
from llama_index.embeddings.fireworks import FireworksEmbedding
from llama_index.llms.fireworks import Fireworks
from IPython.display import Markdown, display
import chromadb
# set up Fireworks.ai Key
import getpass
fw_api_key = getpass.getpass("Fireworks API Key:")
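Alternatively, the LlamaIndex Fireworks integrations can pick the key up from the FIREWORKS_API_KEY environment variable, so you can skip passing api_key to each constructor (a small sketch; setting it in-process like this only affects the current kernel):
import os
# make the key visible to Fireworks(...) and FireworksEmbedding(...) below
os.environ["FIREWORKS_API_KEY"] = fw_api_key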
Download Data
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
llm = Fireworks(
    temperature=0,
    model="accounts/fireworks/models/mixtral-8x7b-instruct",
    api_key=fw_api_key,
)
# create client and a new collection
chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.create_collection("quickstart")
# define embedding function (nomic v1.5 served on Fireworks)
embed_model = FireworksEmbedding(
    model_name="nomic-ai/nomic-embed-text-v1.5",
    api_key=fw_api_key,
)
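As an optional sanity check, you can embed a short string and inspect the vector length; without a dimensions override, nomic-embed-text-v1.5 returns its full 768-dimensional embedding:
# get_text_embedding is the standard llama-index embedding call
vec = embed_model.get_text_embedding("hello world")
print(len(vec))  # 768 without a dimensions override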
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
# set up ChromaVectorStore and load in data
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
documents, storage_context=storage_context, embed_model=embed_model
)
# Query Data
query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("What did the author do growing up?")
display(Markdown(f"<b>{response}</b>"))
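If you want to see which chunks grounded the answer, the response object also carries the retrieved source nodes:
# each source node is a retrieved chunk plus its similarity score
for source in response.source_nodes:
    print(source.score, source.node.get_content()[:100])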
Extending the previous example, if you want to save to disk, simply initialize the Chroma client with the directory where you want the data to be saved.
Caution: Chroma makes a best effort to automatically save data to disk; however, multiple in-memory clients can stomp each other's work. As a best practice, only have one client per path running at any given time.
We are also going to resize the embeddings down to 128 dimensions, which is helpful when you are cost-conscious about vector storage on the database side.
# save to disk
db = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = db.get_or_create_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
embed_model = FireworksEmbedding(
    model_name="nomic-ai/nomic-embed-text-v1.5",
    api_base="https://api.fireworks.ai/inference/v1",
    api_key=fw_api_key,
    dimensions=128,
)
index = VectorStoreIndex.from_documents(
documents, storage_context=storage_context, embed_model=embed_model
)
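To confirm the data landed on disk, you can check the collection size directly with Chroma's count() (one record per indexed chunk):
# number of vectors persisted in ./chroma_db under the "quickstart" collection
print(chroma_collection.count())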
# load from disk
db2 = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = db2.get_or_create_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
index = VectorStoreIndex.from_vector_store(
vector_store,
embed_model=embed_model,
)
# Query Data from the persisted index
query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("What did the author do growing up?")
display(Markdown(f"<b>{response}</b>"))
You can see that the results are the same across the two indexes, so you can experiment with different dimension sizes and evaluate the cost and quality trade-off for yourself.
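One way to explore that trade-off is to re-embed the same text at several Matryoshka sizes and compare what comes back; the dimension list below is illustrative and reuses the FireworksEmbedding setup from above:
# illustrative sweep over Matryoshka sizes; smaller vectors are cheaper to store and query
for dim in (768, 256, 128):
    em = FireworksEmbedding(
        model_name="nomic-ai/nomic-embed-text-v1.5",
        api_key=fw_api_key,
        dimensions=dim,
    )
    vec = em.get_text_embedding("What did the author do growing up?")
    print(dim, len(vec))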