docs/examples/embeddings/nomic.ipynb
<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/embeddings/nomic.ipynb" target="_parent"></a>
Nomic has released v1.5 🪆🪆🪆 is capable of variable sized embeddings with matryoshka learning and an 8192 context, embedding dimensions between 64 and 768.
In this notebook, we will explore using Nomic v1.5 embedding at different dimensions.
%pip install -U llama-index llama-index-embeddings-nomic
nomic_api_key = "<NOMIC API KEY>"
import nest_asyncio
nest_asyncio.apply()
from llama_index.embeddings.nomic import NomicEmbedding
embed_model = NomicEmbedding(
api_key=nomic_api_key,
dimensionality=128,
model_name="nomic-embed-text-v1.5",
)
embedding = embed_model.get_text_embedding("Nomic Embeddings")
print(len(embedding))
embedding[:5]
embed_model = NomicEmbedding(
api_key=nomic_api_key,
dimensionality=256,
model_name="nomic-embed-text-v1.5",
)
embedding = embed_model.get_text_embedding("Nomic Embeddings")
print(len(embedding))
embedding[:5]
embed_model = NomicEmbedding(
api_key=nomic_api_key,
dimensionality=768,
model_name="nomic-embed-text-v1.5",
)
embedding = embed_model.get_text_embedding("Nomic Embeddings")
print(len(embedding))
embedding[:5]
It has 768 fixed embedding dimensions
embed_model = NomicEmbedding(
api_key=nomic_api_key, model_name="nomic-embed-text-v1"
)
embedding = embed_model.get_text_embedding("Nomic Embeddings")
print(len(embedding))
embedding[:5]
We will use OpenAI for Generation step.
from llama_index.core import settings
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI
import os
os.environ["OPENAI_API_KEY"] = "<YOUR OPENAI API KEY>"
embed_model = NomicEmbedding(
api_key=nomic_api_key,
dimensionality=128,
model_name="nomic-embed-text-v1.5",
)
llm = OpenAI(model="gpt-3.5-turbo")
settings.llm = llm
settings.embed_model = embed_model
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
documents = SimpleDirectoryReader("./data/paul_graham").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("what did author do growing up?")
print(response)