docs/examples/embeddings/jina_embeddings.ipynb
Here we show you how to use jina-embeddings-v2, which supports an 8k context length and performs on par with OpenAI's text-embedding-ada-002.
<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/embeddings/jina_embeddings.ipynb" target="_parent">Open In Colab</a>
%pip install llama-index-embeddings-huggingface
%pip install llama-index-embeddings-huggingface-api
%pip install llama-index-embeddings-openai
import nest_asyncio
nest_asyncio.apply()
from llama_index.embeddings.huggingface import (
HuggingFaceEmbedding,
)
from llama_index.embeddings.huggingface_api import (
HuggingFaceInferenceAPIEmbedding,
)
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings
# base model
# model_name = "jinaai/jina-embeddings-v2-base-en"
# small model
model_name = "jinaai/jina-embeddings-v2-small-en"
# download model locally
# note: you need enough RAM+compute to run this
embed_model = HuggingFaceEmbedding(
model_name=model_name, trust_remote_code=True
)
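As a quick sanity check, you can embed a short string and inspect the vector directly. This is a minimal sketch; the small model should produce 512-dimensional embeddings (the base model produces 768-dimensional ones).
# sanity check: embed a short string and inspect the dimensionality
# (jina-embeddings-v2-small-en should yield 512-dimensional vectors)
sample_embedding = embed_model.get_text_embedding("Hello, world!")
print(len(sample_embedding))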
# use inference API on Hugging Face (though you might run into rate limit issues)
# embed_model = HuggingFaceInferenceAPIEmbedding(
# model_name="jinaai/jina-embeddings-v2-base-en",
# )
# we set the chunk size to 1024 for now; since the model supports an 8k context, you can set it much higher
Settings.embed_model = embed_model
Settings.chunk_size = 1024
embed_model_base = OpenAIEmbedding()
We'll use our standard Paul Graham example.
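If the essay isn't on disk yet, you can fetch it first. The raw GitHub URL below is an assumption based on other LlamaIndex examples and may have moved; adjust the path as needed.
# optional: fetch the essay if it isn't already on disk
# (URL assumed from other LlamaIndex examples; adjust if the repo layout has changed)
!mkdir -p ../data/paul_graham
!wget "https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt" -O "../data/paul_graham/paul_graham_essay.txt"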
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
reader = SimpleDirectoryReader("../data/paul_graham")
docs = reader.load_data()
index_jina = VectorStoreIndex.from_documents(docs, embed_model=embed_model)
index_base = VectorStoreIndex.from_documents(
docs, embed_model=embed_model_base
)
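Building both indexes re-embeds every chunk. If you want to skip that on subsequent runs, you can persist an index to disk and reload it later. This is a sketch; `./storage_jina` is an arbitrary directory name.
from llama_index.core import StorageContext, load_index_from_storage

# optional: persist the Jina index so embeddings aren't recomputed on each run
index_jina.storage_context.persist(persist_dir="./storage_jina")

# reload it later without re-embedding
storage_context = StorageContext.from_defaults(persist_dir="./storage_jina")
index_jina = load_index_from_storage(storage_context, embed_model=embed_model)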
Look at the retrieved results with Jina-8k vs. OpenAI's text-embedding-ada-002.
from llama_index.core.response.notebook_utils import display_source_node
retriever_jina = index_jina.as_retriever(similarity_top_k=1)
retriever_base = index_base.as_retriever(similarity_top_k=1)
retrieved_nodes = retriever_jina.retrieve(
"What did the author do in art school?"
)
for n in retrieved_nodes:
display_source_node(n, source_length=2000)
retrieved_nodes = retriever_base.retrieve(
    "What did the author do in art school?"
)
for n in retrieved_nodes:
display_source_node(n, source_length=2000)
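To compare end-to-end answers rather than just retrieved chunks, you can wrap each index in a query engine. This is a minimal sketch and assumes an LLM is configured (e.g. `OPENAI_API_KEY` is set, so the default OpenAI LLM is used for synthesis).
# compare synthesized answers, not just retrieved chunks
query_engine_jina = index_jina.as_query_engine(similarity_top_k=1)
query_engine_base = index_base.as_query_engine(similarity_top_k=1)

response_jina = query_engine_jina.query("What did the author do in art school?")
response_base = query_engine_base.query("What did the author do in art school?")
print(response_jina)
print(response_base)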