
Jina 8K Context Window Embeddings

docs/examples/embeddings/jina_embeddings.ipynb


Here we show you how to use jina-embeddings-v2, which supports an 8K context length and is on par with text-embedding-ada-002.

<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/embeddings/jina_embeddings.ipynb" target="_parent"></a>

```python
%pip install llama-index-embeddings-huggingface
%pip install llama-index-embeddings-huggingface-api
%pip install llama-index-embeddings-openai
```
```python
import nest_asyncio

nest_asyncio.apply()
```

Setup Embedding Model

```python
from llama_index.embeddings.huggingface import (
    HuggingFaceEmbedding,
)
from llama_index.embeddings.huggingface_api import (
    HuggingFaceInferenceAPIEmbedding,
)
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings
```
```python
# base model
# model_name = "jinaai/jina-embeddings-v2-base-en"
# small model
model_name = "jinaai/jina-embeddings-v2-small-en"
```
```python
# download model locally
# note: you need enough RAM + compute to run this
embed_model = HuggingFaceEmbedding(
    model_name=model_name, trust_remote_code=True
)


# or use the inference API on Hugging Face (though you might run into rate limit issues)
# embed_model = HuggingFaceInferenceAPIEmbedding(
#     model_name="jinaai/jina-embeddings-v2-base-en",
# )
```
```python
# we set chunk size to 1024 for now; you can obviously set it much higher
Settings.embed_model = embed_model
Settings.chunk_size = 1024
```
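
With an 8K-token window, the `chunk_size=1024` above is conservative; a larger chunk size means far fewer chunks per document. A back-of-the-envelope chunk count (the token numbers below are illustrative, not produced by Jina's actual tokenizer):

```python
# rough sketch: number of chunks needed to cover a document,
# comparing a typical 512-token embedding model with an 8K-token one
def num_chunks(total_tokens: int, window: int) -> int:
    # ceiling division: every partial chunk still costs a full chunk
    return -(-total_tokens // window)


doc_tokens = 16_000  # hypothetical document length in tokens
print(num_chunks(doc_tokens, 512))  # 32 chunks for a 512-token model
print(num_chunks(doc_tokens, 8192))  # 2 chunks for an 8K model
```

Fewer, larger chunks keep more surrounding context inside each embedding, at the cost of coarser-grained retrieval.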

Setup OpenAI ada embeddings as comparison

```python
embed_model_base = OpenAIEmbedding()
```

Setup Index to test this out

We'll use our standard Paul Graham example.

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
```
```python
reader = SimpleDirectoryReader("../data/paul_graham")
docs = reader.load_data()
```
```python
index_jina = VectorStoreIndex.from_documents(docs, embed_model=embed_model)
```
```python
index_base = VectorStoreIndex.from_documents(
    docs, embed_model=embed_model_base
)
```

View Results

Compare retrieved results from the Jina 8K embeddings vs. the OpenAI ada baseline.

```python
from llama_index.core.response.notebook_utils import display_source_node

retriever_jina = index_jina.as_retriever(similarity_top_k=1)
retriever_base = index_base.as_retriever(similarity_top_k=1)
```
```python
retrieved_nodes = retriever_jina.retrieve(
    "What did the author do in art school?"
)
```
```python
for n in retrieved_nodes:
    display_source_node(n, source_length=2000)
```
```python
retrieved_nodes = retriever_base.retrieve("What did the author do in school?")
```
```python
for n in retrieved_nodes:
    display_source_node(n, source_length=2000)
```