
Jina 8K Context Window Embeddings

docs/examples/embeddings/jina_embeddings.ipynb


Here we show you how to use jina-embeddings-v2, which supports an 8K context length and is on par with text-embedding-ada-002.

<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/embeddings/jina_embeddings.ipynb" target="_parent"></a>

```python
%pip install llama-index-embeddings-huggingface
%pip install llama-index-embeddings-huggingface-api
%pip install llama-index-embeddings-openai
```
```python
import nest_asyncio

nest_asyncio.apply()
```

Setup Embedding Model

```python
from llama_index.embeddings.huggingface import (
    HuggingFaceEmbedding,
)
from llama_index.embeddings.huggingface_api import (
    HuggingFaceInferenceAPIEmbedding,
)
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings
```
```python
# base model
# model_name = "jinaai/jina-embeddings-v2-base-en"
# small model
model_name = "jinaai/jina-embeddings-v2-small-en"
```
```python
# download model locally
# note: you need enough RAM + compute to run this
embed_model = HuggingFaceEmbedding(
    model_name=model_name, trust_remote_code=True
)


# or use the inference API on Hugging Face (though you might run into rate limit issues)
# embed_model = HuggingFaceInferenceAPIEmbedding(
#     model_name="jinaai/jina-embeddings-v2-base-en",
# )
```
```python
# we set chunk size to 1024 for now; you can obviously set it much higher
Settings.embed_model = embed_model
Settings.chunk_size = 1024
```
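
With an 8K-token window, the `chunk_size=1024` above is conservative; a larger chunk size means far fewer chunks per document. A back-of-the-envelope chunk count (the token numbers below are illustrative, not produced by Jina's actual tokenizer):

```python
# rough sketch: number of chunks needed to cover a document,
# comparing a typical 512-token embedding model with an 8K-token one
def num_chunks(total_tokens: int, window: int) -> int:
    # ceiling division: every partial chunk still costs a full chunk
    return -(-total_tokens // window)


doc_tokens = 16_000  # hypothetical document length in tokens
print(num_chunks(doc_tokens, 512))  # 32 chunks for a 512-token model
print(num_chunks(doc_tokens, 8192))  # 2 chunks for an 8K model
```

Fewer, larger chunks keep more surrounding context inside each embedding, at the cost of coarser-grained retrieval.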

Setup OpenAI ada embeddings as comparison

```python
embed_model_base = OpenAIEmbedding()
```

Setup Index to test this out

We'll use our standard Paul Graham example.

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
```
```python
reader = SimpleDirectoryReader("../data/paul_graham")
docs = reader.load_data()
```
```python
index_jina = VectorStoreIndex.from_documents(docs, embed_model=embed_model)
```
```python
index_base = VectorStoreIndex.from_documents(
    docs, embed_model=embed_model_base
)
```

View Results

Compare retrieved results from the Jina 8K embeddings vs. the OpenAI ada baseline.

```python
from llama_index.core.response.notebook_utils import display_source_node

retriever_jina = index_jina.as_retriever(similarity_top_k=1)
retriever_base = index_base.as_retriever(similarity_top_k=1)
```
```python
retrieved_nodes = retriever_jina.retrieve(
    "What did the author do in art school?"
)
```
```python
for n in retrieved_nodes:
    display_source_node(n, source_length=2000)
```
```python
retrieved_nodes = retriever_base.retrieve("What did the author do in school?")
```
```python
for n in retrieved_nodes:
    display_source_node(n, source_length=2000)
```