
Text Embedding Inference

docs/examples/embeddings/text_embedding_inference.ipynb

[Original source on Colab](https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/embeddings/text_embedding_inference.ipynb)


This notebook demonstrates how to configure TextEmbeddingInference embeddings.

The first step is to deploy the embeddings server. For detailed instructions, see the official repository for Text Embeddings Inference, or the tei-gaudi repository if you are deploying on Habana Gaudi/Gaudi 2.
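As a sketch, a CPU deployment with Docker might look like the following. The image tag, port mapping, and cache directory are assumptions; consult the Text Embeddings Inference repository for the current release and for GPU- or Gaudi-specific images.

```shell
# Run the Text Embeddings Inference server with Docker (CPU image).
# The image tag and served model are illustrative, not prescriptive.
model=BAAI/bge-large-en-v1.5
volume=$PWD/data  # cache model weights here between restarts

docker run -p 8080:80 -v $volume:/data --pull always \
    ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 \
    --model-id $model
```

With this mapping, the server listens on `http://127.0.0.1:8080`, which is the address the client below expects by default.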

Once the server is deployed, the code below will connect to it and submit text for embedding inference.

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

```python
%pip install llama-index-embeddings-text-embeddings-inference
```

```python
!pip install llama-index
```
```python
from llama_index.embeddings.text_embeddings_inference import (
    TextEmbeddingsInference,
)

embed_model = TextEmbeddingsInference(
    model_name="BAAI/bge-large-en-v1.5",  # required for formatting inference text
    # base_url defaults to http://127.0.0.1:8080
    timeout=60,  # timeout in seconds
    embed_batch_size=10,  # batch size for embedding
)
```
```python
embeddings = embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
```
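A common use for the returned vectors is comparing texts by cosine similarity. The sketch below uses only the standard library and hard-coded stand-in vectors; in practice you would pass the lists returned by `embed_model.get_text_embedding(...)`.

```python
# Compare two embedding vectors by cosine similarity (pure stdlib sketch).
# The short vectors here are stand-ins for real embedding outputs.
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


v1 = [0.1, 0.2, 0.3]
v2 = [0.1, 0.2, 0.25]
print(round(cosine_similarity(v1, v2), 4))  # close to 1.0 for similar vectors
```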
```python
embeddings = await embed_model.aget_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
```
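The async API is useful for embedding several texts concurrently. The sketch below shows the `asyncio.gather` pattern; `fake_embed` is a stand-in for `embed_model.aget_text_embedding`, which requires a running Text Embeddings Inference server.

```python
# Issue several embedding requests concurrently with asyncio.gather.
# `fake_embed` stands in for embed_model.aget_text_embedding so the
# sketch runs without a live server.
import asyncio


async def fake_embed(text: str) -> list[float]:
    await asyncio.sleep(0)  # placeholder for the real network round-trip
    return [float(len(text)), 0.0, 0.0]


async def main() -> list[list[float]]:
    texts = ["Hello World!", "Goodbye World!", "Another document"]
    return await asyncio.gather(*(fake_embed(t) for t in texts))


results = asyncio.run(main())
print(len(results))  # prints 3
```

In a notebook, where an event loop is already running, you would `await asyncio.gather(...)` directly instead of calling `asyncio.run`.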