Google GenAI Embeddings

docs/examples/embeddings/google_genai.ipynb

Built on Google's google-genai package, LlamaIndex provides a GoogleGenAIEmbedding class that lets you embed text with Google's GenAI embedding models through either the Gemini API or the Vertex AI API. This notebook uses the gemini-embedding-2-preview model.

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

```python
%pip install llama-index-embeddings-google-genai
```

```python
import os

os.environ["GOOGLE_API_KEY"] = "..."
```

Setup

GoogleGenAIEmbedding is a wrapper around the google-genai package, which means it supports both Gemini and Vertex AI APIs out of the box.

You can pass in the api_key directly, or pass in a vertexai_config to use the Vertex AI API.

Other options include embed_batch_size, model_name, and embedding_config.

```python
from llama_index.embeddings.google_genai import GoogleGenAIEmbedding
from google.genai.types import EmbedContentConfig

embed_model = GoogleGenAIEmbedding(
    model_name="gemini-embedding-2-preview",
    embed_batch_size=100,
    # can pass in the API key directly
    # api_key="...",
    # or pass in a vertexai_config to use the Vertex AI API
    # vertexai_config={
    #     "project": "...",
    #     "location": "...",
    # },
    # can also pass in an embedding_config
    # embedding_config=EmbedContentConfig(...),
)
```
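As one possible configuration sketch (the parameter values here are illustrative, not prescriptive), `embedding_config` accepts an `EmbedContentConfig`, which exposes options such as the task type and output dimensionality:

```python
from google.genai.types import EmbedContentConfig
from llama_index.embeddings.google_genai import GoogleGenAIEmbedding

# Illustrative values: task_type and output_dimensionality are fields on
# google-genai's EmbedContentConfig; check the google-genai docs for the
# values your chosen model supports.
embed_model = GoogleGenAIEmbedding(
    model_name="gemini-embedding-2-preview",
    embedding_config=EmbedContentConfig(
        task_type="RETRIEVAL_DOCUMENT",
        output_dimensionality=768,
    ),
)
```

Setting a task type can improve retrieval quality, since the model is told whether it is embedding a document or a query.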

Usage

Sync

```python
embeddings = embed_model.get_text_embedding("Google Gemini Embeddings.")
print(embeddings[:5])
print(f"Dimension of embeddings: {len(embeddings)}")
```

```python
embeddings = embed_model.get_query_embedding("Query Google Gemini Embeddings.")
print(embeddings[:5])
print(f"Dimension of embeddings: {len(embeddings)}")
```

```python
embeddings = embed_model.get_text_embedding_batch(
    [
        "Google Gemini Embeddings.",
        "Google is awesome.",
        "Llamaindex is awesome.",
    ]
)
print(f"Got {len(embeddings)} embeddings")
print(f"Dimension of embeddings: {len(embeddings[0])}")
```
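Query and text embeddings live in the same vector space, so retrieval reduces to a similarity comparison between them. A minimal sketch of cosine similarity follows, using small hand-written vectors in place of real model output (calling the model requires an API key):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Stand-ins for embed_model.get_query_embedding(...) and
# embed_model.get_text_embedding_batch(...) output.
query = [0.1, 0.9, 0.2]
docs = [[0.1, 0.8, 0.3], [0.9, 0.1, 0.0]]

scores = [cosine_similarity(query, d) for d in docs]
best = max(range(len(docs)), key=lambda i: scores[i])
print(f"Best match: doc {best} (score {scores[best]:.3f})")
```

In practice you would not compute this by hand; a vector store or LlamaIndex retriever performs the same comparison over stored embeddings.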

Async

```python
embeddings = await embed_model.aget_text_embedding("Google Gemini Embeddings.")
print(embeddings[:5])
print(f"Dimension of embeddings: {len(embeddings)}")
```

```python
embeddings = await embed_model.aget_query_embedding(
    "Query Google Gemini Embeddings."
)
print(embeddings[:5])
print(f"Dimension of embeddings: {len(embeddings)}")
```

```python
embeddings = await embed_model.aget_text_embedding_batch(
    [
        "Google Gemini Embeddings.",
        "Google is awesome.",
        "Llamaindex is awesome.",
    ]
)
print(f"Got {len(embeddings)} embeddings")
print(f"Dimension of embeddings: {len(embeddings[0])}")
```