Google AI Gemini Embeddings

https://ai.google.dev/gemini-api/docs/embeddings

Maven Dependency

xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-google-ai-gemini</artifactId>
    <version>1.11.7</version>
</dependency>

API Key

Get a free API key at https://ai.google.dev/gemini-api/docs/api-key.

Models available

Check the list of available models in the documentation.

  • gemini-embedding-001
    • Input token limit: 2,048
    • Output dimension size: flexible, from 128 to 3072 (recommended: 768, 1536, or 3072)

GoogleAiEmbeddingModel

The GoogleAiEmbeddingModel allows you to generate embeddings from text using Google AI Gemini's embedding models.

Basic Usage

java
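import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.googleai.GoogleAiEmbeddingModel;
import dev.langchain4j.model.output.Response;

// Build the model; the API key is read from the GEMINI_AI_KEY environment variable
EmbeddingModel embeddingModel = GoogleAiEmbeddingModel.builder()
    .apiKey(System.getenv("GEMINI_AI_KEY"))
    .modelName("gemini-embedding-001")
    .build();

// Embed a single string and unwrap the resulting vector
Response<Embedding> response = embeddingModel.embed("Hello, world!");
Embedding embedding = response.content();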

Embedding Multiple Texts

java
List<TextSegment> segments = List.of(
    TextSegment.from("First document"),
    TextSegment.from("Second document"),
    TextSegment.from("Third document")
);

Response<List<Embedding>> response = embeddingModel.embedAll(segments);
List<Embedding> embeddings = response.content();

Configuring the Embedding Model

java
EmbeddingModel embeddingModel = GoogleAiEmbeddingModel.builder()
    .apiKey(System.getenv("GEMINI_AI_KEY"))
    .modelName("gemini-embedding-001")
    .taskType(GoogleAiEmbeddingModel.TaskType.RETRIEVAL_DOCUMENT)
    .outputDimensionality(768)
    .titleMetadataKey("title")
    .maxRetries(3)
    .timeout(Duration.ofSeconds(30))
    .logRequestsAndResponses(true)
    .build();

Task Types

The taskType parameter optimizes the embedding for specific use cases:

  • RETRIEVAL_QUERY: For search queries
  • RETRIEVAL_DOCUMENT: For documents to be retrieved (default for document indexing)
  • SEMANTIC_SIMILARITY: For measuring text similarity
  • CLASSIFICATION: For text classification tasks
  • CLUSTERING: For grouping similar texts
  • QUESTION_ANSWERING: For Q&A systems
  • FACT_VERIFICATION: For fact-checking applications
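
Task types matter most in retrieval, where queries embedded with RETRIEVAL_QUERY are compared against documents embedded with RETRIEVAL_DOCUMENT. The comparison step itself can be sketched as a cosine-similarity ranking over plain float[] vectors (such as those returned by Embedding.vector()). The class and method names below are illustrative and not part of LangChain4j:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SimilarityRanking {

    // Cosine similarity of two equal-length vectors; returns 0 if either has zero magnitude.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns document indices ordered from most to least similar to the query.
    static List<Integer> rank(float[] query, List<float[]> documents) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < documents.size(); i++) order.add(i);
        order.sort(Comparator.comparingDouble((Integer i) -> -cosine(query, documents.get(i))));
        return order;
    }

    public static void main(String[] args) {
        float[] query = {1f, 0f};
        List<float[]> docs = List.of(
            new float[] {0f, 1f}, new float[] {1f, 0.1f}, new float[] {-1f, 0f});
        System.out.println(rank(query, docs));
    }
}
```

In a real application the query vector would come from a model configured with RETRIEVAL_QUERY and the document vectors from one configured with RETRIEVAL_DOCUMENT.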

Using Metadata for Document Titles

When using TaskType.RETRIEVAL_DOCUMENT, you can provide document titles via metadata:

java
EmbeddingModel embeddingModel = GoogleAiEmbeddingModel.builder()
    .apiKey(System.getenv("GEMINI_AI_KEY"))
    .modelName("gemini-embedding-001")
    .taskType(GoogleAiEmbeddingModel.TaskType.RETRIEVAL_DOCUMENT)
    .titleMetadataKey("title") // defaults to "title"
    .build();

TextSegment segment = TextSegment.from(
    "This is the document content",
    Metadata.from("title", "My Document Title")
);

Response<Embedding> response = embeddingModel.embed(segment);

Output Dimensionality

You can specify the output dimensionality to reduce the embedding size:

java
EmbeddingModel embeddingModel = GoogleAiEmbeddingModel.builder()
    .apiKey(System.getenv("GEMINI_AI_KEY"))
    .modelName("gemini-embedding-001")
    .outputDimensionality(256) // Reduce from the model's default of 3072 dimensions
    .build();
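
Google's embeddings documentation states that only the full 3072-dimension output of gemini-embedding-001 is normalized, and that smaller outputs such as 768 or 1536 should be normalized before computing similarity. A minimal sketch of that L2 normalization on a plain float[] (for example, the array returned by Embedding.vector()) follows. The helper class is illustrative and not part of LangChain4j:

```java
final class EmbeddingNormalizer {

    // Returns a copy of the vector scaled to unit L2 length; a zero vector is returned unchanged.
    static float[] l2Normalize(float[] vector) {
        double sumOfSquares = 0;
        for (float v : vector) {
            sumOfSquares += (double) v * v;
        }
        double norm = Math.sqrt(sumOfSquares);
        float[] result = vector.clone();
        if (norm == 0) return result;
        for (int i = 0; i < result.length; i++) {
            result[i] = (float) (result[i] / norm);
        }
        return result;
    }

    public static void main(String[] args) {
        float[] normalized = l2Normalize(new float[] {3f, 4f});
        System.out.println(normalized[0] + ", " + normalized[1]);
    }
}
```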

Batch Processing

When embedding multiple segments, the model automatically splits them into batches of at most 100 segments per request.

Note: This is not the discounted batch API; it is a convenience for embedding multiple segments in a single call.
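
The batching idea can be sketched independently of the library: split the input into consecutive chunks of at most 100 items and send one request per chunk. The code below is a generic illustration with hypothetical names, not LangChain4j's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;

final class BatchPartitioner {

    // Splits items into consecutive sublists of at most maxBatchSize elements each.
    static <T> List<List<T>> partition(List<T> items, int maxBatchSize) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxBatchSize) {
            int end = Math.min(start + maxBatchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < 250; i++) texts.add("Document " + i);
        for (List<String> batch : partition(texts, 100)) {
            System.out.println("Batch of " + batch.size());
        }
    }
}
```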

Batch Embedding Processing

The GoogleAiGeminiBatchEmbeddingModel provides an interface for processing large volumes of embedding requests asynchronously at a reduced cost (50% of standard pricing). It is ideal for non-urgent, large-scale embedding tasks with a 24-hour turnaround SLO.

Creating Batch Embedding Jobs

Inline batch creation:

java
GoogleAiGeminiBatchEmbeddingModel batchModel = GoogleAiGeminiBatchEmbeddingModel.builder()
    .apiKey(System.getenv("GEMINI_AI_KEY"))
    .modelName("gemini-embedding-001")
    .taskType(GoogleAiEmbeddingModel.TaskType.RETRIEVAL_DOCUMENT)
    .outputDimensionality(768)
    .build();

// Create batch of text segments
List<TextSegment> segments = List.of(
    TextSegment.from("First document to embed"),
    TextSegment.from("Second document to embed"),
    TextSegment.from("Third document to embed")
);

// Submit the batch
BatchResponse<Embedding> response = batchModel.createBatchInline(
    "Document Embeddings Batch",  // display name
    0L,                            // priority (optional, defaults to 0)
    segments
);

File-based batch creation:

For larger batches, you can create a batch from an uploaded file:

java
// First, upload a file with batch requests
GeminiFiles filesApi = GeminiFiles.builder()
    .apiKey(System.getenv("GEMINI_AI_KEY"))
    .build();

GeminiFile uploadedFile = filesApi.uploadFile(
    Paths.get("batch_embeddings.jsonl"),
    "Batch Embedding Requests"
);

// Wait for file to be active
while (uploadedFile.isProcessing()) {
    Thread.sleep(1000);
    uploadedFile = filesApi.getMetadata(uploadedFile.name());
}

// Create batch from file
BatchResponse<Embedding> response = batchModel.createBatchFromFile(
    "My Embedding Batch Job",
    uploadedFile
);

Handling Batch Responses

The BatchResponse is a sealed interface with three possible states:

java
BatchResponse<Embedding> response = batchModel.createBatchInline("My Batch", null, segments);

switch (response) {
    case BatchIncomplete incomplete -> {
        System.out.println("Batch is " + incomplete.state());
        System.out.println("Batch name: " + incomplete.batchName().value());
    }
    case BatchSuccess success -> {
        System.out.println("Batch completed successfully!");
        for (Embedding embedding : success.responses()) {
            System.out.println("Embedding dimensions: " + embedding.dimension());
        }
    }
    case BatchError error -> {
        System.err.println("Batch failed: " + error.message());
        System.err.println("Error code: " + error.code());
        System.err.println("State: " + error.state());
    }
}

Polling for Results

Since batch processing is asynchronous, you need to poll for results:

java
BatchResponse<Embedding> initialResponse = batchModel.createBatchInline(
    "My Batch",
    null,
    segments
);

// Extract the batch name for polling
BatchName batchName = switch (initialResponse) {
    case BatchIncomplete incomplete -> incomplete.batchName();
    case BatchSuccess success -> success.batchName();
    case BatchError error -> throw new RuntimeException("Batch creation failed: " + error.message());
};

// Poll until completion
BatchResponse<Embedding> result;
do {
    Thread.sleep(5000); // Wait 5 seconds between polls
    result = batchModel.retrieveBatchResults(batchName);
} while (result instanceof BatchIncomplete);

// Process final result
if (result instanceof BatchSuccess success) {
    List<Embedding> embeddings = success.responses();
    System.out.println("Generated " + embeddings.size() + " embeddings");
} else if (result instanceof BatchError error) {
    System.err.println("Batch failed: " + error.message());
}
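
A fixed 5-second interval is simple, but for jobs that may run for hours a growing delay reduces the number of requests. A minimal sketch of capped exponential backoff follows, written against generic Supplier and Predicate arguments so that it does not depend on any LangChain4j type. The class, method, and parameter names are hypothetical:

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

final class BackoffPoller {

    // Delay before retrying after the given attempt (0-based): initial * 2^attempt, capped at maxMillis.
    static long delayMillis(int attempt, long initialMillis, long maxMillis) {
        long delay = initialMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    // Calls fetch until isDone accepts the result or maxAttempts is reached.
    static <T> T poll(Supplier<T> fetch, Predicate<T> isDone, int maxAttempts,
                      long initialMillis, long maxMillis) throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            T result = fetch.get();
            if (isDone.test(result)) return result;
            Thread.sleep(delayMillis(attempt, initialMillis, maxMillis));
        }
        throw new IllegalStateException("Not finished after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) throws InterruptedException {
        int[] calls = {0};
        // Simulated job that reports "DONE" on the third check.
        String state = poll(() -> ++calls[0] < 3 ? "RUNNING" : "DONE", "DONE"::equals, 10, 10, 100);
        System.out.println(state + " after " + calls[0] + " calls");
    }
}
```

With the batch model, fetch would wrap retrieveBatchResults(batchName) and isDone would check that the result is no longer a BatchIncomplete.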

Managing Batch Jobs

Cancel a batch job:

java
// batchName is the BatchName obtained from createBatchInline or createBatchFromFile

try {
    batchModel.cancelBatchJob(batchName);
    System.out.println("Batch cancelled successfully");
} catch (HttpException e) {
    System.err.println("Failed to cancel batch: " + e.getMessage());
}

Delete a batch job:

java
batchModel.deleteBatchJob(batchName);
System.out.println("Batch deleted successfully");

List batch jobs:

java
// List first page of batch jobs
BatchList<Embedding> batchList = batchModel.listBatchJobs(10, null);

for (BatchResponse<Embedding> batch : batchList.batches()) {
    System.out.println("Batch: " + batch);
}

// Get next page if available
if (batchList.nextPageToken() != null) {
    BatchList<Embedding> nextPage = batchModel.listBatchJobs(10, batchList.nextPageToken());
}
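
To walk every page rather than only the first two, the token-based pagination pattern can be written as a loop. The sketch below is generic: Page and the fetchPage function are hypothetical stand-ins for BatchList and listBatchJobs, so the code runs without the library:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class PageWalker {

    // Hypothetical page type: a list of items plus the token of the next page (null on the last page).
    static final class Page<T> {
        final List<T> items;
        final String nextPageToken;

        Page(List<T> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    // Follows next-page tokens, starting with a null token, until a page has none.
    static <T> List<T> collectAll(Function<String, Page<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        String token = null;
        do {
            Page<T> page = fetchPage.apply(token);
            all.addAll(page.items);
            token = page.nextPageToken;
        } while (token != null);
        return all;
    }

    public static void main(String[] args) {
        // Simulated two-page listing.
        List<String> names = collectAll(token -> token == null
                ? new Page<String>(List.of("batch-1", "batch-2"), "page-2")
                : new Page<String>(List.of("batch-3"), null));
        System.out.println(names);
    }
}
```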

File-Based Batch Processing

For advanced use cases, you can write batch requests to a JSONL file and upload it:

java
// Create a JSONL file with batch requests
Path batchFile = Files.createTempFile("batch", ".jsonl");

try (JsonLinesWriter writer = new StreamingJsonLinesWriter(batchFile)) {
    List<BatchFileRequest<TextSegment>> fileRequests = List.of(
        new BatchFileRequest<>("segment-1", TextSegment.from("First document")),
        new BatchFileRequest<>("segment-2", TextSegment.from("Second document")),
        new BatchFileRequest<>("segment-3", TextSegment.from("Third document"))
    );
    
    batchModel.writeBatchToFile(writer, fileRequests);
}

// Upload the file
GeminiFiles filesApi = GeminiFiles.builder()
    .apiKey(System.getenv("GEMINI_AI_KEY"))
    .build();

GeminiFile uploadedFile = filesApi.uploadFile(batchFile, "Batch Embedding Requests");

// Create batch from file
BatchResponse<Embedding> response = batchModel.createBatchFromFile(
    "File-Based Embedding Batch",
    uploadedFile
);

Using Metadata with Batch Embeddings

When using TaskType.RETRIEVAL_DOCUMENT, you can include document titles via metadata:

java
GoogleAiGeminiBatchEmbeddingModel batchModel = GoogleAiGeminiBatchEmbeddingModel.builder()
    .apiKey(System.getenv("GEMINI_AI_KEY"))
    .modelName("gemini-embedding-001")
    .taskType(GoogleAiEmbeddingModel.TaskType.RETRIEVAL_DOCUMENT)
    .titleMetadataKey("title")
    .build();

List<TextSegment> segments = List.of(
    TextSegment.from(
        "Content of first document",
        Metadata.from("title", "First Document Title")
    ),
    TextSegment.from(
        "Content of second document",
        Metadata.from("title", "Second Document Title")
    )
);

BatchResponse<Embedding> response = batchModel.createBatchInline(
    "Documents with Titles",
    null,
    segments
);

Configuration

The GoogleAiGeminiBatchEmbeddingModel supports the same configuration options as GoogleAiEmbeddingModel:

java
GoogleAiGeminiBatchEmbeddingModel batchModel = GoogleAiGeminiBatchEmbeddingModel.builder()
    .apiKey(System.getenv("GEMINI_AI_KEY"))
    .modelName("gemini-embedding-001")
    .taskType(GoogleAiEmbeddingModel.TaskType.RETRIEVAL_DOCUMENT)
    .outputDimensionality(768)
    .titleMetadataKey("title")
    .maxRetries(3)
    .timeout(Duration.ofSeconds(30))
    .logRequestsAndResponses(true)
    .build();

Important Constraints

  • Size Limit: The inline API supports a total request size of 20 MB or under
  • Batch Size: Maximum of 100 segments per batch for optimal performance
  • Cost: Batch processing offers 50% cost reduction compared to real-time requests
  • Turnaround: 24-hour SLO, though completion is often much quicker
  • Use Cases: Best for large-scale embedding generation for document indexing or semantic search
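
When choosing between the inline and file-based paths, it can help to estimate the payload size first. The rough sketch below sums the UTF-8 byte length of the texts and compares it with the 20 MB limit. It ignores JSON and metadata overhead, so the estimate is a lower bound on the real request size. The class and method names are hypothetical, and reading "20 MB" as 20 * 1024 * 1024 bytes is an assumption:

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

final class InlineSizeCheck {

    // Assumed interpretation of the "20 MB" inline limit.
    static final long INLINE_LIMIT_BYTES = 20L * 1024 * 1024;

    // Sums the UTF-8 encoded size of all texts (payload only, without JSON overhead).
    static long utf8Bytes(List<String> texts) {
        long total = 0;
        for (String text : texts) {
            total += text.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    // True if the raw text alone fits under the assumed inline limit.
    static boolean likelyFitsInline(List<String> texts) {
        return utf8Bytes(texts) <= INLINE_LIMIT_BYTES;
    }

    public static void main(String[] args) {
        List<String> texts = List.of("First document", "Second document");
        System.out.println(utf8Bytes(texts) + " bytes, fits inline: " + likelyFitsInline(texts));
    }
}
```

If the estimate is close to the limit, the file-based path is the safer choice.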

Example: Complete Workflow

java
GoogleAiGeminiBatchEmbeddingModel batchModel = GoogleAiGeminiBatchEmbeddingModel.builder()
    .apiKey(System.getenv("GEMINI_AI_KEY"))
    .modelName("gemini-embedding-001")
    .taskType(GoogleAiEmbeddingModel.TaskType.RETRIEVAL_DOCUMENT)
    .outputDimensionality(768)
    .build();

// Prepare batch of text segments
List<TextSegment> segments = new ArrayList<>();
for (int i = 0; i < 500; i++) {
    segments.add(TextSegment.from(
        "Document content #" + i,
        Metadata.from("title", "Document " + i)
    ));
}

// Submit batch
BatchResponse<Embedding> response = batchModel.createBatchInline(
    "Large Document Collection",
    0L,
    segments
);

// Get batch name
BatchName batchName = switch (response) {
    case BatchIncomplete incomplete -> incomplete.batchName();
    case BatchSuccess success -> success.batchName();
    case BatchError error -> throw new RuntimeException("Failed: " + error.message());
};

// Poll for completion
BatchResponse<Embedding> finalResult;
int attempts = 0;
int maxAttempts = 720; // 1 hour with 5-second intervals

do {
    if (attempts++ >= maxAttempts) {
        throw new RuntimeException("Batch processing timeout");
    }
    Thread.sleep(5000);
    finalResult = batchModel.retrieveBatchResults(batchName);
    
    if (finalResult instanceof BatchIncomplete incomplete) {
        System.out.println("Status: " + incomplete.state());
    }
} while (finalResult instanceof BatchIncomplete);

// Process results
if (finalResult instanceof BatchSuccess success) {
    List<Embedding> embeddings = success.responses();
    System.out.println("Generated " + embeddings.size() + " embeddings");
    
    // Store embeddings in your vector database
    for (int i = 0; i < embeddings.size(); i++) {
        Embedding embedding = embeddings.get(i);
        System.out.println("Embedding " + i + " has " + embedding.dimension() + " dimensions");
        // vectorStore.add(embedding, segments.get(i));
    }
} else if (finalResult instanceof BatchError error) {
    System.err.println("Batch failed: " + error.message());
}

Learn more

To learn more about the Google AI Gemini embedding models, see the documentation: https://ai.google.dev/gemini-api/docs/embeddings