# LlamaIndex Embeddings Integration: Isaacus

The `llama-index-embeddings-isaacus` package contains the LlamaIndex integration for building applications with Isaacus' legal AI embedding models. It lets you easily connect to and use state-of-the-art legal embeddings via the Isaacus API.
## Installation

```shell
pip install llama-index
pip install llama-index-embeddings-isaacus
```
## Setup

1. Head to the Isaacus Platform to create a new account.
2. Once signed up, add a payment method to claim your free credits.
3. After adding a payment method, create a new API key.

Keep your API key safe: you won't be able to see it again after you create it. Don't worry, though; you can always generate a new one.

Export your API key as an environment variable:

```shell
export ISAACUS_API_KEY="your-api-key-here"
```
## Usage

### Basic usage

```python
from llama_index.embeddings.isaacus import IsaacusEmbedding

# Initialize the Isaacus embedding model.
# This reads the ISAACUS_API_KEY environment variable.
embedding_model = IsaacusEmbedding()

# Get a single embedding
embedding = embedding_model.get_text_embedding("Legal document text here")
print(f"Embedding dimension: {len(embedding)}")

# Get embeddings for multiple texts
texts = ["Contract clause 1", "Contract clause 2", "Legal precedent"]
embeddings = embedding_model.get_text_embedding_batch(texts)
print(f"Number of embeddings: {len(embeddings)}")
```
You can also pass parameters directly to customize the embedding behavior:

```python
import os

from llama_index.embeddings.isaacus import IsaacusEmbedding

embedding_model = IsaacusEmbedding(
    model="kanon-2-embedder",  # Currently the only model available
    api_key=os.getenv("ISAACUS_API_KEY"),
    dimensions=1792,  # Optional: reduce dimensionality
    task="retrieval/document",  # Optimize for document retrieval
    timeout=60.0,
)

print(embedding_model.get_text_embedding("Legal text to embed"))
```
### Task-specific embeddings

Isaacus embeddings support task-specific optimization. Use `task="retrieval/query"` for search queries and `task="retrieval/document"` for documents:

```python
from llama_index.embeddings.isaacus import IsaacusEmbedding

# For documents
doc_embedder = IsaacusEmbedding(task="retrieval/document")
doc_embedding = doc_embedder.get_text_embedding("This is a legal document.")

# For queries (this is the default for get_query_embedding)
query_embedder = IsaacusEmbedding()
query_embedding = query_embedder.get_query_embedding(
    "Find documents about contracts"
)
```
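At retrieval time, query and document embeddings are compared by vector similarity. The sketch below shows the usual cosine-similarity ranking, with toy vectors standing in for real Isaacus embeddings (the vectors and clause names are illustrative only):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real query/document embeddings
query_embedding = [0.1, 0.9, 0.2]
doc_embeddings = {
    "payment clause": [0.1, 0.8, 0.3],
    "termination clause": [0.9, 0.1, 0.0],
}

# Rank documents by similarity to the query, best first
ranked = sorted(
    doc_embeddings,
    key=lambda name: cosine_similarity(query_embedding, doc_embeddings[name]),
    reverse=True,
)
print(ranked)  # → ['payment clause', 'termination clause']
```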
### Async usage

The integration also supports async operations:

```python
import asyncio

from llama_index.embeddings.isaacus import IsaacusEmbedding

async def get_embeddings_async():
    embedding_model = IsaacusEmbedding()

    # Get async embeddings
    embedding = await embedding_model.aget_text_embedding("Legal text here")
    embeddings = await embedding_model.aget_text_embedding_batch(
        ["Text 1", "Text 2"]
    )
    return embedding, embeddings

# Run the async function
result = asyncio.run(get_embeddings_async())
print(result)
```
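Because the async methods are coroutines, several embedding calls can also run concurrently with `asyncio.gather`. The sketch below uses a stub coroutine in place of `aget_text_embedding` so it runs without an API key; `fake_embed` is a stand-in for illustration, not part of the integration:

```python
import asyncio

async def fake_embed(text):
    # Stand-in for embedding_model.aget_text_embedding(text)
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [float(len(text))]

async def embed_all(texts):
    # Launch all calls concurrently and wait for every result
    return await asyncio.gather(*(fake_embed(t) for t in texts))

results = asyncio.run(embed_all(["Text 1", "Text 2", "Text 3"]))
print(len(results))  # → 3
```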
## Examples

See the `./examples` directory for more runnable examples:

```shell
cd examples
uv run python basic_usage.py
```
## Using with LlamaIndex

```python
from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.embeddings.isaacus import IsaacusEmbedding
from llama_index.llms.openai import OpenAI

# Set the LLM
llm = OpenAI()
Settings.llm = llm

# Set the Isaacus embedding model globally
Settings.embed_model = IsaacusEmbedding()

# Create documents
documents = [
    Document(text="This is a contract clause about payment terms."),
    Document(text="This is a contract clause about termination."),
]

# Create a vector index
index = VectorStoreIndex.from_documents(documents)

# Query the index
query_engine = index.as_query_engine(
    llm=llm, response_mode="compact", similarity_top_k=5
)
response = query_engine.query("What are the payment terms?")
print(response)
```
## Available models

Currently, Isaacus offers the following embedding model:

- `kanon-2-embedder` (default)

For more information about Isaacus models, see the Isaacus documentation.
## Error handling

The integration includes error handling for common API issues.
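One common failure mode is a missing API key. Below is a minimal defensive sketch; `require_api_key` is a hypothetical helper for illustration, not part of the package, and the integration's actual exception types may differ:

```python
import os

def require_api_key():
    # Fail fast with a clear message when no key is configured
    key = os.getenv("ISAACUS_API_KEY")
    if not key:
        raise ValueError(
            "ISAACUS_API_KEY is not set; export it or pass api_key explicitly"
        )
    return key
```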
## Configuration parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `str` | `"kanon-2-embedder"` | The embedding model to use |
| `api_key` | `str` | `os.getenv("ISAACUS_API_KEY")` | The API key for Isaacus |
| `base_url` | `str` | `"https://api.isaacus.com/v1"` | The base URL for the Isaacus API |
| `dimensions` | `int` | `None` (model default) | Optional: reduce embedding dimensionality |
| `task` | `str` | `None` | Task type: `"retrieval/query"` or `"retrieval/document"` |
| `overflow_strategy` | `str` | `"drop_end"` | Strategy for handling overflow: `"drop_end"` or `None` |
| `timeout` | `float` | `60.0` | Timeout for requests in seconds |
| `embed_batch_size` | `int` | `100` | Batch size for embedding calls |
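`embed_batch_size` controls how many texts go into each underlying API call. Conceptually, batching a large input works as in this sketch (a standalone illustration, independent of the Isaacus client):

```python
def chunked(texts, batch_size=100):
    # Split the input into consecutive batches of at most batch_size items
    for i in range(0, len(texts), batch_size):
        yield texts[i : i + batch_size]

texts = [f"clause {n}" for n in range(250)]
batch_sizes = [len(batch) for batch in chunked(texts, batch_size=100)]
print(batch_sizes)  # → [100, 100, 50]
```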
## Environment variables

| Variable | Description |
|---|---|
| `ISAACUS_API_KEY` | The API key for Isaacus (required) |
| `ISAACUS_BASE_URL` | The base URL for the Isaacus API (optional) |
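Both can be exported in your shell before starting the application; the base URL below is the default from the table above and only needs overriding for non-standard deployments:

```shell
export ISAACUS_API_KEY="your-api-key-here"
export ISAACUS_BASE_URL="https://api.isaacus.com/v1"  # optional; this is the default
```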
## Testing

Run the test suite:

```shell
uv run -- pytest
```

Run with coverage:

```shell
uv run -- pytest --cov=llama_index tests/
```
For more information about Isaacus and its legal AI models, see the Isaacus documentation.

## License

This project is licensed under the MIT License.