<a id="semantic-cache-api"></a>
`class SemanticCache(name='llmcache', distance_threshold=0.1, ttl=None, vectorizer=None, filterable_fields=None, redis_client=None, redis_url='redis://localhost:6379', connection_kwargs={}, overwrite=False, **kwargs)`

Bases: `BaseLLMCache`

Semantic Cache for Large Language Models.
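The core idea can be illustrated with a minimal, self-contained sketch (plain Python with toy vectors, not the actual RedisVL implementation, which stores vectors in a Redis search index): a prompt's embedding is compared against stored prompt embeddings, and a cached response counts as a hit only when the cosine distance falls at or below `distance_threshold`.

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

class ToySemanticCache:
    """Illustrative stand-in for SemanticCache: linear scan over stored vectors."""

    def __init__(self, distance_threshold=0.1):
        self.distance_threshold = distance_threshold
        self.entries = []  # list of (vector, response) pairs

    def store(self, vector, response):
        self.entries.append((vector, response))

    def check(self, vector):
        # Return the closest cached response within the threshold, else None.
        best, best_dist = None, float("inf")
        for v, response in self.entries:
            d = cosine_distance(vector, v)
            if d <= self.distance_threshold and d < best_dist:
                best, best_dist = response, d
        return best

cache = ToySemanticCache(distance_threshold=0.1)
cache.store([1.0, 0.0, 0.0], "Paris")
print(cache.check([0.99, 0.05, 0.0]))  # near-duplicate vector: hit -> Paris
print(cache.check([0.0, 1.0, 0.0]))    # unrelated vector: miss -> None
```

Lowering `distance_threshold` makes matching stricter (fewer, more exact hits); raising it trades precision for a higher hit rate.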
`async acheck(prompt=None, vector=None, num_results=1, return_fields=None, filter_expression=None, distance_threshold=None)`

Async check the semantic cache for results similar to the specified prompt or vector.

This method searches the cache using vector similarity with either a raw text prompt (converted to a vector) or a provided vector as input. It checks for semantically similar prompts and fetches the cached LLM responses.

```python
response = await cache.acheck(
    prompt="What is the capital city of France?"
)
```
`async aclear()`

Async clear the cache of all keys.

`async adelete()`

Async delete the cache and its index entirely.

`async adisconnect()`

Asynchronously disconnect from Redis and search index. Closes all Redis connections and index connections.

`async adrop(ids=None, keys=None)`

Async drop specific entries from the cache by ID or Redis key.

**Note:** At least one of `ids` or `keys` must be provided.

`async aexpire(key, ttl=None)`

Asynchronously set or refresh the expiration time for a key in the cache.

**Note:** If neither the provided TTL nor the default TTL is set (both are None), this method will have no effect.
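The TTL fallback rule described in the note can be sketched as follows (an illustrative helper, not RedisVL internals): an explicit per-call TTL overrides the cache default, and when both are `None` the expiration call becomes a no-op.

```python
def effective_ttl(call_ttl, default_ttl):
    """Resolve the TTL used by expire()/aexpire():
    an explicit per-call TTL overrides the cache default;
    if both are None, no expiration is applied (no-op)."""
    if call_ttl is not None:
        return call_ttl
    return default_ttl  # may be None -> the caller skips setting an expiry

print(effective_ttl(60, 300))     # 60: explicit TTL wins
print(effective_ttl(None, 300))   # 300: fall back to the cache default
print(effective_ttl(None, None))  # None: no-op, the key keeps its current TTL
```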
`async astore(prompt, response, vector=None, metadata=None, filters=None, ttl=None)`

Async stores the specified key-value pair in the cache along with metadata.

```python
key = await cache.astore(
    prompt="What is the capital city of France?",
    response="Paris",
    metadata={"city": "Paris", "country": "France"}
)
```
`async aupdate(key, **kwargs)`

Async update specific fields within an existing cache entry. If no fields are passed, then only the document TTL is refreshed.

```python
key = await cache.astore('this is a prompt', 'this is a response')
await cache.aupdate(
    key,
    metadata={"hit_count": 1, "model_name": "Llama-2-7b"}
)
```
`check(prompt=None, vector=None, num_results=1, return_fields=None, filter_expression=None, distance_threshold=None)`

Checks the semantic cache for results similar to the specified prompt or vector.

This method searches the cache using vector similarity with either a raw text prompt (converted to a vector) or a provided vector as input. It checks for semantically similar prompts and fetches the cached LLM responses.

```python
response = cache.check(
    prompt="What is the capital city of France?"
)
```
`clear()`

Clear the cache of all keys.

`delete()`

Delete the cache and its index entirely.

`disconnect()`

Disconnect from Redis and search index. Closes all Redis connections and index connections.

`drop(ids=None, keys=None)`

Drop specific entries from the cache by ID or Redis key.

**Note:** At least one of `ids` or `keys` must be provided.
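The ids/keys contract in the note above can be sketched like this (a hypothetical helper for illustration; the `prefix` argument and the `<name>:<id>` key format are assumptions, not the library's actual key scheme):

```python
def resolve_drop_keys(ids=None, keys=None, prefix="llmcache"):
    """Collect the Redis keys to delete for drop()/adrop().
    Raises if neither ids nor keys is given, mirroring the documented contract."""
    if ids is None and keys is None:
        raise ValueError("At least one of ids or keys must be provided")
    resolved = list(keys or [])
    for entry_id in ids or []:
        # Assumed '<prefix>:<id>' key format, for illustration only.
        resolved.append(f"{prefix}:{entry_id}")
    return resolved

print(resolve_drop_keys(ids=["abc"], keys=["llmcache:def"]))
# ['llmcache:def', 'llmcache:abc']
```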
`expire(key, ttl=None)`

Set or refresh the expiration time for a key in the cache.

**Note:** If neither the provided TTL nor the default TTL is set (both are None), this method will have no effect.
`set_threshold(distance_threshold)`

Sets the semantic distance threshold for the cache.

`set_ttl(ttl=None)`

Set the default TTL, in seconds, for entries in the cache.
`store(prompt, response, vector=None, metadata=None, filters=None, ttl=None)`

Stores the specified key-value pair in the cache along with metadata.

```python
key = cache.store(
    prompt="What is the capital city of France?",
    response="Paris",
    metadata={"city": "Paris", "country": "France"}
)
```
`update(key, **kwargs)`

Update specific fields within an existing cache entry. If no fields are passed, then only the document TTL is refreshed.

```python
key = cache.store('this is a prompt', 'this is a response')
cache.update(key, metadata={"hit_count": 1, "model_name": "Llama-2-7b"})
```
property `aindex`: [AsyncSearchIndex]({{< relref "searchindex/#asyncsearchindex" >}}) | None

The underlying AsyncSearchIndex for the cache.

property `distance_threshold`: float

The semantic distance threshold for the cache.

property `index`: [SearchIndex]({{< relref "searchindex/#searchindex" >}})

The underlying SearchIndex for the cache.

property `ttl`: int | None

The default TTL, in seconds, for entries in the cache.
<a id="embeddings-cache-api"></a>
`class EmbeddingsCache(name='embedcache', ttl=None, redis_client=None, async_redis_client=None, redis_url='redis://localhost:6379', connection_kwargs={})`

Bases: `BaseCache`

Embeddings Cache for storing embedding vectors with exact key matching.

Initialize an embeddings cache.
```python
cache = EmbeddingsCache(
    name="my_embeddings_cache",
    ttl=3600,  # 1 hour
    redis_url="redis://localhost:6379"
)
```
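Unlike SemanticCache, EmbeddingsCache does exact key matching: a deterministic key is derived from the text and model name, so only an identical (text, model_name) pair is a hit. A minimal in-memory sketch of that behavior (the hashing scheme here is an assumption for illustration, not the library's actual key format):

```python
import hashlib

def make_key(text, model_name, name="embedcache"):
    # Deterministic key from text + model name (assumed scheme, illustrative only).
    digest = hashlib.sha256(f"{model_name}:{text}".encode()).hexdigest()
    return f"{name}:{digest}"

class ToyEmbeddingsCache:
    """Dict-backed stand-in for EmbeddingsCache: exact-match store/get."""

    def __init__(self):
        self._data = {}

    def set(self, text, model_name, embedding, metadata=None):
        key = make_key(text, model_name)
        self._data[key] = {"text": text, "model_name": model_name,
                           "embedding": embedding, "metadata": metadata}
        return key

    def get(self, text, model_name):
        # Exact match only: any change to text or model_name is a miss.
        return self._data.get(make_key(text, model_name))

toy = ToyEmbeddingsCache()
toy.set("What is ML?", "text-embedding-ada-002", [0.1, 0.2, 0.3])
hit = toy.get("What is ML?", "text-embedding-ada-002")
miss = toy.get("What is ML?", "some-other-model")
print(hit["embedding"])  # [0.1, 0.2, 0.3]
print(miss)              # None
```

Because the key includes the model name, the same text embedded by two different models is stored and retrieved as two independent entries.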
`async aclear()`

Async clear the cache of all keys.

`async adisconnect()`

Async disconnect from Redis.

`async adrop(text, model_name)`

Asynchronously removes an embedding from the cache.

```python
await cache.adrop(
    text="What is machine learning?",
    model_name="text-embedding-ada-002"
)
```

`async adrop_by_key(key)`

Asynchronously removes an embedding from the cache by its Redis key.

```python
await cache.adrop_by_key("embedcache:1234567890abcdef")
```
`async aexists(text, model_name)`

Asynchronously checks if an embedding exists for the given text and model.

```python
if await cache.aexists("What is machine learning?", "text-embedding-ada-002"):
    print("Embedding is in cache")
```

`async aexists_by_key(key)`

Asynchronously checks if an embedding exists for the given Redis key.

```python
if await cache.aexists_by_key("embedcache:1234567890abcdef"):
    print("Embedding is in cache")
```

`async aexpire(key, ttl=None)`

Asynchronously set or refresh the expiration time for a key in the cache.

**Note:** If neither the provided TTL nor the default TTL is set (both are None), this method will have no effect.
`async aget(text, model_name)`

Asynchronously retrieves a cached embedding for the given text and model name. If found, refreshes the TTL of the entry.

```python
embedding_data = await cache.aget(
    text="What is machine learning?",
    model_name="text-embedding-ada-002"
)
```

`async aget_by_key(key)`

Asynchronously retrieves a cached embedding for the given Redis key. If found, refreshes the TTL of the entry.

```python
embedding_data = await cache.aget_by_key("embedcache:1234567890abcdef")
```
`async amdrop(texts, model_name)`

Asynchronously removes multiple embeddings, identified by their texts and model name, in a single operation.

```python
# Remove multiple embeddings asynchronously
await cache.amdrop(
    texts=["What is machine learning?", "What is deep learning?"],
    model_name="text-embedding-ada-002"
)
```

`async amdrop_by_keys(keys)`

Asynchronously removes multiple embeddings, identified by their Redis keys, in a single operation.

```python
# Remove multiple embeddings asynchronously
await cache.amdrop_by_keys(["embedcache:key1", "embedcache:key2"])
```
`async amexists(texts, model_name)`

Asynchronously checks existence of multiple embeddings, identified by their texts and model name, in a single operation.

```python
# Check if multiple embeddings exist asynchronously
exists_results = await cache.amexists(
    texts=["What is machine learning?", "What is deep learning?"],
    model_name="text-embedding-ada-002"
)
```

`async amexists_by_keys(keys)`

Asynchronously checks existence of multiple Redis keys in a single operation.

```python
# Check if multiple keys exist asynchronously
exists_results = await cache.amexists_by_keys(["embedcache:key1", "embedcache:key2"])
```
`async amget(texts, model_name)`

Asynchronously retrieves multiple cached embeddings, identified by their texts and model name, in a single operation. If found, refreshes the TTL of each entry.

```python
# Get multiple embeddings asynchronously
embedding_data = await cache.amget(
    texts=["What is machine learning?", "What is deep learning?"],
    model_name="text-embedding-ada-002"
)
```

`async amget_by_keys(keys)`

Asynchronously retrieves multiple cached embeddings, identified by their Redis keys, in a single network roundtrip. If found, refreshes the TTL of each entry.

```python
# Get multiple embeddings asynchronously
embedding_data = await cache.amget_by_keys([
    "embedcache:key1",
    "embedcache:key2"
])
```
`async amset(items, ttl=None)`

Async store multiple embeddings in a batch operation.

Each item in the input list should be a dictionary with `text`, `model_name`, and `embedding` fields, plus optional `metadata`:

```python
# Store multiple embeddings asynchronously
keys = await cache.amset([
    {
        "text": "What is ML?",
        "model_name": "text-embedding-ada-002",
        "embedding": [0.1, 0.2, 0.3],
        "metadata": {"source": "user"}
    },
    {
        "text": "What is AI?",
        "model_name": "text-embedding-ada-002",
        "embedding": [0.4, 0.5, 0.6],
        "metadata": {"source": "docs"}
    }
])
```
`async aset(text, model_name, embedding, metadata=None, ttl=None)`

Asynchronously stores an embedding with its text and model name.

```python
key = await cache.aset(
    text="What is machine learning?",
    model_name="text-embedding-ada-002",
    embedding=[0.1, 0.2, 0.3, ...],
    metadata={"source": "user_query"}
)
```
`clear()`

Clear the cache of all keys.

`disconnect()`

Disconnect from Redis.

`drop(text, model_name)`

Remove an embedding from the cache.

```python
cache.drop(
    text="What is machine learning?",
    model_name="text-embedding-ada-002"
)
```

`drop_by_key(key)`

Remove an embedding from the cache by its Redis key.

```python
cache.drop_by_key("embedcache:1234567890abcdef")
```

`exists(text, model_name)`

Check if an embedding exists for the given text and model.

```python
if cache.exists("What is machine learning?", "text-embedding-ada-002"):
    print("Embedding is in cache")
```

`exists_by_key(key)`

Check if an embedding exists for the given Redis key.

```python
if cache.exists_by_key("embedcache:1234567890abcdef"):
    print("Embedding is in cache")
```

`expire(key, ttl=None)`

Set or refresh the expiration time for a key in the cache.

**Note:** If neither the provided TTL nor the default TTL is set (both are None), this method will have no effect.
`get(text, model_name)`

Retrieves a cached embedding for the given text and model name. If found, refreshes the TTL of the entry.

```python
embedding_data = cache.get(
    text="What is machine learning?",
    model_name="text-embedding-ada-002"
)
```

`get_by_key(key)`

Retrieves a cached embedding for the given Redis key. If found, refreshes the TTL of the entry.

```python
embedding_data = cache.get_by_key("embedcache:1234567890abcdef")
```
`mdrop(texts, model_name)`

Efficiently removes multiple embeddings, identified by their texts and model name, in a single operation.

```python
# Remove multiple embeddings
cache.mdrop(
    texts=["What is machine learning?", "What is deep learning?"],
    model_name="text-embedding-ada-002"
)
```

`mdrop_by_keys(keys)`

Efficiently removes multiple embeddings, identified by their Redis keys, in a single operation.

```python
# Remove multiple embeddings
cache.mdrop_by_keys(["embedcache:key1", "embedcache:key2"])
```
`mexists(texts, model_name)`

Efficiently checks existence of multiple embeddings, identified by their texts and model name, in a single operation.

```python
# Check if multiple embeddings exist
exists_results = cache.mexists(
    texts=["What is machine learning?", "What is deep learning?"],
    model_name="text-embedding-ada-002"
)
```

`mexists_by_keys(keys)`

Efficiently checks existence of multiple Redis keys in a single operation.

```python
# Check if multiple keys exist
exists_results = cache.mexists_by_keys(["embedcache:key1", "embedcache:key2"])
```
`mget(texts, model_name)`

Efficiently retrieves multiple cached embeddings, identified by their texts and model name, in a single operation. If found, refreshes the TTL of each entry.

```python
# Get multiple embeddings
embedding_data = cache.mget(
    texts=["What is machine learning?", "What is deep learning?"],
    model_name="text-embedding-ada-002"
)
```

`mget_by_keys(keys)`

Efficiently retrieves multiple cached embeddings, identified by their Redis keys, in a single network roundtrip. If found, refreshes the TTL of each entry.

```python
# Get multiple embeddings
embedding_data = cache.mget_by_keys([
    "embedcache:key1",
    "embedcache:key2"
])
```
`mset(items, ttl=None)`

Store multiple embeddings in a batch operation.

Each item in the input list should be a dictionary with `text`, `model_name`, and `embedding` fields, plus optional `metadata`:

```python
# Store multiple embeddings
keys = cache.mset([
    {
        "text": "What is ML?",
        "model_name": "text-embedding-ada-002",
        "embedding": [0.1, 0.2, 0.3],
        "metadata": {"source": "user"}
    },
    {
        "text": "What is AI?",
        "model_name": "text-embedding-ada-002",
        "embedding": [0.4, 0.5, 0.6],
        "metadata": {"source": "docs"}
    }
])
```
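The shape of the batch operations can be sketched with a small dict-based stand-in (illustrative only, not the library's implementation; `toy_mset`/`toy_mget` and the key scheme are assumptions): a batch store maps each input item to one derived key, and a batch fetch returns one slot per input, with `None` marking a miss.

```python
import hashlib

_store = {}  # in-memory stand-in for Redis

def make_key(text, model_name, name="embedcache"):
    # Assumed deterministic key scheme, for illustration only.
    digest = hashlib.sha256(f"{model_name}:{text}".encode()).hexdigest()
    return f"{name}:{digest}"

def toy_mset(items):
    """Store a batch of embedding dicts; return one key per input item."""
    keys = []
    for item in items:
        key = make_key(item["text"], item["model_name"])
        _store[key] = item
        keys.append(key)
    return keys

def toy_mget(texts, model_name):
    """Fetch a batch by text; one result slot per input, None on a miss."""
    return [_store.get(make_key(t, model_name)) for t in texts]

keys = toy_mset([
    {"text": "What is ML?", "model_name": "ada", "embedding": [0.1, 0.2]},
    {"text": "What is AI?", "model_name": "ada", "embedding": [0.3, 0.4]},
])
results = toy_mget(["What is ML?", "some uncached text"], "ada")
print(len(keys))               # 2
print(results[0]["embedding"]) # [0.1, 0.2]
print(results[1])              # None
```

Batching like this is what lets the real methods resolve many entries in a single network roundtrip instead of one round trip per item.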
`set(text, model_name, embedding, metadata=None, ttl=None)`

Store an embedding with its text and model name.

```python
key = cache.set(
    text="What is machine learning?",
    model_name="text-embedding-ada-002",
    embedding=[0.1, 0.2, 0.3, ...],
    metadata={"source": "user_query"}
)
```
`set_ttl(ttl=None)`

Set the default TTL, in seconds, for entries in the cache.

property `ttl`: int | None

The default TTL, in seconds, for entries in the cache.