You can monitor a LangCache service's performance from the Metrics tab of the service's page.
{{<image filename="images/rc/langcache-metrics.png" alt="The metrics tab of the LangCache service's page." >}}
The Metrics tab provides a series of graphs showing performance data for your LangCache service.
You can switch between hourly, daily, and weekly stats using the Hour, Day, and Week buttons at the top of the page. Each graph also includes minimum, average, maximum, and latest values.
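If you export or record metric samples yourself, the four summary values shown with each graph can be reproduced with a few lines of code. The sketch below assumes the samples are a plain list of numbers ordered from oldest to newest; the function name `summarize` is hypothetical and is not part of LangCache.

```python
from statistics import mean


def summarize(samples):
    """Return the minimum, average, maximum, and latest value of a series.

    Assumes `samples` is a non-empty list ordered from oldest to newest.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    return {
        "min": min(samples),
        "avg": mean(samples),
        "max": max(samples),
        "latest": samples[-1],  # the last element is the most recent sample
    }


# Illustrative values only, not real LangCache measurements.
print(summarize([12.0, 8.0, 10.0]))
```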
The cache hit ratio is the percentage of requests that were served from the cache without needing to call the LLM API. A healthy cache generally shows an increasing hit ratio over time as it fills with cached responses.
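As a rough illustration of the definition above, the ratio can be computed from a count of hits and misses. This is a minimal sketch, not how the Metrics tab calculates its value; the function name and the counts are assumptions made for the example.

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Return the share of lookups served from the cache, as a percentage."""
    total = hits + misses
    if total == 0:
        return 0.0  # no lookups yet, so there is nothing to report
    return 100.0 * hits / total


# Illustrative counts: 75 lookups served from the cache, 25 sent to the LLM API.
print(cache_hit_ratio(hits=75, misses=25))  # 75.0
```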
To optimize your cache hit ratio:
A higher cache hit ratio does not always mean better performance. If the cache is too lenient in its similarity matching, it may return irrelevant responses, leading to a higher cache hit ratio but poorer overall performance.
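The trade-off can be seen in a toy example. The sketch below assumes each lookup yields a best similarity score between 0.0 and 1.0, and that a hit is returned when the score meets a threshold. The `Lookup` class, the `evaluate` function, the threshold values, and the sample data are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Lookup:
    similarity: float  # best similarity score found in the cache (0.0 to 1.0)
    relevant: bool     # whether that cached response actually answers the prompt


def evaluate(lookups, threshold):
    """Return (hit_ratio, relevant_share_of_hits) for a similarity threshold."""
    hits = [item for item in lookups if item.similarity >= threshold]
    hit_ratio = len(hits) / len(lookups) if lookups else 0.0
    relevant_share = sum(item.relevant for item in hits) / len(hits) if hits else 0.0
    return hit_ratio, relevant_share


# Made-up sample data, not real measurements.
SAMPLE = [
    Lookup(0.97, True),
    Lookup(0.91, True),
    Lookup(0.84, False),
    Lookup(0.78, False),
    Lookup(0.60, False),
]

print(evaluate(SAMPLE, threshold=0.90))  # (0.4, 1.0): fewer hits, all relevant
print(evaluate(SAMPLE, threshold=0.75))  # (0.8, 0.5): more hits, half irrelevant
```

In this made-up data, lowering the threshold doubles the hit ratio, but half of the returned responses no longer answer the prompt.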
This graph shows the number of read attempts against the cache at the specified time. This metric can help you understand the load on your cache and identify periods of high or low activity.
Cache latency is the average time taken to process a cache lookup request. This metric can help you identify performance bottlenecks and optimize your cache configuration.
Cache latency is highly dependent on embedding model performance, since the cache must generate embeddings for each request in order to compare them to the cached responses.
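To see how much of a lookup is spent generating the embedding, you could time the two stages separately in your own test harness. The sketch below is only an illustration: `fake_embed` and `fake_search` are hypothetical stand-ins, not LangCache functions, and you would replace them with your own embedding call and cache search.

```python
import time


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed_milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, (time.perf_counter() - start) * 1000.0


def fake_embed(prompt):
    """Hypothetical stand-in for an embedding model call."""
    return [float(len(prompt))]


def fake_search(vector):
    """Hypothetical stand-in for a cache search; always reports a miss."""
    return None


def lookup_with_timing(prompt, embed=fake_embed, search=fake_search):
    """Run one lookup and report how long each stage took."""
    vector, embed_ms = timed(embed, prompt)
    match, search_ms = timed(search, vector)
    return {
        "match": match,
        "embed_ms": embed_ms,
        "search_ms": search_ms,
        "total_ms": embed_ms + search_ms,
    }


print(lookup_with_timing("What is semantic caching?"))
```

If `embed_ms` accounts for most of `total_ms`, the embedding step is the main contributor to lookup time in that harness.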
High cache latency may indicate one of the following: