docs/design/cache.md
The Kubernetes Dashboard has been around for a long time, and one of its pain points has always been performance and responsiveness in clusters with a large number of resources. For that reason, we have been thinking about implementing a proper API caching solution. As clusters grow in size and complexity, users often face latency when interacting with the Dashboard, which makes managing and troubleshooting applications less efficient. A proper caching solution can significantly reduce the time it takes to retrieve resource data, decrease peak memory usage, and optimize overall resource consumption, thereby minimizing delays and making the user interface more fluid.
The primary goals of implementing the API caching solution are to:
This proposal does not aim to:
The proposed solution involves implementing a caching layer within the Kubernetes Dashboard that stores a configurable number of API responses for a configurable duration. This caching layer will hook into Kubernetes client interfaces and serve cached data when available, falling back to the API server only when necessary. The solution will leverage techniques such as time and cost-based expiration and cache invalidation strategies to ensure data freshness while balancing performance.
In general, it will resemble the "cache-and-network" type of caching due to the nature of the Dashboard auth layer. Since Dashboard does not require any permissions of its own, it has to rely on the user's permissions, and the only time it can act as the user is between receiving a request and sending the response. Such an architecture requires on-the-fly client creation as well as background cache updates.
To ensure that cached data is never served to unauthorized entities, the API sends a SelfSubjectAccessReview request to the API server to validate the user's permissions every time before it returns data from the cache.
This is especially important in multi-cluster scenarios where the Dashboard API is used to access multiple clusters. To avoid serving a path stored in the cache from the wrong cluster context, the multi-cluster cache context needs a way to exchange the user authorization token for a unique context ID, and that ID has to be part of the cache key.
Cache key should consist of the below fields:
v1.ListOptions should also be part of the key to ensure that filtered API requests are stored under a separate cache key.
SHA should be created based on the above key structure and used as an internal cache key.
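Deriving the internal key could look like the sketch below. The field names in `cacheKey` and the local `listOptions` type are assumptions made for illustration (a stand-in for v1.ListOptions so the example needs only the standard library), and SHA-256 over a JSON encoding is one possible choice, not necessarily the one used by the Dashboard.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// listOptions is a hypothetical stand-in for the filtering fields of v1.ListOptions.
type listOptions struct {
	LabelSelector string `json:"labelSelector,omitempty"`
	FieldSelector string `json:"fieldSelector,omitempty"`
}

// cacheKey is an assumed key structure, not the exact one from the proposal.
type cacheKey struct {
	Kind      string      `json:"kind"`
	Namespace string      `json:"namespace"`
	Context   string      `json:"context,omitempty"` // context ID in multi-cluster mode
	Options   listOptions `json:"options"`
}

// sha encodes the key deterministically (struct fields are marshaled in
// declaration order) and hashes the result.
func (k cacheKey) sha() (string, error) {
	raw, err := json.Marshal(k)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	plain := cacheKey{Kind: "pod", Namespace: "default"}
	filtered := cacheKey{Kind: "pod", Namespace: "default", Options: listOptions{LabelSelector: "app=web"}}
	shaPlain, _ := plain.sha()
	shaFiltered, _ := filtered.sha()
	// A filtered list request ends up under a different internal key.
	fmt.Println(len(shaPlain), shaPlain != shaFiltered) // prints "64 true"
}
```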
These sequence diagrams show a simplified view of how the cache works.
The flow is very similar to standard caching. The difference is that the provided user authorization token has to be exchangeable for a unique context ID through the configured token-exchange-endpoint. That ID is then used to create a unique cache key.
The cache is implemented with the help of the Theine package, which provides an in-memory cache with good performance, generics support, and a simple API.
The cache is a global variable initialized during application startup. It maps internal key SHAs to resource lists.
It can be configured via the following arguments:
- cache-enabled - Enables the cache. Enabled by default.
- cache-size - Maximum number of items in the cache. Set to 1000 by default.
- cache-ttl - Cache entry time-to-live. Set to 10 minutes by default.
- cache-refresh-debounce - Minimum time that has to pass between consecutive background cache refreshes. Set to 5 seconds by default.
- cluster-context-enabled - Enables the multi-context cache. Disabled by default. Requires token-exchange-endpoint to be set if enabled.
- token-exchange-endpoint - Endpoint used when the multi-context cache is enabled. It exchanges tokens for context identifiers. It has to be an HTTP(S) GET endpoint that accepts an Authorization: Bearer <token> header and returns a raw string with the context identifier.

The cache package provides the following interface:
- Get - fetches an item from the cache.
- Set - stores an item in the cache.
- DefferedLoad - updates the cache in the background. Used after the cache is read to refresh items.
- SyncedLoad - initializes the cache, ensuring that there are no concurrent calls to the Kubernetes API for the same resources.

To minimize the amount of code, we created custom interfaces similar to the client-go interfaces, in which we override only a single List method and still use the generic client.Interface. This way our internal implementation and usage of the Kubernetes client did not have to change at all, and we were able to inject the cached client globally.
The initial implementation supports caching of the following resources:
The whole cache implementation lives under modules/common/client/cache.