HyperCache System

PostHog's HyperCache provides multi-tier caching with Redis → S3 → Database fallback. It's designed for high-traffic, read-heavy endpoints where pre-caching every possible value is worth the storage cost.

Architecture overview

text

┌─────────────────────────────────────────────────────────────────┐
│                         Client Request                          │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                        Redis (Layer 1)                          │
│                        TTL: 30 days                             │
│                        Latency: ~1-2ms                          │
└─────────────────────────────────────────────────────────────────┘
                                │ miss
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                         S3 (Layer 2)                            │
│                        Latency: ~50-100ms                       │
│                        Warms Redis on hit                       │
└─────────────────────────────────────────────────────────────────┘
                                │ miss
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                      Database (Layer 3)                         │
│                        Latency: ~200-500ms                      │
│                        Warms Redis + S3                         │
└─────────────────────────────────────────────────────────────────┘
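
The fallback order in the diagram can be sketched as a read-through lookup. This is a minimal illustration under stated assumptions, not the HyperCache implementation: plain dicts stand in for Redis and S3, the `TieredCache` class and its method name are hypothetical, and TTL handling is omitted.

```python
from typing import Any, Callable, Optional


class TieredCache:
    """Toy read-through cache: Redis -> S3 -> database loader (hypothetical)."""

    def __init__(self, load_fn: Callable[[str], Optional[Any]]) -> None:
        self.redis: dict[str, Any] = {}  # stand-in for Redis (layer 1)
        self.s3: dict[str, Any] = {}     # stand-in for S3 (layer 2)
        self.load_fn = load_fn           # database loader (layer 3)

    def get_with_source(self, key: str) -> tuple[Optional[Any], str]:
        if key in self.redis:
            return self.redis[key], "redis"
        if key in self.s3:
            # An S3 hit warms Redis so the next read is fast.
            self.redis[key] = self.s3[key]
            return self.s3[key], "s3"
        data = self.load_fn(key)
        if data is None:
            return None, "missing"
        # A database hit warms both cache layers.
        self.redis[key] = data
        self.s3[key] = data
        return data, "db"


cache = TieredCache(load_fn=lambda key: {"flags": []} if key == "team-1" else None)
first = cache.get_with_source("team-1")   # loaded from the database
second = cache.get_with_source("team-1")  # now served from Redis
del cache.redis["team-1"]                 # simulate a Redis eviction
third = cache.get_with_source("team-1")   # served from S3, which re-warms Redis
```

Each lower layer repopulates the layers above it, so a single slow read pays for the fast reads that follow.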

Core components

Component	File	Purpose
HyperCache class	`posthog/storage/hypercache.py`	Multi-tier cache with fallback
Local evaluation	`posthog/models/feature_flag/local_evaluation.py`	Feature flag caching for SDKs
Remote config	`posthog/models/remote_config.py`	Client configuration caching
Team caching	`posthog/models/team/team_caching.py`	Authentication team lookup cache

HyperCache class

The HyperCache class in posthog/storage/hypercache.py provides the core caching logic.

Creating an instance

python

from posthog.storage.hypercache import HyperCache

cache = HyperCache(
    namespace="feature_flags",           # Category name for metrics
    value="flags_with_cohorts.json",     # Value identifier for metrics
    load_fn=lambda key: load_data(key),  # Fallback loader when cache misses
    token_based=False,                   # Use team ID (False) or API token (True)
    cache_ttl=60 * 60 * 24 * 30,        # 30 days for cache hits
    cache_miss_ttl=60 * 60 * 24,        # 1 day for cache misses
    enable_etag=True,                    # Enable HTTP 304 support
)

Cache key formats

The token_based parameter determines the cache key structure:

Team ID-based (token_based=False): cache/teams/{team_id}/{namespace}/{value}
Token-based (token_based=True): cache/team_tokens/{api_token}/{namespace}/{value}
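
As a quick illustration of the two formats, the sketch below builds both kinds of key. The helper name `build_cache_key` and the sample token are hypothetical; only the key layout comes from the formats listed above.

```python
def build_cache_key(identifier: str, namespace: str, value: str, token_based: bool = False) -> str:
    """Build a cache key in the documented layout (hypothetical helper)."""
    prefix = "cache/team_tokens" if token_based else "cache/teams"
    return f"{prefix}/{identifier}/{namespace}/{value}"


# Team ID-based key, as used for the local evaluation cache
team_key = build_cache_key("123", "feature_flags", "flags_with_cohorts.json")

# Token-based key; "phc_abc" is a made-up API token
token_key = build_cache_key("phc_abc", "feature_flags", "flags.json", token_based=True)
```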

Key methods

Method	Purpose
`get_from_cache(key)`	Get cached data, returns `None` on miss
`get_from_cache_with_source(key)`	Returns `(data, source)` where source is `redis`, `s3`, or `db`
`get_if_none_match(key, etag)`	ETag support for HTTP 304 responses
`update_cache(key)`	Force refresh from database
`set_cache_value(key, data)`	Write to both Redis and S3
`clear_cache(key)`	Delete from Redis and S3

ETag support

HyperCache supports HTTP 304 "Not Modified" responses to reduce bandwidth:

python

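from django.http import HttpResponse, JsonResponse

def flag_definitions_response(team, client_etag):  # hypothetical helper name
    # client_etag is the value of the request's If-None-Match header
    data, etag, modified = cache.get_if_none_match(team, client_etag)

    if not modified:
        return HttpResponse(status=304, headers={"ETag": etag})
    return JsonResponse(data, headers={"ETag": etag})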

ETags are computed as SHA256 hashes of the JSON content.
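
A minimal sketch of that computation, assuming the JSON is serialized with `sort_keys=True` so that key order does not change the hash. The helper name `compute_etag` is hypothetical, and the exact encoding or quoting HyperCache applies to the digest may differ.

```python
import hashlib
import json
from typing import Any


def compute_etag(data: Any) -> str:
    """SHA256 hex digest of a deterministic JSON encoding (illustrative helper)."""
    payload = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


etag_a = compute_etag({"flags": [], "cohorts": {}})
etag_b = compute_etag({"cohorts": {}, "flags": []})  # same content, different key order
etag_c = compute_etag({"flags": [1], "cohorts": {}})  # different content
```

Sorting keys matters because two dicts with the same content but different insertion order would otherwise serialize differently and produce different ETags.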

Service cache (Rust)

The feature-flags Rust evaluation service uses a separate HyperCache instance defined in posthog/models/feature_flag/flags_cache.py. Unlike the local evaluation cache (which serves SDKs with cohort definitions and group type mappings), the service cache provides raw flag data plus pre-computed dependency metadata so the Rust service can evaluate flags in the correct order without recomputing the dependency graph on every request.

Cache instance

python

# posthog/models/feature_flag/flags_cache.py
flags_hypercache = HyperCache(
    namespace="feature_flags",
    value="flags.json",
    load_fn=lambda key: _get_feature_flags_for_service(HyperCache.team_from_key(key)),
    cache_ttl=settings.FLAGS_CACHE_TTL,
    cache_miss_ttl=settings.FLAGS_CACHE_MISS_TTL,
    cache_alias=FLAGS_DEDICATED_CACHE_ALIAS if FLAGS_DEDICATED_CACHE_ALIAS in settings.CACHES else None,
    batch_load_fn=_get_feature_flags_for_teams_batch,
    expiry_sorted_set_key=FLAGS_CACHE_EXPIRY_SORTED_SET,
)

The _get_feature_flags_for_service function fetches all flags for a team (including inactive, but excluding deleted and encrypted remote config flags) and returns the cache payload. The Rust service filters out inactive flags at request time via filtered_out_flag_ids.

Cache payload structure

json

{
  "flags": [
    /* serialized flag dicts */
  ],
  "evaluation_metadata": {
    "dependency_stages": [[1, 5], [3], [7]],
    "flags_with_missing_deps": [9, 12],
    "transitive_deps": { "3": [1, 5], "7": [1, 3, 5] }
  }
}

The evaluation_metadata fields:

Field	Type	Description
`dependency_stages`	`list[list[int]]`	Flag IDs grouped by evaluation order. Stage 0 contains flags with no dependencies; stage N depends only on flags in stages 0…N-1. Flags within the same stage can be evaluated in parallel.
`flags_with_missing_deps`	`list[int]`	Flag IDs whose dependencies are missing, cyclic, or transitively broken. The Rust service treats these as evaluation errors.
`transitive_deps`	`dict[str, list[int]]`	Map of stringified flag ID to the sorted list of all its transitive dependency flag IDs.
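
The relationship between these fields can be checked mechanically: every transitive dependency of a staged flag should sit in a strictly earlier stage. The sketch below is an illustrative consistency check, not part of PostHog; the function name is hypothetical and the sample data is the payload example shown earlier.

```python
def stages_respect_dependencies(metadata: dict) -> bool:
    """Check that each flag's transitive deps all sit in strictly earlier stages."""
    stage_of = {
        flag_id: index
        for index, stage in enumerate(metadata["dependency_stages"])
        for flag_id in stage
    }
    for flag_id, stage in stage_of.items():
        # transitive_deps is keyed by the stringified flag ID
        for dep in metadata["transitive_deps"].get(str(flag_id), []):
            if dep not in stage_of or stage_of[dep] >= stage:
                return False
    return True


example = {
    "dependency_stages": [[1, 5], [3], [7]],
    "flags_with_missing_deps": [9, 12],
    "transitive_deps": {"3": [1, 5], "7": [1, 3, 5]},
}
valid = stages_respect_dependencies(example)

# Flag 3 is staged before its dependencies, so the check fails.
broken = {"dependency_stages": [[3], [1, 5]], "transitive_deps": {"3": [1, 5]}}
invalid = stages_respect_dependencies(broken)
```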

Dependency computation

The _compute_flag_dependencies function in flags_cache.py builds the evaluation metadata using Kahn's algorithm (layered topological sort). The algorithm:

Extracts direct dependencies from each flag's filters.groups[*].properties where type == "flag".
Builds an in-degree map and reverse-dependency edges across all flags.
Peels layers of zero-in-degree nodes, computing transitive dependency closures as it goes. Each layer becomes one evaluation stage.
Detects cycles — any flag still with in-degree > 0 after all layers are peeled is a cycle participant. Cycled flags and any flag that transitively depends on them are added to flags_with_missing_deps.

This matches the Rust fallback path's petgraph-based cycle handling, where all cycle participants are excluded from stages (not just back-edge targets).
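
The layered sort can be sketched as follows. This is a simplified illustration, not the code in flags_cache.py: it takes an already-extracted map of flag ID to direct dependency IDs (a hypothetical input shape), and it assumes that flags without dependencies are omitted from `transitive_deps`, as in the payload example. Dependencies on unknown flags are counted in the in-degree but never resolved, so those flags and their dependents are never peeled and end up in `flags_with_missing_deps` alongside cycle participants.

```python
def compute_flag_dependencies(direct_deps: dict[int, set[int]]) -> dict:
    """Layered Kahn's algorithm over a flag -> direct dependencies map (sketch)."""
    in_degree = {flag: len(deps) for flag, deps in direct_deps.items()}
    dependents: dict[int, list[int]] = {flag: [] for flag in direct_deps}
    for flag, deps in direct_deps.items():
        for dep in deps:
            if dep in dependents:  # unknown deps get no edge and never resolve
                dependents[dep].append(flag)

    transitive: dict[int, set[int]] = {flag: set() for flag in direct_deps}
    stages: list[list[int]] = []
    layer = sorted(flag for flag, degree in in_degree.items() if degree == 0)
    while layer:
        stages.append(layer)
        next_layer = []
        for flag in layer:
            for child in dependents[flag]:
                # The closure of `flag` is complete by the time it is peeled.
                transitive[child] |= {flag} | transitive[flag]
                in_degree[child] -= 1
                if in_degree[child] == 0:
                    next_layer.append(child)
        layer = sorted(next_layer)

    placed = {flag for stage in stages for flag in stage}
    return {
        "dependency_stages": stages,
        # Anything never peeled is missing, cyclic, or depends on such a flag.
        "flags_with_missing_deps": sorted(set(direct_deps) - placed),
        "transitive_deps": {
            str(flag): sorted(transitive[flag]) for flag in sorted(placed) if transitive[flag]
        },
    }


# Flag 9 depends on a flag that does not exist (99); flag 12 depends on 9.
result = compute_flag_dependencies({1: set(), 5: set(), 3: {1, 5}, 7: {3}, 9: {99}, 12: {9}})

# Flags 1 and 2 form a cycle; flag 3 depends on the cycle.
cyclic = compute_flag_dependencies({1: {2}, 2: {1}, 3: {1}, 4: set()})
```

With the first input, the result reproduces the payload example shown earlier: three stages, flags 9 and 12 reported as broken, and closures for flags 3 and 7.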

Local evaluation caching

Feature flag local evaluation uses two separate HyperCache instances in posthog/models/feature_flag/local_evaluation.py:

python

# Full flags with cohort definitions (for smart clients)
flag_definitions_hypercache = HyperCache(
    namespace="feature_flags",
    value="flags_with_cohorts.json",
    load_fn=lambda key: _get_flags_response_for_local_evaluation(HyperCache.team_from_key(key), include_cohorts=True),
    enable_etag=True,
)

# Simplified flags without cohorts (legacy)
flag_definitions_without_cohorts_hypercache = HyperCache(
    namespace="feature_flags",
    value="flags_without_cohorts.json",
    load_fn=lambda key: _get_flags_response_for_local_evaluation(HyperCache.team_from_key(key), include_cohorts=False),
    enable_etag=True,
)

All current SDKs support cohort evaluation locally, so the dual-cache strategy is legacy. The flag_definitions_without_cohorts_hypercache cache exists for older SDK versions that couldn't handle cohort definitions. Once requests for flags without cohorts decline sufficiently, this cache can be removed.

Cache invalidation

Django signals trigger cache updates when models change:

python

@receiver([post_save, post_delete], sender=FeatureFlag)
def feature_flag_changed(sender, instance, **kwargs):
    transaction.on_commit(lambda: update_team_flags_cache.delay(instance.team_id))

Models that invalidate the flags cache:

FeatureFlag - Flag created, updated, or deleted
Cohort - Cohort properties changed
FeatureFlagEvaluationTag - Evaluation tags changed
Tag - Tag renamed

Remote config caching

Remote config uses a different caching strategy than local evaluation. It prioritizes zero-overhead cache hits by using Redis-only caching for serving, with HyperCache used only for background sync operations.

Cache lookup flow

python

@classmethod
def _get_config_via_cache(cls, token: str) -> dict:
    key = f"remote_config/{token}/config"

    data = cache.get(key)  # Direct Redis lookup
    if data:
        return data  # Cache hit - zero database calls

    # Cache miss - query database and warm cache
    remote_config = RemoteConfig.objects.select_related("team").get(team__api_token=token)
    data = remote_config.build_config()
    cache.set(key, data, timeout=CACHE_TIMEOUT)  # 1 day

    return data

No authentication

Remote config endpoints have no authentication (authentication_classes = []). The token in the URL is treated as a public identifier, not a credential. This eliminates all authentication overhead for cache hits.

CDN integration

When remote config changes, the system:

Updates the database record
Warms the HyperCache (Redis + S3)
Updates the Redis serving cache
Purges Cloudflare CDN cache

Team authentication caching

Team objects are cached by API token in posthog/models/team/team_caching.py to avoid database lookups during authentication.

Cache format

python

# Cache key
f"team_token:{api_token}"

# TTL: 5 days
FIVE_DAYS = 60 * 60 * 24 * 5

Usage

python

from posthog.models.team import Team
from posthog.models.team.team_caching import get_team_in_cache, set_team_in_cache

# Check cache first
team = get_team_in_cache(api_token)
if team is None:
    team = Team.objects.get(api_token=api_token)
    set_team_in_cache(api_token, team)

The cached team is serialized using CachingTeamSerializer and reconstructed as a Team instance on retrieval.

Cache TTL settings

Cache	Hit TTL	Miss TTL
HyperCache (default)	30 days	1 day
Remote config serving	1 day	1 day
Team authentication	5 days	N/A (deleted on miss)

Performance characteristics

Scenario	Latency	Database calls
Local evaluation (Redis hit)	~1-2ms	0 (plus authentication overhead)
Remote config (Redis hit)	~1ms	0
S3 fallback	~50-100ms	0
Database fallback	~200-500ms	1+

Prometheus metrics

Metric	Labels	Purpose
`posthog_hypercache_get_from_cache`	`result`, `namespace`, `value`	Cache hit/miss tracking
`posthog_hypercache_sync`	`result`, `namespace`, `value`	Cache sync task outcomes
`posthog_hypercache_sync_duration_seconds`	`result`, `namespace`, `value`	Cache sync timing
`posthog_remote_config_via_cache`	`result`	Remote config cache performance

Result labels: hit_redis, hit_s3, hit_db, missing, batch_miss

Debugging

Check Redis cache

bash

# Local evaluation flags
redis-cli get "cache/teams/{team_id}/feature_flags/flags_with_cohorts.json"

# Remote config
redis-cli get "remote_config/{api_token}/config"

# Team authentication
redis-cli get "team_token:{api_token}"

Check cache source in responses

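The cache source for local evaluation responses is exposed through Prometheus metrics rather than in the response itself. Check the posthog_hypercache_get_from_cache metric, filtered by the namespace and value labels; its result label (hit_redis, hit_s3, hit_db) shows which layer served the data.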

Force cache refresh

python

from posthog.models.feature_flag.local_evaluation import update_flag_caches
from posthog.models.team import Team

team = Team.objects.get(id=123)
update_flag_caches(team)

Common issues

Symptom	Likely cause	Solution
Stale data after flag change	Signal not firing	Check transaction.on_commit is used
Cache misses in production	Redis connection issues	Check Redis connectivity and metrics
S3 fallback errors	Object storage misconfigured	Verify OBJECT_STORAGE_ENABLED setting
ETag mismatches	Non-deterministic JSON serialization	HyperCache uses `sort_keys=True`

Dedicated flags Redis

The feature-flags Rust service can use a separate Redis instance for caching, isolated from the shared Django cache. This prevents flag cache operations from affecting other cache users.

Enabling dedicated Redis

bash

FLAGS_REDIS_URL=redis://flags-redis:6379  # Separate instance for flags

When FLAGS_REDIS_URL is set, the system uses a dual-write pattern:

python

# posthog/caching/flags_redis_cache.py
def write_flags_to_cache(key: str, value: Any, timeout: Optional[int] = None) -> None:
    # Always write to shared cache (Django reads from here)
    cache.set(key, value, timeout)

    # Also write to dedicated cache if configured (Rust service reads from here)
    if has_dedicated_cache:
        dedicated_cache = caches[FLAGS_DEDICATED_CACHE_ALIAS]
        dedicated_cache.set(key, value, timeout)

Why dual-write?

Consumer	Reads from	Purpose
Django	Shared cache	Local evaluation, SDK endpoints
Rust service	Dedicated cache	High-throughput flag evaluation

The dual-write pattern is temporary while the Rust port is being completed. Once the Rust service handles all flag evaluation, Django will stop writing to the shared cache for local evaluation, and only the dedicated cache will be used.

The Rust service only operates when FLAGS_REDIS_URL is configured. All cache update functions check this setting and skip their work when it is not set.

Scheduled tasks

Cache freshness is maintained through scheduled Celery tasks.

Task	Schedule	Purpose
`refresh_expiring_flags_cache_entries`	Hourly at :15	Refresh caches with TTL < 24h before they expire
`cleanup_stale_flags_expiry_tracking_task`	Daily at 3:15 AM	Remove expired team entries from tracking sorted set
`verify_and_fix_flag_definitions_cache_task`	Hourly at :50	Verify flag definitions cache (with cohorts) against database
`verify_and_fix_flag_definitions_without_cohorts_cache_task`	Hourly at :10	Verify flag definitions cache (without cohorts) against database

Refresh task

The hourly refresh job prevents cache misses by proactively refreshing entries before they expire:

python

# posthog/tasks/feature_flags.py
@shared_task
def refresh_expiring_flags_cache_entries():
    successful, failed = refresh_expiring_flags_caches(
        ttl_threshold_hours=settings.FLAGS_CACHE_REFRESH_TTL_THRESHOLD_HOURS,  # Default: 24
        limit=settings.FLAGS_CACHE_REFRESH_LIMIT,  # Default: 5000
    )

The task uses a Redis sorted set (flags_cache_expiry) to efficiently find expiring entries without scanning all keys.
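
The idea behind the sorted set can be sketched without Redis. The example below assumes each team's score is its cache expiry time as a Unix timestamp (an assumption, since the score format is not described here) and uses a plain dict as a stand-in; the function name `find_expiring` is hypothetical.

```python
def find_expiring(
    expiry_by_team: dict[int, float], now: float, threshold_hours: int, limit: int
) -> list[int]:
    """Return up to `limit` team IDs whose cache expires within the threshold, soonest first."""
    cutoff = now + threshold_hours * 3600
    # Similar in spirit to a sorted-set range query by score (expiry time).
    soonest_first = sorted(expiry_by_team.items(), key=lambda item: item[1])
    return [team_id for team_id, expires_at in soonest_first if expires_at <= cutoff][:limit]


now = 1_700_000_000.0
expiry = {
    1: now + 3600,       # expires in 1 hour
    2: now + 48 * 3600,  # expires in 2 days, outside the threshold
    3: now + 23 * 3600,  # expires in 23 hours
    4: now + 600,        # expires in 10 minutes
}
due = find_expiring(expiry, now, threshold_hours=24, limit=2)
all_due = find_expiring(expiry, now, threshold_hours=24, limit=5000)
```

Because entries are ordered by expiry, a bounded range query returns the most urgent teams first, and the limit caps how much work a single run does.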

Verification tasks

Two independent verification tasks compare cached flag definitions against the database and fix discrepancies — one for the with-cohorts variant (runs at :50) and one for the without-cohorts variant (runs at :10). Each task:

Acquires its own distributed lock (so one variant can't block the other)
Samples teams from the cache
Compares cached flags to current database state
Auto-fixes mismatches by refreshing the cache
Reports metrics on match/mismatch/miss rates

Each task has a 25-minute soft / 30-minute hard time limit, since each only handles one variant.

Configuration:

bash

FLAGS_CACHE_VERIFICATION_GRACE_PERIOD_MINUTES=5  # Skip recently updated flags
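
The verification steps above can be sketched as a compare-and-fix loop. This is an illustration under several assumptions, not the task's implementation: dicts stand in for the cache and database, the grace period is applied per team rather than per flag, cache misses are refreshed in the same way as mismatches, and locking and sampling are left out. All names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Any, Callable


def verify_and_fix(
    team_ids: list[int],
    cached: dict[int, Any],
    load_from_db: Callable[[int], Any],
    last_updated: dict[int, datetime],
    now: datetime,
    grace_period: timedelta = timedelta(minutes=5),
) -> dict[str, int]:
    """Compare cached values to the database and refresh any that differ."""
    counts = {"match": 0, "mismatch": 0, "miss": 0, "skipped": 0}
    for team_id in team_ids:
        updated_at = last_updated.get(team_id)
        if updated_at is not None and now - updated_at < grace_period:
            counts["skipped"] += 1  # a cache update may still be in flight
            continue
        expected = load_from_db(team_id)
        if team_id not in cached:
            counts["miss"] += 1
        elif cached[team_id] == expected:
            counts["match"] += 1
            continue
        else:
            counts["mismatch"] += 1
        cached[team_id] = expected  # auto-fix: refresh the cache entry
    return counts


now = datetime(2024, 1, 1, 12, 0)
db = {1: ["a"], 2: ["b"], 3: ["c"], 4: ["d"]}
cached = {1: ["a"], 2: ["stale"], 4: ["old"]}
counts = verify_and_fix(
    team_ids=[1, 2, 3, 4],
    cached=cached,
    load_from_db=db.__getitem__,
    last_updated={4: now - timedelta(minutes=1)},  # team 4 changed one minute ago
    now=now,
)
```

Skipping recently updated entries avoids reporting a mismatch for a change whose cache update has simply not run yet.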

For initial cache build

Scheduled tasks only maintain existing caches. For initial population or schema migrations, use the management command:

bash

python manage.py warm_flags_cache [--invalidate-first] [--team-ids ID1 ID2 ...]

By default, the command warms caches only for teams that have ever had a feature flag (using config.get_teams_queryset()). This includes teams whose flags have all been soft-deleted (so the cache correctly contains an empty flags list). Teams that have never had any feature flags are skipped to avoid unnecessary database queries and Redis writes. When --team-ids is provided, those specific teams are warmed regardless of the config scoping.

Signal handlers

Django signals automatically invalidate the cache when models change.

Models that trigger cache updates

Model	Signal	Action
`FeatureFlag`	post_save	Refresh team's flags cache
`FeatureFlag`	post_delete	Refresh team's flags cache
`Team`	post_save	Warm cache for new team
`Team`	post_delete	Clear team's flags cache
`FeatureFlagEvaluationTag`	post_save	Refresh team's flags cache
`FeatureFlagEvaluationTag`	post_delete	Refresh team's flags cache
`Tag`	post_save	Refresh caches for teams using tag

Transaction safety

All signal handlers use transaction.on_commit() to avoid race conditions:

python

# posthog/models/feature_flag/flags_cache.py
@receiver([post_save, post_delete], sender=FeatureFlag)
def feature_flag_changed_flags_cache(sender, instance, **kwargs):
    transaction.on_commit(lambda: update_team_service_flags_cache.delay(instance.team_id))

This ensures the cache update task runs after the database transaction commits, not before.

Cohort invalidation

The local evaluation cache (for SDKs) also invalidates when cohorts change. See posthog/models/feature_flag/local_evaluation.py for the full signal handler list.

Configuration

bash

# Required
REDIS_URL=redis://localhost:6379

# Dedicated flags Redis (optional, enables dual-write)
FLAGS_REDIS_URL=redis://flags-redis:6379

# Cache TTL settings
FLAGS_CACHE_TTL=604800             # 7 days (default)
FLAGS_CACHE_MISS_TTL=86400         # 1 day (default)

# Scheduled task settings
FLAGS_CACHE_REFRESH_TTL_THRESHOLD_HOURS=24  # Refresh caches expiring within 24h
FLAGS_CACHE_REFRESH_LIMIT=5000              # Max teams per refresh run
FLAGS_CACHE_VERIFICATION_GRACE_PERIOD_MINUTES=5  # Skip recently updated flags

# For S3 fallback
OBJECT_STORAGE_ENABLED=true
AWS_S3_BUCKET_NAME=posthog-cache

# Remote config CDN purge (optional)
REMOTE_CONFIG_CDN_PURGE_ENDPOINT=https://api.cloudflare.com/...
REMOTE_CONFIG_CDN_PURGE_TOKEN=...
REMOTE_CONFIG_CDN_PURGE_DOMAINS=["cdn.example.com"]

Related files

posthog/storage/hypercache.py - Core HyperCache implementation
posthog/models/feature_flag/local_evaluation.py - Local evaluation caching
posthog/models/feature_flag/flags_cache.py - Flags cache, signal handlers, verification, dependency computation
posthog/storage/hypercache_manager.py - Batch management operations (warm, invalidate, stats)
posthog/caching/flags_redis_cache.py - Dual-write pattern for dedicated Redis
posthog/models/remote_config.py - Remote config caching
posthog/models/team/team_caching.py - Team authentication caching
posthog/tasks/feature_flags.py - Cache update and refresh Celery tasks
posthog/tasks/remote_config.py - Remote config sync tasks
posthog/tasks/scheduled.py - Task schedule definitions
posthog/tasks/hypercache_verification.py - Cache verification tasks
