packages/data-schemas/misc/ferretdb/ferretdb-multitenancy-plan.md
Database-per-org data isolation using FerretDB (PostgreSQL-backed) with horizontal sharding across multiple FerretDB+Postgres pairs. MongoDB and AWS DocumentDB are not options.
FerretDB with `postgres-documentdb` does not create separate PostgreSQL schemas per MongoDB database. All data lives in a single `documentdb_data` PG schema:

- a `documents_<id>` + `retry_<id>` table pair per collection
- `documentdb_api_catalog.collections` and `.collection_indexes` hold the catalog
- `mongoose.connection.useDb('org_X')` creates a logical database in DocumentDB's catalog

Implication: no PG-level schema isolation, but logical isolation is enforced by FerretDB's wire protocol layer. Backup/restore must go through FerretDB, not raw `pg_dump`.
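A minimal sketch of the mapping, assuming a trivial `userSchema`; `getOrgDb` is an illustrative helper, not code from the repo:

```ts
import mongoose, { Connection, Schema } from 'mongoose';

// Illustrative schema; the real models live in packages/data-schemas.
const userSchema = new Schema({ email: String });

// One physical FerretDB connection; each org is a logical database.
// useCache: true reuses the Connection object per database name, so
// repeated lookups for the same org are cheap.
function getOrgDb(orgId: string): Connection {
  return mongoose.connection.useDb(`org_${orgId}`, { useCache: true });
}

async function main(): Promise<void> {
  await mongoose.connect('mongodb://127.0.0.1:27017/admin');
  // FerretDB stores org_acme's collections as documents_<id>/retry_<id>
  // table pairs inside the single documentdb_data PG schema.
  const OrgUser = getOrgDb('acme').model('User', userSchema);
  await OrgUser.create({ email: 'a@example.com' });
  await mongoose.disconnect();
}

main().catch(console.error);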
All 29 LibreChat Mongoose models and 98 custom indexes work on FerretDB v2.7.0:
| Index Type | Count | Status |
|---|---|---|
| Sparse + unique | 9 (User OAuth IDs) | Working |
| TTL (expireAfterSeconds) | 8 models | Working |
| partialFilterExpression | 2 (File, Group) | Working |
| Compound unique | 5+ | Working |
| Concurrent creation | All 29 models | No deadlock (single org) |
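For reference, a sketch of how these index types are declared in Mongoose; the field names are representative stand-ins, not LibreChat's exact schema definitions:

```ts
import { Schema } from 'mongoose';

// Representative field names only; not LibreChat's actual schemas.
const exampleSchema = new Schema(
  {
    googleId: String,
    expiresAt: Date,
    file_id: String,
    conversationId: String,
    messageId: String,
  },
  { timestamps: true },
);

// Sparse + unique: uniqueness enforced only where the field exists.
exampleSchema.index({ googleId: 1 }, { unique: true, sparse: true });
// TTL: FerretDB honors expireAfterSeconds for background expiry.
exampleSchema.index({ expiresAt: 1 }, { expireAfterSeconds: 0 });
// Partial filter: only documents matching the predicate are indexed.
exampleSchema.index(
  { file_id: 1 },
  { unique: true, partialFilterExpression: { file_id: { $exists: true } } },
);
// Compound unique.
exampleSchema.index({ conversationId: 1, messageId: 1 }, { unique: true });
```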
Scaling results by org count:

| Orgs | Collections | Catalog Indexes | Data Tables | pg_class rows | Init/org | Query avg | Query p95 |
|---|---|---|---|---|---|---|---|
| 10 | 450 | 1,920 | 900 | 5,975 | 501ms | 1.03ms | 1.44ms |
| 50 | 1,650 | 7,040 | 3,300 | 20,695 | 485ms | 1.00ms | 1.46ms |
| 100 | 3,150 | 13,440 | 6,300 | 39,095 | 483ms | 0.83ms | 1.13ms |
Key finding: Init time and query latency are flat through 100 orgs. No degradation.
Write amplification: the User model (11+ indexes) vs a zero-index collection measured 1.11x — only 11% overhead. DocumentDB's JSONB index management is efficient.
Tenant router proven with:

- `req.getModel('User')` works transparently (see the middleware sketch after the scale-out table below)

Scale-out plan:

| Org Count | Postgres Instances | Notes |
|---|---|---|
| 1–300 | 1 | Default config |
| 300–700 | 1 | Tune autovacuum, PgBouncer, shared_buffers |
| 700–1,000 | 1–2 | Split when monitoring signals pressure |
| 1,000+ | N / ~500 each | One FerretDB+Postgres pair per ~500 orgs |
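A minimal sketch of the middleware pattern referenced above, assuming a two-shard topology, a hash-based org-to-shard assignment, and an `x-org-id` header; all three are illustrative choices, not the PoC's exact implementation:

```ts
import type { NextFunction, Request, Response } from 'express';
import mongoose, { Connection, Schema } from 'mongoose';

// Assumed shard topology: one FerretDB+Postgres pair per ~500 orgs.
const shardUris = ['mongodb://ferretdb-0:27017', 'mongodb://ferretdb-1:27017'];
const shards = shardUris.map((uri) => mongoose.createConnection(uri));

// All 29 model schemas would be registered here.
const schemas: Record<string, Schema> = {};

// Stable org-to-shard assignment; production would read a persisted mapping.
function shardFor(orgId: string): Connection {
  const hash = [...orgId].reduce((acc, ch) => acc + ch.charCodeAt(0), 0);
  return shards[hash % shards.length];
}

// req.getModel('User') resolves a model bound to the caller's org database.
export function tenantRouter(req: Request, _res: Response, next: NextFunction): void {
  const orgId = String(req.headers['x-org-id']);
  const orgDb = shardFor(orgId).useDb(`org_${orgId}`, { useCache: true });
  // Augmenting Request inline; a real codebase would use module augmentation.
  (req as any).getModel = (name: string) =>
    orgDb.models[name] ?? orgDb.model(name, schemas[name]);
  next();
}
```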
Production operations validated:

- Deadlock retry on `createIndexes()`: `retryWithBackoff` utility with exponential backoff + jitter (`packages/data-schemas/src/utils/retryWithBackoff.ts`)
- `initializeOrgCollections` and `migrateOrg` scripts
- `retry_4` hit a deadlock on `createIndexes(User)`, recovered via backoff (1,839ms total)
- `retry_5` also hit the retry path (994ms vs ~170ms clean)
- Backup: `mongodump`/`mongorestore` CLI not available — tested a programmatic driver-level approach: `listCollections()` → `find({}).toArray()` per collection → in-memory `OrgBackup` struct
- Restore: `collection.insertMany(docs)` per collection into a fresh org database — `_id` values, field values, and document counts match exactly
- Migration: `createIndexes()` is idempotent — re-init took 86ms with 12 indexes unchanged
- Adding an `AuditLog` collection with 4 indexes to 5 orgs — 109ms total
- Adding a `{username: 1, createdAt: -1}` index to users across 5 orgs — 22ms total

| File | Purpose |
|---|---|
| `packages/data-schemas/src/methods/multiTenancy.ferretdb.spec.ts` | 5-phase benchmark (useDb mapping, indexes, scaling, write amplification, shared collection) |
| `packages/data-schemas/src/methods/sharding.ferretdb.spec.ts` | Sharding PoC (router, assignment, isolation, middleware pattern) |
| `packages/data-schemas/src/methods/orgOperations.ferretdb.spec.ts` | Production operations (backup/restore, migration, deadlock retry) |
| `packages/data-schemas/src/utils/retryWithBackoff.ts` | Production retry utility |
| File | Purpose |
|---|---|
| `docker-compose.ferretdb.yml` | Single FerretDB + Postgres (dev/test) |
The `retryWithBackoff` utility was exercised under real FerretDB load. Key observations:
| Scenario | Attempts | Total Time | Notes |
|---|---|---|---|
| Clean org init (no contention) | 1 | 165-199ms | Most orgs complete in one shot |
| Deadlock on User indexes | 2 | 994ms | Single retry recovers cleanly |
| Deadlock with compounding retries | 2–3 | 1,839ms | Worst case in 5-org sequential batch |
The User model (11+ indexes including 9 sparse unique) is the most deadlock-prone collection. The retry utility's exponential backoff with jitter (100ms base, 10s cap) handles this gracefully.
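The shipped utility lives at `packages/data-schemas/src/utils/retryWithBackoff.ts`; below is a minimal sketch of the same shape, assuming full jitter and the parameters cited above (100ms base, 10s cap):

```ts
// Sketch only; the shipped implementation is
// packages/data-schemas/src/utils/retryWithBackoff.ts.
export async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  { retries = 5, baseMs = 100, capMs = 10_000 } = {},
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err;
      // Exponential window capped at 10s; full jitter de-correlates
      // concurrent org inits hitting the same catalog tables.
      const windowMs = Math.min(capMs, baseMs * 2 ** attempt);
      await new Promise((resolve) => setTimeout(resolve, Math.random() * windowMs));
    }
  }
}
```

Full jitter (a uniform draw over the whole window) is a deliberate choice here: when several org inits deadlock at once, it spreads their retries apart instead of having them collide again on the same schedule.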
Tested with a realistic org containing 4 populated collections:
| Operation | Time | Details |
|---|---|---|
| Backup (full org) | 24ms | 8 docs across 29 collections (25 empty) |
| Restore (to new org) | 15ms | Including `insertMany()` for each collection |
| Index re-creation | ~500ms | Separate `initializeOrgCollections` call |
Round-trip verified:
- `_id` (ObjectId) preserved exactly
- `createdAt` / `updatedAt` (Date) preserved

For larger orgs (thousands of messages/conversations), backup time scales linearly with document count. The bottleneck is network I/O to FerretDB, not serialization.
| Operation | Time | Per Org |
|---|---|---|
| Idempotent re-init (no changes) | 86ms | 86ms |
| New collection + 4 indexes | 109ms | 22ms/org |
| New compound index on users | 22ms | 4.4ms/org |
| Full migration sweep (29 models) | 439ms | 88ms/org |
Migration is safe to run while the app is serving traffic — `createIndexes` and `createCollection` are non-blocking operations that don't lock existing data.
Per-org init times from the 5-org sequential batch:

- `retry_1`: 193ms (29 models) — clean
- `retry_2`: 199ms (29 models) — clean
- `retry_3`: 165ms (29 models) — clean
- `retry_4`: 1,839ms (29 models) — deadlock on User indexes, recovered
- `retry_5`: 994ms (29 models) — deadlock on User indexes, recovered

Total: 3,390ms for 5 orgs (678ms avg, 199ms median)
Use `initializeOrgCollections()` from `packages/data-schemas/src/utils/retryWithBackoff.ts` for all new org setup. Process orgs in batches of 10 with `Promise.all()` to parallelize across pools while minimizing per-pool contention.
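A minimal driver loop for that recommendation; the import path and the `initializeOrgCollections(orgId)` signature are assumptions based on the file named above:

```ts
// Path and signature are assumptions based on the utility named above.
import { initializeOrgCollections } from './retryWithBackoff';

// Batches of 10: parallel within a batch to spread work across shard pools,
// sequential across batches to bound concurrent createIndexes() per pool.
export async function provisionOrgs(orgIds: string[], batchSize = 10): Promise<void> {
  for (let i = 0; i < orgIds.length; i += batchSize) {
    const batch = orgIds.slice(i, i + batchSize);
    await Promise.all(batch.map((orgId) => initializeOrgCollections(orgId)));
  }
}
```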
Implement driver-level backup (not `mongodump`); a sketch follows the list:

- `listCollections()` to enumerate collections
- `find({}).batchSize(1000)` for large collections
- `insertMany()` in batches of 1,000
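A sketch of that flow with the official MongoDB driver; the `OrgBackup` shape mirrors the PoC's in-memory struct, and index re-creation is deliberately left to `initializeOrgCollections()`:

```ts
import { MongoClient, type Document } from 'mongodb';

// Mirrors the PoC's in-memory struct: one doc array per collection.
interface OrgBackup {
  org: string;
  collections: Record<string, Document[]>;
}

export async function backupOrg(client: MongoClient, org: string): Promise<OrgBackup> {
  const db = client.db(org);
  const backup: OrgBackup = { org, collections: {} };
  for (const { name } of await db.listCollections().toArray()) {
    // batchSize(1000) bounds per-roundtrip memory on large collections.
    backup.collections[name] = await db.collection(name).find({}).batchSize(1000).toArray();
  }
  return backup;
}

export async function restoreOrg(
  client: MongoClient,
  backup: OrgBackup,
  targetOrg: string,
): Promise<void> {
  const db = client.db(targetOrg);
  for (const [name, docs] of Object.entries(backup.collections)) {
    // insertMany in batches of 1,000; empty collections are skipped.
    for (let i = 0; i < docs.length; i += 1000) {
      await db.collection(name).insertMany(docs.slice(i, i + 1000));
    }
  }
  // Indexes are re-created separately, e.g. via initializeOrgCollections().
}
```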
Run `migrateAllOrgs()` as a deployment step (see the sketch below):

- `createCollection()`, `createIndexesWithRetry()` per model per org
- `createIndexes()` is idempotent — safe to re-run
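A sketch of such a sweep, assuming each org's `Connection` already has every model registered; `migrateAllOrgs` here is illustrative and reuses the retry utility:

```ts
import { Connection } from 'mongoose';
import { retryWithBackoff } from './retryWithBackoff'; // path is an assumption

// Assumes each org Connection already has all 29 schemas registered.
export async function migrateAllOrgs(orgDbs: Connection[]): Promise<void> {
  for (const db of orgDbs) {
    for (const name of db.modelNames()) {
      const model = db.model(name);
      try {
        await model.createCollection();
      } catch (err: any) {
        // NamespaceExists is expected on re-runs; anything else is real.
        if (err?.codeName !== 'NamespaceExists') throw err;
      }
      // Idempotent; retried in case of catalog deadlocks under load.
      await retryWithBackoff(() => model.createIndexes());
    }
  }
}
```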
Track per-org provisioning and migration times. If the median provisioning time rises above 500ms/org, investigate PostgreSQL catalog pressure (a monitoring sketch follows the list):

- `pg_stat_user_tables.n_dead_tup` for autovacuum health
- `pg_stat_bgwriter.buffers_backend` for buffer pressure
- `documentdb_api_catalog.collections` count for total table count
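A monitoring sketch using node-postgres against the backing PostgreSQL; the connection string and result shape are assumptions:

```ts
import { Pool } from 'pg';

// Connection string and thresholds are deployment-specific assumptions.
const pool = new Pool({ connectionString: process.env.PG_URL });

export async function catalogPressure() {
  const [deadTup, bgwriter, collections] = await Promise.all([
    pool.query(
      'SELECT relname, n_dead_tup FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 10',
    ),
    pool.query('SELECT buffers_backend FROM pg_stat_bgwriter'),
    pool.query('SELECT count(*) AS total FROM documentdb_api_catalog.collections'),
  ]);
  return {
    topDeadTuples: deadTup.rows, // autovacuum is falling behind if these grow
    buffersBackend: bgwriter.rows[0], // backends writing their own buffers = pressure
    collectionCount: collections.rows[0].total, // total logical collections
  };
}
```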