SaaS Migration Progress

docs/plans/saas/05.progress.md

Current status of the workspace isolation implementation. 185+ files changed, ~4700 lines modified.


Completed

Database Migration

  • server_config table — Global config (single row), stores auth_secret via payload JSONB
  • workspace table — Workspace identity with resource_id PK
  • workspace column added to 13 root tables + oauth2_client
  • Workspace-scoped unique indexes for tables with non-globally-unique PKs
  • oauth2_authorization_code / oauth2_refresh_token — No workspace column (scoped through client_id FK)
  • Auth secret migrated from SystemSetting JSON to server_config.payload
  • SystemSetting proto: auth_secret and workspace_id fields reserved
  • LATEST.sql: plan, task, task_run, plan_check_run, issue changed from serial to bigint NOT NULL
  • LATEST.sql: worksheet.id column dropped (PK is resource_id)
  • LATEST.sql — Audit log composite index (workspace, created_at DESC) replaces separate indexes
  • LATEST.sql — Seed data: only server_config (workspace created by Go signup flow)
  • Migration 0006 — Drops worksheet id column (fixes "null value in column id" error)
  • Migration 0009 — Composite audit log index created

Store Layer

  • All root tables enforce workspace on CRUD (INSERT, SELECT, UPDATE, DELETE)
  • workspace.go: GetWorkspaceID() returns ("", nil) when no workspace exists, ListWorkspacesByEmail(), FindWorkspace(), CreateWorkspace() with proto-based settings/policy
  • runner_queries.go: Cross-workspace methods (GetProjectByResourceID, GetInstanceByResourceID, ListAllInstances, DeleteExpiredExportArchivesAll)
  • account.go: AccountMessage + GetAccountByEmail() for auth layer
  • GetServerConfig() / GetAuthSecret() — reads from server_config table
  • CountActiveEndUsersPerWorkspace — counts via IAM policy membership (explicit members only, not allUsers)
  • CountAllActivePrincipals — counts all principals globally (for allUsers admin check)
  • GetDefaultProjectID(ctx, workspace) — checks new format first, falls back to legacy "default"
  • BatchGetUsersByEmails — workspace-scoped via IAM join
  • Cache fixes: settingCache, groupCache, groupMembersCache, memberGroupsCache all include workspace in keys. UpdatePolicy sets Workspace on cached entry. DeletePolicy invalidates iamPolicyCache. UpdateUserReferenceInPolicies invalidates iamPolicyCache and purges group caches.
  • SearchAuditLogs — workspace filter is optional (empty = cross-workspace for login lockout)
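
The cache fix listed above (workspace included in cache keys) can be sketched as follows. This is a minimal illustration, not the Bytebase implementation: the settingKey type, the settingCache struct, and its methods are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// settingKey is a hypothetical composite cache key. Including the workspace
// keeps two workspaces that use the same setting name from sharing an entry.
type settingKey struct {
	Workspace string
	Name      string
}

// settingCache is an illustrative in-memory cache, not the real store type.
type settingCache struct {
	mu      sync.RWMutex
	entries map[settingKey]string
}

func newSettingCache() *settingCache {
	return &settingCache{entries: make(map[settingKey]string)}
}

// Set stores a value under (workspace, name).
func (c *settingCache) Set(workspace, name, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[settingKey{Workspace: workspace, Name: name}] = value
}

// Get looks up a value under (workspace, name).
func (c *settingCache) Get(workspace, name string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.entries[settingKey{Workspace: workspace, Name: name}]
	return v, ok
}

// Invalidate drops one workspace's entry and leaves other workspaces untouched.
func (c *settingCache) Invalidate(workspace, name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.entries, settingKey{Workspace: workspace, Name: name})
}

func main() {
	c := newSettingCache()
	c.Set("ws-a", "WORKSPACE_PROFILE", "profile-a")
	c.Set("ws-b", "WORKSPACE_PROFILE", "profile-b")
	c.Invalidate("ws-a", "WORKSPACE_PROFILE")

	_, okA := c.Get("ws-a", "WORKSPACE_PROFILE")
	vB, okB := c.Get("ws-b", "WORKSPACE_PROFILE")
	fmt.Println(okA, vB, okB)
}
```

With a key that omitted the workspace, the second Set would overwrite the first, and invalidating one workspace's entry would also evict the other's.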

Auth & Token Flow

  • Workspace in JWT claims (WorkspaceID field)
  • Auth middleware injects workspace into context, verifies membership
  • Login workspace resolution via ListWorkspacesByEmail (IAM policy). Workspace resolved early in Login flow, passed as parameter (not via context).
  • Signup API (POST /v1/auth/signup, allow_without_credential) — Creates principal + workspace. Self-hosted: joins existing workspace or creates if none exists. SaaS: creates new workspace per user.
  • CreateUser refactored — Admin-only (bb.users.create permission), self-hosted only. SaaS blocked.
  • Login lockout: countRecentLoginFailures searches audit logs cross-workspace (per-email rate limit)
  • IDP (SSO) login — resolves workspace from idp.Workspace, passes workspace explicitly (no context injection)
  • ExchangeToken — uses GetWorkloadIdentityByEmail (cross-workspace) and wi.Workspace
  • All allow_without_credential APIs audited — no GetWorkspaceIDFromContext in unauthenticated paths
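
The Signup rule above (self-hosted joins the existing workspace or creates one if none exists; SaaS always creates a new workspace) reduces to a small decision function. The sketch below is an assumption about the shape of that logic: signupAction and resolveSignupAction are hypothetical names, saas stands in for profile.SaaS, and existingWorkspaceID stands in for the result of store.GetWorkspaceID, which returns "" when no workspace exists.

```go
package main

import "fmt"

// signupAction is a hypothetical enum describing what signup does with workspaces.
type signupAction int

const (
	joinExistingWorkspace signupAction = iota
	createWorkspace
)

// resolveSignupAction applies the rule from the Signup bullet:
//   - SaaS: every signup creates a new workspace.
//   - Self-hosted: join the existing workspace, or create one on first boot.
func resolveSignupAction(saas bool, existingWorkspaceID string) signupAction {
	if saas {
		return createWorkspace
	}
	if existingWorkspaceID == "" {
		return createWorkspace
	}
	return joinExistingWorkspace
}

func main() {
	fmt.Println(resolveSignupAction(true, "ws-1") == createWorkspace)
	fmt.Println(resolveSignupAction(false, "") == createWorkspace)
	fmt.Println(resolveSignupAction(false, "ws-1") == joinExistingWorkspace)
}
```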

API Layer

  • All auth_service functions accept workspaceID as parameter (not from context): validateLoginPermissions, checkMFARequired, needResetPassword, isUserWorkspaceAdmin, validateEmailWithDomains, syncUserGroups, generateLoginToken
  • UpdateUser — SaaS: self-updates only. Self-hosted: admin can update others.
  • DeleteUser / UndeleteUser / UpdateEmail — SaaS: blocked.
  • GetUser / BatchGetUsers — workspace-scoped via IAM join
  • SetIamPolicy — blocks allUsers in SaaS. No member count check (enforced at CreateUser/IDP level).
  • ListIdentityProviders — self-hosted falls back to store.GetWorkspaceID for unauthenticated login page
  • GetSubscription — falls back to free subscription if no workspace context
  • ACL simplified — removed rawResource struct, getResourceFromRequest returns []string
  • Default project: CreateProject blocks "default" and "default-*" project IDs as reserved. IsDefaultProject handles both legacy and new formats.
  • userCountGuard — uses explicit workspaceID parameter
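
The default-project rules above can be illustrated with two small predicates. This is a sketch under stated assumptions: the function names, their signatures, and the constants are hypothetical; only the reserved IDs ("default", "default-*") and the two formats (legacy default, new default-{workspaceID}) come from this document.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	legacyDefaultProjectID = "default"  // legacy format
	defaultProjectPrefix   = "default-" // new format: default-{workspaceID}
)

// isReservedProjectID reports whether a user-supplied project ID falls in the
// reserved namespace and should be rejected at project creation.
func isReservedProjectID(id string) bool {
	return id == legacyDefaultProjectID || strings.HasPrefix(id, defaultProjectPrefix)
}

// isDefaultProject reports whether projectID is the default project of the
// given workspace, accepting both the legacy and the new format.
func isDefaultProject(projectID, workspaceID string) bool {
	return projectID == legacyDefaultProjectID ||
		projectID == defaultProjectPrefix+workspaceID
}

func main() {
	fmt.Println(isReservedProjectID("default-abc"))    // reserved
	fmt.Println(isReservedProjectID("defaults"))       // not reserved: no "-" after "default"
	fmt.Println(isDefaultProject("default", "ws-1"))   // legacy format
	fmt.Println(isDefaultProject("default-ws-2", "ws-1")) // another workspace's default
}
```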

Background Runners

  • Runners use cross-workspace store methods
  • Workspace resolved from loaded entities
  • CheckReplicaLimit is skipped in SaaS mode
  • Schema sync: CreateDatabaseDefault uses GetDefaultProjectID (handles legacy)

OAuth2

  • oauth2_client workspace-scoped
  • Workspace-scoped URLs: /api/workspaces/:workspaceID/oauth2/...
  • Legacy routes preserved for self-hosted (!profile.SaaS)
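
As a rough illustration of the workspace-scoped URL shape above, the following sketch extracts the workspace ID from a path of the form /api/workspaces/:workspaceID/oauth2/.... The helper name is hypothetical, and the real server presumably relies on its router's path parameters rather than manual parsing.

```go
package main

import (
	"fmt"
	"strings"
)

// workspaceFromOAuth2Path returns the workspace ID embedded in a
// workspace-scoped OAuth2 path, or false if the path has another shape.
func workspaceFromOAuth2Path(path string) (string, bool) {
	const prefix = "/api/workspaces/"
	if !strings.HasPrefix(path, prefix) {
		return "", false
	}
	// Split into at most: workspaceID, "oauth2", remainder.
	parts := strings.SplitN(strings.TrimPrefix(path, prefix), "/", 3)
	if len(parts) < 2 || parts[0] == "" || parts[1] != "oauth2" {
		return "", false
	}
	return parts[0], true
}

func main() {
	id, ok := workspaceFromOAuth2Path("/api/workspaces/ws-1/oauth2/token")
	fmt.Println(id, ok)
}
```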

Infrastructure

  • Server startup — No workspace dependency. Loads settings if workspace exists, skips on first boot.
  • server_config — Only auth_secret (JSONB payload)
  • Self-hosted features guarded by !profile.SaaS
  • DisallowSignup override removed — SaaS no longer forces it true (signup always allowed)
  • Per-workspace licensing: LoadSubscription, StoreLicense, GetUserLimit, GetActivatedInstanceLimit all workspace-scoped with workspace in cache key

Frontend

  • isDefaultProject — reads from useActuatorV1Store().serverInfo?.defaultProject internally. Single-arg call.
  • Signup API — frontend uses authServiceClientConnect.signup() instead of CreateUser + Login
  • Auth interceptor — skips token refresh for Signup method
  • DEFAULT_PROJECT_PREFIX — unexported, used only internally by getDefaultProjectName

SaaS vs Self-hosted Behavior Differences

See 06.saas-vs-selfhosted.md for a comprehensive catalog of every code path difference.


Workspace Isolation Audit (Code Review)

A full code review was conducted to verify workspace isolation across the entire backend. All previously flagged API gaps were investigated and found to be properly isolated.

Fixes Applied

| # | Fix | Files Changed |
|---|-----|---------------|
| 1 | ACL now validates instance-scoped resources (e.g. instances/{id}, instances/{id}/roles/{role}) by looking up the instance with workspace filter | backend/api/v1/acl.go |
| 2 | Added missing Workspace param to GetDatabase/ListDatabases calls | backend/api/lsp/handler.go, backend/api/lsp/completion.go, backend/component/export/resources.go (4 calls), backend/component/sampleinstance/manager.go |
| 3 | LSP handler refactored: user and workspaceID injected into context at WebSocket connection time, removed from handler struct. Now uses common.GetWorkspaceIDFromContext(ctx) consistently | backend/api/lsp/lsp.go, backend/api/lsp/handler.go, backend/api/lsp/completion.go |
| 4 | Added cross-workspace documentation comments to ACL populateRawResources (resource resolution strategy, per-case comments) | backend/api/v1/acl.go |
| 5 | db_schema.go GetDBSchema now JOINs to instance table and filters by instance.workspace when workspace param is provided | backend/store/db_schema.go |

API Layer Verification

All previously flagged services were verified to properly enforce workspace isolation:

  • instance_role_service.go — uses getInstanceMessage helper which filters by workspace. ACL layer now also validates instance ownership.
  • changelog_service.go: ListChangelogs and GetChangelog look up the database with Workspace: common.GetWorkspaceIDFromContext(ctx). ACL's database case also validates.
  • database_group_service.go — all methods call store.GetProject with workspace filter before accessing database groups.
  • database_service.go: GetDatabase, BatchGetDatabases, UpdateDatabase all use workspace-filtered lookups.
  • revision_service.go — validates database with workspace filter before querying revisions.
  • issue_service.go: getIssueFind() sets workspace from context; getIssueMessage() passes workspace.
  • plan_service.go — all GetPlan/ListPlans calls pass workspace.
  • rollout_service.go — all task/plan lookups pass workspace.
  • access_grant_service.go: GetAccessGrant with workspace filter before any update.
  • release_service.go — project lookup with workspace filter before accessing releases.
  • worksheet_service.go — project lookup with workspace filter before accessing worksheets.

Store Layer Verification

Every public store method was audited:

  • Root tables (project, instance, policy, setting, role, group, etc.): All enforce workspace via direct WHERE workspace = ?.
  • Child tables (changelog, revision, worksheet, db_schema, etc.): Scoped via globally unique parent PKs. API layer validates workspace on the parent before calling these methods.
  • Global tables (principal, sheet_blob): Intentionally cross-workspace by design. Principals are global identities. Sheet blobs are content-addressed (SHA256 dedup).
  • All API-layer callers of optional-workspace store methods properly pass common.GetWorkspaceIDFromContext(ctx). Verified for GetDatabase, ListDatabases, GetIssue, ListIssues, GetPlan, ListPlans, ListTasks, ListTaskRuns.

Remaining

High Priority

  • Workspace switching — Issue new JWT with different workspace_id (login auto-selects the first workspace; users switch after login)
  • Invite users
  • ListWorkspacesByEmail must expand groups
  • User count limit in SaaS mode

Medium Priority

  • Schema sync perf — when database count grows, consider querying per instance or adding InstanceIDs batch filter to FindDatabaseMessage

Low Priority

  • Workspace admin panel — Manage workspace settings, members

Impact on Existing Self-hosted Customers

Breaking Changes

  1. Signup API moved: CreateUser (POST /v1/users) now requires the bb.users.create permission. Self-service signup moved to Signup (POST /v1/auth/signup). API consumers that use CreateUser for registration will break.
  2. CreateUser no longer adds to workspace IAM: the admin creates the principal only and must separately add the user to the workspace IAM policy before the user can log in.

Non-breaking Changes

  1. Default project — New workspaces get default-{workspaceID}. Existing default project preserved. Code handles both formats.
  2. Workspace table + migration — Instant ADD COLUMN with default. No table rewrite.
  3. Login flow — Self-hosted with allUsers in IAM works identically to before.
  4. Cache fixes: the added invalidation (for example, DeletePolicy now invalidates iamPolicyCache) fixes potential stale-cache bugs.
  5. Server startup — Resilient to missing workspace on first boot.

Design Decisions

  1. server_config only stores auth_secret — All other settings live in per-workspace WORKSPACE_PROFILE.

  2. Root table PKs (resource_id) are globally unique — No composite PK migration needed.

  3. Three access patterns for store queries:

    • API layer: common.GetWorkspaceIDFromContext(ctx) from JWT
    • Runners: Cross-workspace methods
    • Login/startup: store.GetWorkspaceID(ctx) or ListWorkspacesByEmail
  4. OAuth2 child tables don't need workspace — scoped through client_id FK.

  5. Default project naming — New workspaces: default-{workspaceID}. Legacy: default. IsDefaultProject and frontend handle both. No data migration needed.

  6. Signup vs CreateUser — Signup is self-service (both modes). CreateUser is admin-only (self-hosted). SaaS admins add members via workspace IAM policy.

  7. disallow_signup — Self-hosted only. Not applicable in SaaS (admin invites via IAM, signup always allowed for new workspace creation).

  8. allow_without_credential APIs — No GetWorkspaceIDFromContext in unauthenticated paths. Workspace resolved from entity data or store.GetWorkspaceID for self-hosted.