docs/RFCS/20220818_version_upgrades_for_virtual_clusters.md
This RFC proposes a mechanism inside CockroachDB to orchestrate the initialization of system tables, followed by the upgrade of the logical cluster version seen by VCs.
It aims to fix an active correctness bug, for which we currently use a kludgy workaround in CC Serverless, and which would otherwise block the generalization of cluster virtualization to all deployments.
The remainder of the RFC requires familiarity with the (previously defined) cluster version and upgrade mechanisms.
These are explained in an accompanying tech note.
In particular, the RFC depends on familiarity with the difference between:
- storage logical version (SLV): the cluster version in the system tenant, which is also the cluster version used / stored in KV stores.
- storage binary version (SBV): the executable version(s) used in KV nodes.
- VC logical version (TLV): the cluster version in each VC (there may be several, and they can also be separate from the cluster version in the system tenant).
- VC binary version (TBV): the executable version used for SQL servers (which can be different from that used for KV nodes, #84700 notwithstanding).
as well as the required invariants between them.
Work is needed in this area because, although we knew these invariants had to be maintained under cluster virtualization, we never actually implemented the code that checks them.
In particular:
- it is trivially possible to violate invariant D, because we do not implement the version interlock in VCs.
  As a result, different SQL pods running at different binary versions can temporarily observe different cluster versions and expose different feature sets (or even use system tables incorrectly); the TLV can even appear to move backward.
  We do not have a tracking issue for this yet; a subset of the problem is covered in https://github.com/cockroachdb/cockroach/issues/66606 .
- we are missing a guardrail against violations of invariants E, F and G: nothing currently blocks TLV upgrades beyond the SLV.
  This is tracked here: https://github.com/cockroachdb/cockroach/issues/80992
As a result of this lack of guardrails, it is possible to bring a cluster into an invalid state, where VCs are at a cluster version beyond the level of support they need from the KV layer.
This can cause serious UX pain and, in extreme cases, outright data corruption.
In this proposal we also consider running multiple VCs inside the same process, possibly shared with the KV layer.
Does this simplify the problem?
Alas, it does not: it only forces the TBV to remain equal to the SBV; it does not constrain the TLV to remain "behind" the SLV.
Additionally, it is possible for different nodes (running different SQL servers for the "app" VC) to run at different TBVs, and the SQL layer for VCs does not properly persist its TLV, so the TLV can still briefly appear to move backward.
So invariants D, E, F and G can still be violated.
So we still need guardrails with a shared-process deployment.
We are going to block a SQL server from starting if its TBV is too low for the current TLV.
We will do this with an assertion that verifies the version immediately after the settings watcher has loaded the initial version value from the storage cluster.
We are going to extend the interlock previously implemented without cluster virtualization to also work in VCs:
- at all times, each SQL server will maintain an up-to-date copy of the SLV of the storage cluster it is connected to. (This will use the rangefeed we already exploit in settingswatcher to import settings into VCs.)
- there will be a "migrate" function for VCs (or, alternatively, an implementation of the Cluster interface for VCs), i.e. an alternative implementation of (*upgrademanager.Manager).Migrate() populated as VersionUpgradeHook in the ExecutorConfig for VCs. (Note: we already have a Go interface, upgrade.Cluster, and an implementation for VCs, but it does not do enough.)
- we will do our best to reuse the existing RPC code (ValidateTargetClusterVersion, BumpClusterVersion) where applicable.
Alongside the work above (but not strictly required by it), we can also work to remove the startup migration code.
https://github.com/cockroachdb/cockroach/issues/73813
This will also speed up the initialization of new VCs, and the start-up time of SQL servers.
None known.
None known.
"Do nothing" is not an option given the corruption risks.
We are extending the cluster version upgrade semantics to be the same in CC Serverless tenants and (in the future) in CC Dedicated with cluster virtualization enabled.
N/A