docs/RFCS/20160218_freeze.md
This RFC outlines the plan for freezing our data formats and network protocols.
We currently make backwards-incompatible changes to data formats without providing any way to upgrade without data loss. This must stop before beta, when users will expect their data to survive an upgrade.
The freeze will proceed in several steps.
Until stage 1 begins, anything goes: changes to on-disk formats do not require any kind of migration path.
In stage 1, we require backwards-compatibility with data written by any previous stage 1 build. It should always be possible to upgrade by stopping all of the old nodes and then bringing up the new version. It's OK at this stage if old and new versions cannot be run concurrently, or if the migration process takes some time (e.g. rewriting all data before the node can start up).
It is acceptable at this stage if the process is somewhat manual (e.g. running some sort of yet-to-be-written backup/restore process and stopping/restarting all nodes at once). However, it is preferable if any migrations are done automatically when a node starts up with an old data directory.
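The automatic variant of stage 1 can be pictured with a small Go sketch: on startup, a node compares the format version recorded in its data directory with the version the binary expects and runs any missing migrations in order. Everything named here (`formatVersion`, `migrateOnStartup`, the step map) is hypothetical and invented for illustration; the RFC does not prescribe this mechanism, and refusing to open data from a newer build is an added assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// formatVersion is a hypothetical on-disk format version recorded in the
// data directory; it is not an existing CockroachDB type.
type formatVersion int

// migrateOnStartup applies, in order, every migration needed to bring a data
// directory from its stored version up to target. steps[v] upgrades data
// from version v to version v+1. It returns the version actually reached.
func migrateOnStartup(dir string, stored, target formatVersion,
	steps map[formatVersion]func(dir string) error) (formatVersion, error) {
	// Assumption: data written by a newer build is rejected rather than
	// downgraded.
	if stored > target {
		return stored, errors.New("data directory was written by a newer build")
	}
	for v := stored; v < target; v++ {
		step, ok := steps[v]
		if !ok {
			return v, fmt.Errorf("no migration from format version %d", v)
		}
		if err := step(dir); err != nil {
			return v, fmt.Errorf("migration from version %d failed: %w", v, err)
		}
	}
	return target, nil
}

func main() {
	steps := map[formatVersion]func(dir string) error{
		1: func(dir string) error { fmt.Println("rewriting", dir, "from 1 to 2"); return nil },
		2: func(dir string) error { fmt.Println("rewriting", dir, "from 2 to 3"); return nil },
	}
	v, err := migrateOnStartup("/tmp/node1", 1, 3, steps)
	fmt.Println("now at version", v, "error:", err)
}
```

Chaining single-version steps keeps each migration small, at the cost of a longer startup when a node is several versions behind, which stage 1 explicitly tolerates.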
Beginning in stage 2, we require that every upgrade can be performed without taking the cluster offline: old and new nodes must be able to coexist. The exact date for this stage is yet to be determined, but it will fall during the beta period and before 1.0.
Any code could potentially be affected by the freeze, but areas that will deserve special scrutiny include:
- `.proto` definitions
- `keys` and `util/encoding`
- `sql/system.go`

It is difficult to come up with a universal migration strategy, since different changes will require different approaches (for example, `.proto` changes could perhaps be made by rewriting data on disk at startup, while changes to key construction may require the change to be coordinated in a distributed fashion). Therefore we leave the specifics of a migration process until the need arises.
To facilitate future changes, we will introduce version numbers at several levels. Initially the behavior around these version numbers will be conservative and cross-version communication will be limited. That makes these version numbers a blunt instrument to be reserved for major changes.
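As a rough illustration of how blunt such a check can be, the Go sketch below applies the replica-placement rule from this section by keeping only nodes whose version matches. The `nodeInfo` type and `placementCandidates` function are hypothetical stand-ins, not the real `NodeDescriptor` or allocator API, and comparing against the node making the decision is an assumption, since the text does not say which version is the reference.

```go
package main

import "fmt"

// nodeInfo is a hypothetical stand-in for the data a NodeDescriptor would
// carry; the field names are assumptions made for this sketch.
type nodeInfo struct {
	ID      int
	Version int
}

// placementCandidates returns the nodes that share the local node's version.
// Nodes on any other version are skipped outright, mirroring the
// conservative rule that replicas are not placed across versions.
func placementCandidates(local nodeInfo, nodes []nodeInfo) []nodeInfo {
	var out []nodeInfo
	for _, n := range nodes {
		if n.Version == local.Version {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	local := nodeInfo{ID: 1, Version: 2}
	cluster := []nodeInfo{
		{ID: 2, Version: 2},
		{ID: 3, Version: 3},
		{ID: 4, Version: 2},
	}
	for _, n := range placementCandidates(local, cluster) {
		fmt.Println("candidate node", n.ID)
	}
}
```

An exact-match test like this needs no compatibility matrix, but it also means a version bump temporarily splits the cluster into groups that cannot rebalance onto each other, which is why the text reserves version numbers for major changes.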
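- Each node has a version number (stored in its `NodeDescriptor` proto). The rebalance/allocation system will not choose to place a replica on a node with a different version number.
- Each table has a version number (stored in its `TableDescriptor`).

After the freeze, some changes will be much harder to make.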
None.