doc/developer/foundationdb.md
Materialize supports using FoundationDB as a metadata storage layer. This document outlines how we integrate with FoundationDB.
FoundationDB offers a key-value interface where keys and values are arbitrary byte strings. Keys follow a directory structure, allowing for hierarchical organization of data. FoundationDB provides ACID transactions, ensuring data integrity and consistency.
FoundationDB requires some setup on the host system. Please refer to the official FoundationDB documentation for installation instructions.
- Set `LIBRARY_PATH=/usr/local/lib` to link against the FoundationDB client library.
- `libfdb_c` unconditionally.
- On other systems, enable the `fdb` feature to link against FoundationDB.

Materialize can use FoundationDB to store consensus and timestamp oracle data, the only two metadata components that Materialize requires.
Consensus offers an interface where keys are mapped to values identified by sequence numbers. Sequence numbers are integers that increase monotonically, ensuring that updates to a key are ordered. We map the consensus structure to the following schema in FoundationDB:
./data/<key>/<sequence_number> -> <value>
The `data` directory contains an entry for each key and sequence number pair, mapping to the corresponding value.
The latest sequence number for a key is determined by a reverse scan of the `data` entries, which ensures locality between the current sequence number lookup and the actual data.
The `keys` directory contains an entry for each key to simplify listing all keys:
./keys/<key> -> []
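The consensus layout can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `Store`, `data_key`, `keys_key`, `head`, `append`, and `list_keys` are hypothetical names, a `BTreeMap` stands in for FoundationDB's ordered keyspace, keys are assumed not to contain `/`, and sequence numbers are encoded big-endian so that byte order matches numeric order. It does not use the FoundationDB client API or reflect the actual Materialize implementation.

```rust
use std::collections::BTreeMap;

/// In-memory stand-in for an ordered key-value store (hypothetical, not FoundationDB).
#[derive(Default)]
struct Store {
    kv: BTreeMap<Vec<u8>, Vec<u8>>,
}

/// Encodes `./data/<key>/<sequence_number>`. The sequence number is big-endian
/// so that lexicographic byte order agrees with numeric order.
fn data_key(key: &str, seqno: u64) -> Vec<u8> {
    assert!(!key.contains('/'), "sketch assumes keys without '/'");
    let mut k = format!("data/{}/", key).into_bytes();
    k.extend_from_slice(&seqno.to_be_bytes());
    k
}

/// Encodes `./keys/<key>`, which maps to an empty value.
fn keys_key(key: &str) -> Vec<u8> {
    format!("keys/{}", key).into_bytes()
}

impl Store {
    /// Finds the latest entry for `key` by scanning its `data` range in reverse.
    /// The sequence number and the value come from the same entry.
    fn head(&self, key: &str) -> Option<(u64, &[u8])> {
        let lo = data_key(key, 0);
        let hi = data_key(key, u64::MAX);
        self.kv.range(lo..=hi).next_back().map(|(k, v)| {
            let mut seqno = [0u8; 8];
            seqno.copy_from_slice(&k[k.len() - 8..]);
            (u64::from_be_bytes(seqno), v.as_slice())
        })
    }

    /// Writes a new entry if `seqno` is larger than the latest one; returns
    /// whether the write happened. Also records the key under `keys/`.
    fn append(&mut self, key: &str, seqno: u64, value: &[u8]) -> bool {
        let latest = self.head(key).map(|(s, _)| s);
        if latest.map_or(false, |l| seqno <= l) {
            return false;
        }
        self.kv.insert(keys_key(key), Vec::new());
        self.kv.insert(data_key(key, seqno), value.to_vec());
        true
    }

    /// Lists all keys by scanning the `keys/` prefix instead of the data entries.
    fn list_keys(&self) -> Vec<String> {
        let prefix = b"keys/";
        self.kv
            .range(prefix.to_vec()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, _)| String::from_utf8_lossy(&k[prefix.len()..]).into_owned())
            .collect()
    }
}
```

In this model, `head` touches only the range belonging to one key, which mirrors the locality argument above, and `list_keys` never has to walk the potentially much larger `data` range.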
The timestamp oracle tracks the read and write timestamps for timelines. We map the timestamp oracle structure to the following schema in FoundationDB:
./<timeline>/read_ts -> <timestamp>
The `read_ts` entry maps to the latest read timestamp.
./<timeline>/write_ts -> <timestamp>
The `write_ts` entry maps to the latest write timestamp.
Set `EXTERNAL_METADATA_STORE` to `foundationdb` to force all compatible tests to use FoundationDB as the metadata store.
FoundationDB has a peculiar way to handle its lifecycle. According to the documentation, one needs to initialize the network once per process. Before shutdown, one needs to stop the network. A stopped network cannot be restarted.
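The lifecycle constraints can be modeled as a one-way state machine. The following is a minimal sketch, not the API of any FoundationDB binding: `NetworkLifecycle`, `LifecycleError`, `ensure_started`, and `stop` are hypothetical names, and the actual calls into the client library are left out and only indicated in comments.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

const NOT_STARTED: u8 = 0;
const RUNNING: u8 = 1;
const STOPPED: u8 = 2;

/// Hypothetical process-wide tracker for the network state.
struct NetworkLifecycle {
    state: AtomicU8,
}

#[derive(Debug, PartialEq)]
enum LifecycleError {
    /// The network was stopped earlier and cannot be restarted.
    AlreadyStopped,
    /// `stop` was called while the network was not running.
    NotRunning,
}

impl NetworkLifecycle {
    const fn new() -> Self {
        Self { state: AtomicU8::new(NOT_STARTED) }
    }

    /// Starts the network at most once per process. Repeated calls while the
    /// network is running are no-ops; calls after a stop fail.
    fn ensure_started(&self) -> Result<(), LifecycleError> {
        match self.state.compare_exchange(
            NOT_STARTED,
            RUNNING,
            Ordering::SeqCst,
            Ordering::SeqCst,
        ) {
            // Only this caller won the transition; the real one-time
            // initialization would happen here.
            Ok(_) => Ok(()),
            Err(RUNNING) => Ok(()),
            Err(_) => Err(LifecycleError::AlreadyStopped),
        }
    }

    /// Stops a running network. The transition is one-way: there is no path
    /// from STOPPED back to RUNNING.
    fn stop(&self) -> Result<(), LifecycleError> {
        match self.state.compare_exchange(
            RUNNING,
            STOPPED,
            Ordering::SeqCst,
            Ordering::SeqCst,
        ) {
            Ok(_) => Ok(()),
            Err(_) => Err(LifecycleError::NotRunning),
        }
    }
}
```

A `static` instance of such a tracker would make the "once per process" rule explicit: any code path that stops the network turns every later `ensure_started` call in the same process into an error.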
Materialize itself (`environmentd`, `clusterd`) doesn't have a clean exit path and always terminates without running exit handlers, so we don't need to worry about stopping the network cleanly. Tests, however, do have a clean exit path, so we need to ensure that we either stop the network or terminate without stopping the network.
Rust tests do not have a notion of global setup and teardown, which makes it difficult to manage the FoundationDB network lifecycle.
To work around this, we use the ctor crate to terminate the process without dropping the network and without running destructors.
Specifically, this means:
- Depend on `mz-foundationdb` with the `shutdown` feature.
This ensures that we register an exit handler that terminates immediately.