src/cls/sem_set/DESIGN.md
The Cause: The 'window' optimization creates a situation where some changes exist only in RAM until a timer expires. If an RGW crashes after a write but before the timer fires, the datalog entry might never be made.
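The crash window can be made concrete with a small sketch. Everything here (`DataLogSketch`, the `std::vector` standing in for the durable FIFO) is an invented illustration under the description above, not the actual RGW types:

```cpp
#include <set>
#include <string>
#include <vector>

// Illustrative sketch of the 'window' optimization; DataLogSketch and its
// members are invented stand-ins, not the real RGWDataChangesLog.
struct DataLogSketch {
  std::set<std::string> cur_cycle;   // changes that exist only in RAM
  std::vector<std::string> fifo;     // stand-in for the durable FIFO log

  // add_entry: record the bucketshard in RAM only; the durable write is
  // deferred until the window timer fires.
  void add_entry(const std::string& bs) { cur_cycle.insert(bs); }

  // renew_entries: timer-driven flush at the close of the window. A crash
  // between add_entry and this flush loses the entry.
  void renew_entries() {
    for (const auto& bs : cur_cycle) fifo.push_back(bs);
    cur_cycle.clear();
  }
};
```

Anything still sitting in `cur_cycle` when the process dies is lost; that gap is what the rest of this document addresses.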
- `RGWDataChangesLog`: The class implementing the datalog functionality.
- `cur_cycle`: The current working set of entries that will be written to FIFO at the close of the window.
- `add_entry`: The function in the datalog API to write an entry.
- `renew_entries`: Function run periodically to write an entry to FIFO for every bucketshard in `cur_cycle`.
- `recover`: Proposed function to complete initiated but incomplete datalog writes.

The Proposal:

- Make `add_entry` a transaction.
- Mirror `cur_cycle` on the OSD in the semaphore object.
- Run the `recover` function at startup.

When `add_entry` is called with bs1: if `cur_cycle` does not contain bs1, increment bs1's count in the semaphore object.

In `renew_entries`, after the FIFO entry for a given shard has been written, decrement the count associated with that bucket shard on the semaphore object.

`recover`:
- Read the current semaphore counts from the semaphore object (e.g. into an `unordered_map`).
- Notify the other running RGWs, receiving each one's `cur_cycle` as the response.
- For every bucketshard in a returned `cur_cycle`, decrement the corresponding in-memory count once per appearance (if bs1 appears in three RGWs' `cur_cycle`, bs1 will be decremented thrice).
- If the notify operation errors, don't decrement anything.

Run `compress` on a regular basis (Daily? Hourly?), to keep seldom-used or deleted bucket shards from slowing down listing.

Coordination is needed between `add_entry` and `renew_entries` within a single process and between RGWs on different hosts, and between `recover` and other RGWs, ensuring that we won't delete someone else's transaction.

`increment`/`decrement` take a vector of bucketshards so they're ready to support batching in `add_entry`.

Open Questions:

- Do we want to shard the semaphore object?
- What happens when we delete buckets? Would we need to clean up the associated bucketshards and delete them?
- Is a separate `compress` function really necessary? It might be worth doing test runs between a version that just deletes inline and one that uses a separate `compress` step.
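Taken together, the increment/decrement bookkeeping and the `recover` subtraction might look roughly like the following. This is a hedged sketch: `SemSetSketch` and `recover_candidates` are invented names, the map lives in-process rather than in an OSD-side cls object, and inline delete-at-zero is just one of the options weighed in the open questions above:

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for the OSD-side semaphore object: one counter per bucketshard.
struct SemSetSketch {
  std::unordered_map<std::string, uint64_t> counts;

  // increment/decrement take a vector of bucketshards, mirroring the
  // batching-ready interface described above.
  void increment(const std::vector<std::string>& shards) {
    for (const auto& bs : shards) ++counts[bs];
  }
  void decrement(const std::vector<std::string>& shards) {
    for (const auto& bs : shards) {
      auto it = counts.find(bs);
      // Delete-at-zero inline; a separate compress step is the alternative.
      if (it != counts.end() && --it->second == 0) counts.erase(it);
    }
  }
};

// recover subtraction: given the on-OSD counts and each live RGW's
// cur_cycle, compute which bucketshards have initiated but incomplete
// datalog writes (hypothetical helper, invented for illustration).
std::vector<std::string>
recover_candidates(std::unordered_map<std::string, uint64_t> counts,
                   const std::vector<std::vector<std::string>>& live_cycles) {
  // Each appearance of a shard in some RGW's cur_cycle accounts for one
  // outstanding increment, so decrement once per appearance.
  for (const auto& cycle : live_cycles)
    for (const auto& bs : cycle) {
      auto it = counts.find(bs);
      if (it != counts.end() && --it->second == 0) counts.erase(it);
    }
  std::vector<std::string> out;
  for (const auto& [bs, n] : counts)
    if (n > 0) out.push_back(bs);
  return out;
}
```

In this model, any bucketshard whose count stays positive after subtracting every live RGW's `cur_cycle` represents an initiated but incomplete write, which is what `recover` would complete.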