Back to Zulip

Database concurrency

docs/subsystems/database-concurrency.md

12.05.6 KB
Original Source

Database concurrency

select_for_update()

When a transaction does a read-then-write sequence where correctness depends on the state it read, it needs a row lock to serialize that operation with other writers. Without a lock, another transaction can commit between the read and write, making the transaction have stale information. Two requests/workers can then apply conflicting updates.

Django exposes a way to control the appropriate row locks via QuerySet.select_for_update(). This produces either SELECT ... FOR NO KEY UPDATE or SELECT ... FOR UPDATE queries, depending on the no_key boolean argument passed to select_for_update(). The key decision at each call site is choosing the appropriate lock strength.

What select_for_update() does

select_for_update() issues a SELECT ... FOR ... query that locks the selected rows until the surrounding transaction commits or rolls back. A transaction trying to take a conflicting lock on those rows will block. The behavior can be modified to either fail immediately if the lock can't be taken or skip the rows we cannot lock, via the nowait=True and skip_locked=True arguments, respectively.

This is the right tool for code paths where correctness depends on reading a stable value and then writing based on that value in the same transaction. This protection only works if other code paths that mutate the same data also use appropriate row locks.

Additionally, select_for_update() should be used to ensure a consistent order of lock acquisition to prevent deadlocks. In particular, for these two example classes of situations:

  • If two concurrent transactions attempt to update overlapping sets of rows, inconsistent ordering can lead to a deadlock. For example, if the first transaction locks rows in the order id=1, id=2 and the second transaction locks them in the order id=2, id=1, they deadlock with the first transaction waiting for the lock on id=2 to be released, while the second transaction waits for the lock on id=1. By using select_for_update() with a consistent order_by(), we can ensure both transactions acquire locks in the same order, preventing such deadlocks.
  • If two concurrent transactions update rows in two tables in an inconsistent order, they can deadlock in a similar way. For example, if the first transaction updates a Message row, followed by some related UserMessage rows, while the second transaction updates the UserMessage rows, followed by the Message row, they can deadlock. The first transaction can end up stuck waiting for the lock on UserMessage rows, while the second transaction waits for the Message row lock. To prevent this, both transactions should use select_for_update() on the intended Message at the beginning - ensuring they execute serially and preventing the deadlock risk.

Because the lock lives for the duration of the transaction, we should keep the transaction.atomic() block small once the lock is acquired.

As a project policy, Zulip requires an explicit no_key= kwarg on every select_for_update() call, enforced by semgrep.

Lock strengths (no_key)

Django's select_for_update(no_key=...) maps to PostgreSQL row-level locks:

  • no_key=True takes a FOR NO KEY UPDATE lock.
  • no_key=False takes a FOR UPDATE lock.

FOR UPDATE is stronger. In particular, it conflicts with the FOR KEY SHARE lock, which PostgreSQL acquires for foreign-key checks when inserting/updating referencing rows. Therefore, FOR UPDATE on a parent row will block inserts in other tables that reference that row.

FOR NO KEY UPDATE still protects against concurrent writes to the locked row, but does not block FOR KEY SHARE. This avoids unnecessary cross-table contention and is usually the right default, unless the transaction has the possibility of deleting some of the selected rows.

See PostgreSQL's row-level lock documentation for further detail.

Choosing no_key=True vs no_key=False

Use no_key=True when:

  • You are updating fields other than FK-referenced key columns (in Zulip today, this is effectively just the row's primary key column).
  • You are serializing access to prevent lost updates/races.
  • The row itself will not be deleted and those key columns are not changing.

Use no_key=False when:

  • You will delete the locked row (or call helper code that may delete it).
  • You will change key columns that are referenced by foreign keys (as noted above, in Zulip today, this is just the row's primary key column).
  • You intentionally need to block FK-reference creation to that row while the transaction is in progress.

If using no_key=False, add a short comment explaining why the stronger lock is required.

Example patterns

Typical update path (no_key=True):

python
with transaction.atomic():
    row = Model.objects.select_for_update(no_key=True).get(id=row_id)
    row.some_number = row.some_number + 1
    row.save(update_fields=["some_number"])

Delete path (no_key=False):

python
with transaction.atomic():
    row = Model.objects.select_for_update(no_key=False).get(id=row_id)
    maybe_delete_row(row)

When locking rows from a queryset with joins, use of=("self",) unless you intentionally need to lock related tables too.

python
rows = Message.objects.select_related(*Message.DEFAULT_SELECT_RELATED) \
    .select_for_update(of=("self",), no_key=True)