docs/subsystems/database-concurrency.md
When a transaction does a read-then-write sequence where correctness depends on the state it read, it needs a row lock to serialize that operation with other writers. Without a lock, another transaction can commit between the read and write, making the transaction have stale information. Two requests/workers can then apply conflicting updates.
Django exposes a way to control the appropriate row locks via
QuerySet.select_for_update(). This produces
either SELECT ... FOR NO KEY UPDATE or SELECT ... FOR UPDATE queries,
depending on the no_key boolean argument passed to select_for_update(). The
key decision at each call site is choosing the appropriate lock strength.
select_for_update() issues a SELECT ... FOR ... query that locks the
selected rows until the surrounding transaction commits or rolls back. A
transaction trying to take a conflicting lock on those rows will block.
The behavior can be modified to either fail immediately if the lock can't be taken
or skip the rows we cannot lock, via the nowait=True and skip_locked=True arguments,
respectively.
This is the right tool for code paths where correctness depends on reading a stable value and then writing based on that value in the same transaction. This protection only works if other code paths that mutate the same data also use appropriate row locks.
Additionally, select_for_update() should be used to ensure a consistent
order of lock acquisition to prevent deadlocks. In particular, for these two
example classes of situations:
id=1, id=2 and the second transaction
locks them in the order id=2, id=1, they deadlock with the first
transaction waiting for the lock on id=2 to be released, while the second
transaction waits for the lock on id=1. By using select_for_update() with
a consistent order_by(), we can ensure both transactions acquire locks in
the same order, preventing such deadlocks.Message row, followed by some related UserMessage
rows, while the second transaction updates the UserMessage rows, followed
by the Message row, they can deadlock. The first transaction can end up
stuck waiting for the lock on UserMessage rows, while the second
transaction waits for the Message row lock. To prevent this, both
transactions should use select_for_update() on the intended Message at
the beginning - ensuring they execute serially and preventing the deadlock
risk.Because the lock lives for the duration of the transaction, we should keep the
transaction.atomic() block small once the lock is acquired.
As a project policy, Zulip requires an explicit no_key= kwarg on every
select_for_update() call, enforced by semgrep.
no_key)Django's select_for_update(no_key=...) maps to PostgreSQL row-level locks:
no_key=True takes a FOR NO KEY UPDATE lock.no_key=False takes a FOR UPDATE lock.FOR UPDATE is stronger. In particular, it conflicts with the FOR KEY SHARE lock,
which PostgreSQL acquires for foreign-key checks when inserting/updating
referencing rows. Therefore, FOR UPDATE on a parent row will block inserts in other
tables that reference that row.
FOR NO KEY UPDATE still protects against concurrent writes to the locked row,
but does not block FOR KEY SHARE. This avoids unnecessary cross-table
contention and is usually the right default, unless the transaction has the possibility
of deleting some of the selected rows.
See PostgreSQL's row-level lock documentation for further detail.
no_key=True vs no_key=FalseUse no_key=True when:
Use no_key=False when:
If using no_key=False, add a short comment explaining why the stronger lock is
required.
Typical update path (no_key=True):
with transaction.atomic():
row = Model.objects.select_for_update(no_key=True).get(id=row_id)
row.some_number = row.some_number + 1
row.save(update_fields=["some_number"])
Delete path (no_key=False):
with transaction.atomic():
row = Model.objects.select_for_update(no_key=False).get(id=row_id)
maybe_delete_row(row)
When locking rows from a queryset with joins, use of=("self",) unless you
intentionally need to lock related tables too.
rows = Message.objects.select_related(*Message.DEFAULT_SELECT_RELATED) \
.select_for_update(of=("self",), no_key=True)