docs/content/stable/architecture/transactions/read-committed.md
Read Committed is one of the three isolation levels in PostgreSQL, and also its default. A unique property of this isolation level is that, for transactions running with this isolation, clients do not need to retry or handle serialization errors (40001) in application logic. The other two isolation levels (Serializable and Repeatable Read) require applications to have retry logic for serialization errors. Also, each statement in a read committed transactions works on a new latest snapshot of the database, implying that any data committed before the statement was issued, is visible to the statement.
A read committed transaction in PostgreSQL doesn't raise serialization errors because it internally retries conflicting rows in the statement's execution as of the latest versions of those rows, as soon as conflicting concurrent transactions have finished. This mechanism allows single statements to work on an inconsistent snapshot (in other words, non-conflicting rows are read as of the statement's snapshot, but conflicting rows are re-attempted on the latest version of the row after the conflicting transactions are complete).
YugabyteDB's Read Committed isolation provides slightly stronger guarantees than PostgreSQL's read committed, while providing the same semantics and benefits, that is, a user doesn't have to retry serialization errors in the application logic (modulo limitation 2 around ysql_output_buffer_size which is not of relevance for most OLTP workloads). In YugabyteDB, a read committed transaction retries the whole statement instead of retrying only the conflicting rows. This leads to a stronger guarantee where each statement in a YugabyteDB read committed transaction always uses a consistent snapshot of the database, while in PostgreSQL an inconsistent snapshot can be used for statements when conflicts are present. For a detailed example, see Stronger guarantees in YugabyteDB's read committed isolation.
Note that retries for the statement in YugabyteDB's Read Committed isolation are limited to the per-session YSQL configuration parameter yb_max_query_layer_retries. To set it at the cluster level, use the ysql_pg_conf_csv TServer flag. If a serialization error isn't resolved within yb_max_query_layer_retries, the error will be returned to the client.
{{< tip title="Enable Read Committed" >}}
Read Committed is supported only if the YB-TServer flag yb_enable_read_committed_isolation is set to true.
For new universes running v2025.2 or later, Read Committed is enabled by default when you deploy using yugabyted, YugabyteDB Anywhere, or YugabyteDB Aeon.
In versions of YugabyteDB prior to v2025.2, or for manually deployed universes, yb_enable_read_committed_isolation is false (the default), and the Read Committed isolation level of YugabyteDB's transactional layer falls back to the stricter Snapshot isolation. This makes the default isolation level for the YSQL API effectively Snapshot because Read Committed, which is the YSQL API and PostgreSQL syntactic default, maps to Snapshot isolation.
Refer to Usage to start a Read Committed transaction after enabling the flag. {{< /tip >}}
The following two key semantics set apart Read Committed isolation from Repeatable Read in PostgreSQL (refer Read Committed level):
To handle serialization errors in the database without surfacing them to the client, PostgreSQL takes a number of steps based on the statement type.
If the subject row is being updated or deleted by other concurrent transactions in a conflicting way, wait for the conflicting transactions to commit or rollback, and then perform validation steps.
If the subject row has been updated or deleted by other concurrent transactions in a conflicting way, perform validation steps.
If the subject row has been locked by other concurrent transactions in a conflicting way, wait for them to commit or rollback, and then perform validation steps.
Note that two transactions are concurrent if their read time to commit time ranges overlap. If a transaction has not yet committed, the closing range is the current time. Also, for read committed isolation, there is a read time for each statement, and not one for the whole transaction.
The validation steps are as follows:
WHERE clause evaluates to true on the updated version of the row.ON CONFLICT DO UPDATE: if a conflict occurs, wait for the conflicting transactions to commit or rollback.
DO UPDATE part on the latest version of the row.ON CONFLICT DO NOTHING: if a conflict occurs, do not do anything.Note that the implementation in PostgreSQL (discussed above) can theoretically lead to two different visible semantics:
CREATE TABLE test (k int primary key, v int);
INSERT INTO test VALUES (2, 5);
begin transaction isolation level read committed;
begin transaction isolation level read committed;
insert into test values (5, 5);
INSERT 0 1
update test set v=10 where k=2;
UPDATE 1
update test set v=100 where v>=5;
(waits)
COMMIT;
UPDATE 1
select * from test;
k | v
---+-----
5 | 5
2 | 100
(2 rows)
As seen above, the UPDATE from transaction 2 first picks the latest snapshot of the database which only has the row (2, 5). The row satisfies the UPDATE statement's WHERE clause and hence the transaction 2 tries to update the value of v from 5 to 100. However, due to an existing conflicting write from transaction 1, it waits for transaction 1 to end. After transaction 1 commits, it re-reads the latest version of only the conflicting row, and re-evaluates the WHERE clause. The clause is still satisfied by the new row (2, 10) and so the value is updated to 100. Note that the newly inserted row (5, 5) isn't updated even though it satisfies the WHERE clause of transaction 2's UPDATE, because it was not part of the snapshot originally picked by transaction 2's UPDATE statement.
So, to avoid serialization errors, PostgreSQL only retries the conflicting rows based on their latest versions, thereby allowing a single statement to run on an inconsistent snapshot. In other words, one snapshot is picked for the statement to read all data and process all non-conflicting rows, and a latest version is used for the conflicting rows.
The other degenerate scenario that can occur differs in the output of the UPDATE in transaction 2:
update test set v=10 where k=2;
UPDATE 1
update test set v=100 where v>=5;
(some processing before snapshot is picked, but feels like postgreSQL is waiting due to a conflict)
COMMIT;
UPDATE 2
select * from test;
k | v
---+-----
5 | 100
2 | 100
(2 rows)
The preceding outcome can occur via the following unlikely circumstance: until Client 1 commits, PostgreSQL on Client 2 for some reason is busy or slow, and hasn't yet picked a snapshot for execution. Only after Client 1 commits, transaction on Client 2 picks a snapshot based off the current time for the statement. This leads to both rows being read as part of the snapshot and updated without any observable conflicts.
Both the common case and unlikely outcomes are valid and satisfy the semantics of Read Committed isolation level. And theoretically, the user cannot figure out which one will be seen because the user cannot differentiate between a pause due to waiting for a conflicting transaction, or a pause due to the database just being busy or slow. Moreover, the unlikely case provides a stronger and more intuitive guarantee that the whole statement runs off a single snapshot.
These two possibilities show that the client cannot have application logic that relies on the expectation that the common case occurs always. YugabyteDB implements Read Committed isolation by undoing and retrying a statement whenever serialization errors occur. This provides a stronger guarantee that each statement always works off just a single snapshot, and no inconsistency is allowed even in case of a some conflicting rows. This leads to YugabyteDB always returning output similar to the second outcome in the preceding example which is also simpler to reason.
This might change in future as per #11573, if it gains interest.
Read Restart errors stem from clock skew which is inherent in distributed databases due to the distribution of data across more than one physical node. PostgreSQL doesn't require defining semantics around read restart errors in read committed isolation because it is a single-node database without clock skew.
In general, YugabyteDB has optimizations to resolve this error internally with best-effort before forwarding it to the external client. However, for Read Committed isolation, YugabyteDB gives a stronger guarantee: no read restart errors will be thrown to the external client except when a statement's output exceeds ysql_output_buffer_size (the size of the output buffer between YSQL and the external client which has a default of 256KB and is configurable). For most OLTP applications, this would hold always as response sizes are usually in this limit.
YugabyteDB chooses to provide this guarantee as most clients that use read committed with PostgreSQL don't have app-level retries for serialization errors. So, it helps to provide the same guarantee for read restart errors which are unique to distributed databases.
Read Committed isolation faces the following limitations if using Fail-on-Conflict instead of the default Wait-on-Conflict concurrency control policy:
yb_max_query_layer_retries YSQL configuration parameter will ensure that statements aren't stuck in deadlocks forever.The retries for serialization errors are done at the statement level. Each retry will use a newer snapshot of the database in anticipation that the conflicts might not occur. This is done because if the read time of the new snapshot is higher than the commit time of the earlier conflicting transaction T2, the conflicts with T2 would essentially be voided as T1's statement and T2 would no longer be "concurrent".
To use Read Committed isolation, ensure the YB-TServer flag yb_enable_read_committed_isolation=true; this maps the syntactic Read Committed isolation in YSQL to the Read Committed implementation in DocDB. (When set to false, syntactic Read Committed in YSQL is mapped to Snapshot isolation in DocDB, meaning it behaves as Repeatable Read.)
Assuming the flag has been set, you can start a Read Committed transaction in the following ways:
START TRANSACTION isolation level read committed [read write | read only];BEGIN [TRANSACTION] isolation level read committed [read write | read only];BEGIN [TRANSACTION]; SET TRANSACTION ISOLATION LEVEL READ COMMITTED;BEGIN [TRANSACTION]; SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED;Start by creating the table to be used in all of the examples, as follows:
CREATE TABLE test (k int primary key, v int);
TRUNCATE TABLE test;
INSERT INTO test VALUES (1, 5);
begin transaction isolation level read committed;
begin transaction isolation level read committed;
select * from test where v=5;
k | v
---+---
1 | 5
(1 row)
insert into test values (2, 5);
INSERT 0 1
select * from test where v=5;
k | v
---+---
1 | 5
(1 row)
insert into test values (3, 5);
INSERT 0 1
select * from test where v=5;
k | v
---+---
1 | 5
3 | 5
(2 rows)
commit;
select * from test where v=5;
k | v
---+---
1 | 5
2 | 5
3 | 5
(3 rows)
commit;
TRUNCATE TABLE test;
INSERT INTO test VALUES (0, 5), (1, 5), (2, 5), (3, 5), (4, 1);
begin transaction isolation level read committed;
begin transaction isolation level read committed;
insert into test values (5, 5);
INSERT 0 1
update test set v=10 where k=4;
UPDATE 1
delete from test where k=3;
DELETE 1
update test set v=10 where k=2;
UPDATE 1
update test set v=1 where k=1;
UPDATE 1
update test set k=10 where k=0;
UPDATE 1
update test set v=100 where v>=5;
(waits)
commit;
UPDATE 4
select * from test;
k | v
----+-----
5 | 100
1 | 1
10 | 100
4 | 100
2 | 100
(5 rows)
commit;
TRUNCATE TABLE test;
INSERT INTO test VALUES (0, 5), (1, 5), (2, 5), (3, 5), (4, 1);
begin transaction isolation level read committed;
begin transaction isolation level read committed;
insert into test values (5, 5);
INSERT 0 1
update test set v=10 where k=4;
UPDATE 1
delete from test where k=3;
DELETE 1
update test set v=10 where k=2;
UPDATE 1
update test set v=1 where k=1;
UPDATE 1
update test set k=10 where k=0;
UPDATE 1
select * from test where v>=5 for update;
(waits)
commit;
k | v
----+----
5 | 5
10 | 5
4 | 10
2 | 10
(4 rows)
commit;
Insert a new key that has just been changed by another transaction, as follows:
TRUNCATE TABLE test;
INSERT INTO test VALUES (1, 1);
begin transaction isolation level read committed;
begin transaction isolation level read committed;
update test set k=2 where k=1;
UPDATE 1
insert into test values (2, 1);
(waits)
commit;
ERROR: duplicate key value violates unique constraint "test_pkey"
rollback;
Insert a new key that has just been changed by another transaction, with ON CONFLICT:
TRUNCATE TABLE test;
INSERT INTO test VALUES (1, 1);
begin transaction isolation level read committed;
begin transaction isolation level read committed;
update test set k=2 where k=1;
UPDATE 1
insert into test values (2, 1) on conflict (k) do update set v=100;
(waits)
commit;
INSERT 0 1
select * from test;
k | v
---+-----
2 | 100
(1 row)
commit;
Insert an old key that has been removed by another transaction, as follows:
TRUNCATE TALE test;
INSERT INTO test VALUES (1, 1);
begin transaction isolation level read committed;
begin transaction isolation level read committed;
update test set k=2 where k=1;
UPDATE 1
insert into test values (1, 1);
(waits)
commit;
INSERT 0 1
select * from test;
k | v
---+---
1 | 1
2 | 1
(2 rows)
commit;
Insert an old key that has been removed by another transaction, with ON CONFLICT:
TRUNCATE TABLE test;
INSERT INTO test VALUES (1, 1);
begin transaction isolation level read committed;
begin transaction isolation level read committed;
update test set k=2 where k=1;
UPDATE 1
insert into test values (1, 1) on conflict (k) do update set v=100;
(waits)
commit;
INSERT 0 1
select * from test;
k | v
---+-----
1 | 1
2 | 1
(2 rows)
commit;
Read Committed interacts with the following feature:
READ ONLY, the read point for each statement in a read committed transaction is selected as Now() - yb_follower_read_staleness_ms.A SET TRANSACTION ISOLATION LEVEL ... statement immediately issued after BEGIN; or BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED; will fail if the YB-TServer flag yb_enable_read_committed_isolation=true, and the following error will be issued:
ERROR: SET TRANSACTION ISOLATION LEVEL must not be called in a subtransaction
For more details, see #12494.
Read restart and serialization errors are not internally handled in read committed isolation in the following circumstances:
ysql_output_buffer_size flag, which has a default value of 256KB (see #11572).Non-transactional side-effects can occur more than once when a conflict or read restart occurs in functions or procedures in read committed isolation. This is because in read committed isolation, the retry logic in the database will undo all work done as part of that statement and re-attempt the whole client-issued statement. (See #12958)
Read Committed isolation has the following additional limitations when enable_wait_queues=false (see Wait-on-Conflict and Interaction with concurrency control):
yb_max_query_layer_retries YSQL configuration parameter will ensure that statements aren't stuck in deadlocks forever.This isolation level allows both phantom and non-repeatable reads (as demonstrated in SELECT behavior without explicit locking).
Adding this new isolation level does not affect the performance of existing isolation levels.
If a statement in the Read Committed isolation level faces a conflict, it is retried. If using Fail-on-Conflict concurrency control mode, the retries are done with exponential backoff until the statement times out or the yb_max_query_layer_retries are exhausted, whichever happens first. The following parameters control the backoff:
retry_max_backoff is the maximum backoff in milliseconds between retries.retry_min_backoff is the minimum backoff in milliseconds between retries.retry_backoff_multiplier is the multiplier used to calculate the next retry backoff.You can set these parameters on a per-session basis, or in the ysql_pg_conf_csv YB-TServer flag on cluster startup.
If the Wait-on-Conflict concurrency control policy is enabled, there won't be a need to manually tune these parameters for performance. Statements will restart only when all conflicting transactions have committed or rolled back, instead of retrying with an exponential backoff.