This post is part 3 in a three part series exploring transaction isolation issues in Postgres. Part 1 introduced some of the common problems that come up with transaction isolation in Postgres. Part 2 introduced row level locks, one way of dealing with some of the problems around transaction isolation. This part introduces the different transaction isolation levels and how they affect the different problems introduced in part 1.
The main, and easiest way to deal with the transaction isolation issues introduced in Part 1 is through changing the transaction isolation level. In Postgres each transaction has it’s own isolation level. The isolation level of a transaction determines the isolation semantics of the transaction. In other words, the transaction isolation level determines what the state of the database statements ran within the transaction see, as well as how concurrent modifications to the same row are resolved.
Postgres provides a total of three different transaction isolation levels: read committed, repeatable read, and serializable. Read committed is the default and provides the semantics we’ve been discussing so far. The isolation semantics described in part 1 around SELECT, UPDATE, and DELETE are the semantics provided under read committed. The repeatable read and serializable isolation levels each provide a different set of semantics for SELECT, UPDATE, and DELETE. The semantics under repeatable read are as follows:
Whenever a repeatable read transaction starts, it takes a snapshot of the current state of the database. All queries within the transaction use this snapshot. This behavior is opposed to the behavior of read committed in which each query has it’s own snapshot. The effects of all transactions that committed before the transaction are visible to all queries within the transaction while all transactions that either failed or committed after the transaction started are invisible to all statements ran in the transaction.
UPDATE and DELETE
Under repeatable read, UPDATE and DELETE use the snapshot created when the transaction started to find all rows to be modified. When an UPDATE or DELETE statement attempts to modify a row currently being modified by another transaction, it waits until the other transaction commits or aborts. If the other transaction aborts, the statement modifies the row and continues. On the other hand, if the other transaction commits, the UPDATE/DELETE statement will abort.
This behavior is completely different from that provided by read committed and somewhat surprising. In read committed, if two statements attempt to modify the same row at the same time, the modifications will be performed one after the other. In repeatable read, one of the statements will be aborted to prevent any isolation issues! This is generally the reason why people prefer read committed over repeatable read. When someone uses repeatable read, their code has to be prepared to retry transactions if any of them fail.
On it’s own, repeatable read prevents all of the transaction isolation problems but one. The only class of issues repeatable read does not prevent are serialization anomalies. That is what the serializable transaction isolation level is for. The serializable isolation level behaves exactly like repeatable read except it specifically detects when two transactions will not serialize correctly and will abort one of them to prevent a serialization anomaly. Like repeatable read, if you use serializable you should have code ready to retry aborted transactions.
To change the isolation level Postgres uses, you can set default_transaction_isolation to the desired level. All of the examples in the rest of this post make use of the repeatable read isolation levels, unless explicitly mentioned otherwise.
Now that we’ve got an understanding of the semantics behind the other isolation levels, let’s take a look at how they affect the examples in part 1.
S0> BEGIN; S0> INSERT INTO ints SELECT 1; S0> COMMIT; S1> BEGIN; S1> SELECT * FROM ints; n --- 1 S2> BEGIN; S2> UPDATE ints SET n = 2; S2> COMMIT; S1> SELECT * FROM ints; n --- 1 S1> COMMIT; S3> BEGIN; S3> SELECT * FROM ints; n --- 2 S3> COMMIT;
With the repeatable read isolation level, S1 avoids the non-repeatable read since, unlike read committed, the second query makes use of a snapshot created at the start of the transaction. This is where repeatable read gets it’s name, it can be used to avoid non-repeatable reads.
Due to the semantics around conflicting updates, repeatable read does prevent lost updates, but it does so in a less than ideal way:
S0> BEGIN; S0> INSERT INTO ints SELECT 1; S0> COMMIT; S1> BEGIN; S1> SELECT * FROM ints; n --- 1 S2> BEGIN; S2> SELECT * FROM ints; n --- 1 S2> UPDATE ints SET n = 2; -- Computed server side. S2> COMMIT; S1> UPDATE ints SET n = 2; -- Computed server side. ERROR: could not serialize access due to concurrent UPDATE S1> ROLLBACK; S3> BEGIN; S3> SELECT * FROM ints; n --- 2 S3> COMMIT;
What happened here is that since two UPDATEs attempted to modify the same row at the same time, Postgres aborted one of them to prevent a lost update. For this reason, if you are just trying to avoid lost updates, you should prefer to use row level locks with SELECT … FOR UPDATE under read committed. That will allow both UPDATEs to be performed, without either UPDATE being lost or aborted.
Repeatable read eliminates phantom reads for the same reason it eliminates non-repeatable reads:
S0> BEGIN; S0> SELECT count(*) FROM ints; count ------- 0 S1> BEGIN; S1> INSERT INTO ints SELECT 1; S1> COMMIT; S0> SELECT count(*) FROM ints; count ------- 0 S0> COMMIT; S2> BEGIN; S2> SELECT COUNT(*) FROM ints; count ------- 1 S2> COMMIT;
Preventing phantom reads is one reason why you would prefer to use the repeatable read isolation level instead of row level locks.
Just like with a lost update, repeatable read will abort one of the transactions in a skipped modification:
S0> BEGIN; S0> INSERT INTO ints SELECT 1; S0> INSERT INTO ints SELECT 2; S0> SELECT * FROM ints; n --- 1 2 S0> COMMIT S1> BEGIN; S1> UPDATE ints SET n = n+1; S2> BEGIN; S2> DELETE FROM ints WHERE n = 2; -- S2 blocks since the DELETE is trying to modify a row -- currently being updated. S1> COMMIT; -- S2 aborts with the error: ERROR: could not serialize access due to concurrent update S2> ROLLBACK; S3> BEGIN; S3> SELECT * FROM ints; n --- 2 3 S3> COMMIT;
S2 is aborted for the same reason S1 is aborted in the lost update example. The DELETE tries to modify a row which was modified after the snapshot was taken. Since the version of the row in the snapshot is out of date, Postgres aborts the transaction to prevent any isolation issues.
Unlike all of the other interactions, repeatable read does not eliminate serialization anomalies:
S0> BEGIN; S0> SELECT count(*) FROM ints; count ------- 0 (1 row) S1> BEGIN; S1> SELECT count(*) FROM ints; count ------- 0 (1 ROW) S1> INSERT INTO ints SELECT 1; S1> COMMIT; S0> INSERT INTO ints SELECT 1; S0> COMMIT;
Fortunately, if you do want to completely prevent serialization anomalies, you can use the serializable isolation level. If we use serializable in the example instead of repeatable read, here is what happens:
S0> BEGIN; S0> SELECT count(*) FROM ints; count ------- 0 S1> BEGIN; S1> SELECT count(*) FROM ints; count ------- 0 S1> INSERT INTO ints SELECT 1; S1> COMMIT; S0> INSERT INTO ints SELECT 1; ERROR: could not serialize access due to read/write dependencies among transactions DETAIL: Reason code: Canceled on identification as a pivot, during write. HINT: The transaction might succeed if retried. S0> ROLLBACK; S3> BEGIN; S3> SELECT * FROM ints; n --- 1 S3> COMMIT;
What happened here is that Postgres detected that the pattern of reads and writes wouldn’t serialize properly, so it aborted one of the transactions.
Like row level locks, the repeatable read and serializable isolation levels are simple, yet at the same time they introduce a lot of complexity. Use of either repeatable read or serializable dramatically increases the chances that any given transaction will fail, making code that interacts with the database much more complicated, and database performance much less predictable.
In general, if you can, you should try to use the read committed isolation level and write your code in such a way that you don’t run into the different isolation issues mentioned in part 1. If you absolutely have to, you can use the tools mentioned in these last two posts to fend off all of the isolation issues.