Postgres Heap Only Tuples

Heap only tuples, otherwise known as HOT, is an optimization Postgres uses to reduce the amount of I/O necessary for updates. Due to MVCC, an update in Postgres consists of finding the row being updated, and inserting a new version of the row back into the database. The main downside to doing this is the need to readd the row to every index. This requires a lot more I/O because the row has to be reinserted into every index over the table. Readding the row to each index is necessary because the physical location of the new version of the row is different from the physical location of the old version.

To reduce the amount of I/O necessary for UPDATE, the Postgres team added HOT to Postgres. The idea behind HOT is relatively simple. When updating a row, if it is possible, Postgres will place the new copy of the row right next to the old copy of the row. By doing this and marking the old copy row so that Postgres knows the new copy of the row is right next it, Postgres can avoid updating all of the indexes. During an index scan in which the new copy of the row satisfies the filter, Postgres will encounter the old copy of the row. Since the old copy has been specially marked, Postgres will know the new copy is right next to it and can use that instead. This way, Postgres can pretend that all of the indexes point to the new copy of the row, without ever needing to modify the indexes.

Currently HOT is only possible when none of the columns being updated are indexed. If any of the columns being updated are indexed, HOT is no longer possible. There are several issues that come up if HOT were attempted in this situation. Specially when an index scan is performed on the index on the column which was updated, and the old copy of the row satisfied the predicate of the scan, but not the new one. In this situation, Postgres will try to use the index to quickly find all rows that satisfy the predicate of the query, and in the case of the tuple updated with HOT, will get back the new copy of the row which does not satisfy the predicate. By adding this restriction, Postgres can guarantee that when it tries to find rows that satisfy a predicate through an index, if the predicate is satisfied by the old copy of the row, it is also satisfied by the new copy and vice/versa.

There is currently in development, an extension to HOT, called WARM, that works even when an index on an updated column exists. The idea behind WARM, is to place the new row next to the old row and reinsert the row in the indexes on the columns that changed. This dramatically complicates the situation mentioned above as now Postgres needs someway to determine whether the row actually satisfies the filter or not.

One thought on “Postgres Heap Only Tuples

Leave a Reply

Your email address will not be published. Required fields are marked *