Since version 9.5, Postgres has supported a useful feature called UPSERT. For a reason I can’t figure out, this feature is referred to as UPSERT, even though there is no UPSERT SQL command. In addition to being a useful feature, UPSERT is fairly interesting from a “behind the scenes” perspective as well.
If you haven’t noticed yet, the word “upsert” is a portmanteau of the words “update” and “insert”. As a feature, UPSERT allows you to insert new data if that data does not already exist, and to specify an action to be performed instead if it does. More specifically, when there is a unique constraint on a column (a constraint specifying that all values of the column are distinct from each other), UPSERT allows you to say “insert this row if it does not violate the unique constraint, otherwise perform this action to resolve the conflict”.
As an example, let’s say we have a counters table where each row represents a counter. The table has two columns, id and value, where id specifies the counter we are referring to, and value is the number of times the counter has been incremented. It would be nice if we could increment a counter without needing to create the counter in advance. This is a job for UPSERT. First let’s create the table:
CREATE TABLE counters (id bigint UNIQUE, value bigint);
It’s important that the id column is marked as unique. Without that, Postgres has no constraint to detect a conflict against, and we would be unable to use UPSERT.
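To make that concrete, here is a small sketch using a hypothetical table, counters_no_unique, which has the same columns but no UNIQUE on id (the table name is made up for this illustration). Postgres should reject the statement outright rather than silently inserting a duplicate:

```sql
-- Hypothetical table for illustration only: same columns, but no UNIQUE on id.
CREATE TABLE counters_no_unique (id bigint, value bigint);

-- Fails: there is no constraint that ON CONFLICT (id) could use to detect a conflict.
INSERT INTO counters_no_unique (id, value) VALUES (0, 1)
ON CONFLICT (id) DO UPDATE SET value = counters_no_unique.value + 1;
-- ERROR:  there is no unique or exclusion constraint matching the ON CONFLICT specification
```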
To write an UPSERT query, you first write a normal INSERT for the case when the constraint is not violated. In this case, when a counter with a given id does not already exist, we want to create a new counter with the given id and the value 1. An INSERT that does this looks like:
INSERT INTO counters (id, value) SELECT <id> AS id, 1 AS value;
Then, to make it an UPSERT, you append ON CONFLICT (<unique column>) DO <action>. The action can either be NOTHING, in which case the conflicting row is simply not inserted and no error is raised, or it can be UPDATE SET <column1> = <expr1>, <column2> = <expr2> … The UPDATE form modifies the existing row, setting the listed columns to the new values. In this case we want to use the UPDATE form to increment the value of the counter. The whole query winds up looking like:
INSERT INTO counters (id, value) SELECT <id> AS id, 1 AS value ON CONFLICT (id) DO UPDATE SET value = counters.value + 1;
When you run the above command with a given id, it will create a new counter with the value 1 if a counter with that id does not already exist. Otherwise it will increment the value of the existing counter. Here are some examples of its use:
> SELECT * FROM counters;
 id | value
----+-------
(0 rows)

> INSERT INTO counters SELECT 0 AS id, 1 AS VALUE ON CONFLICT (id) DO UPDATE SET value = counters.value + 1;
> SELECT * FROM counters;
 id | value
----+-------
  0 |     1
(1 row)

> INSERT INTO counters SELECT 0 AS id, 1 AS VALUE ON CONFLICT (id) DO UPDATE SET value = counters.value + 1;
> SELECT * FROM counters;
 id | value
----+-------
  0 |     2
(1 row)

> INSERT INTO counters SELECT 0 AS id, 1 AS VALUE ON CONFLICT (id) DO UPDATE SET value = counters.value + 1;
> SELECT * FROM counters;
 id | value
----+-------
  0 |     3
(1 row)

> INSERT INTO counters SELECT 1 AS id, 1 AS VALUE ON CONFLICT (id) DO UPDATE SET value = counters.value + 1;
> SELECT * FROM counters;
 id | value
----+-------
  0 |     3
  1 |     1
(2 rows)

> INSERT INTO counters SELECT 1 AS id, 1 AS VALUE ON CONFLICT (id) DO UPDATE SET value = counters.value + 1;
> SELECT * FROM counters;
 id | value
----+-------
  0 |     3
  1 |     2
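For comparison, here is a minimal sketch of the other action, DO NOTHING, against the same counters table. It is the "create it if it is missing, otherwise leave it alone" variant:

```sql
-- Create counter 0 with value 1 only if it does not exist yet.
-- If a row with id 0 is already present, nothing is inserted,
-- the existing value is left untouched, and no error is raised.
INSERT INTO counters (id, value) VALUES (0, 1)
ON CONFLICT (id) DO NOTHING;
```

Run after the session above, this leaves counter 0 at 3, and psql reports INSERT 0 0 because no row was inserted.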
One last bit about UPSERT: you can use the faux table excluded to refer to the new row that was proposed for insertion. This is useful if you want either to overwrite the values of the old row with the values of the new row, or to make the values of the row a combination of the old and new values. As an example, let’s say we want to extend the counter example to increment by an arbitrary amount. That can be done with:
INSERT INTO counters SELECT <id> AS id, <amount> AS VALUE ON CONFLICT (id) DO UPDATE SET value = counters.value + excluded.value;
This even works if you are incrementing multiple counters simultaneously all by different amounts.
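As a sketch of both uses of excluded, assuming the counters table from earlier (the ids and amounts are arbitrary example values), the first statement below increments several counters at once and the second overwrites a value instead of adding to it:

```sql
-- Increment counter 0 by 5 and counter 1 by 2; counter 2 is created with
-- value 7 if it does not exist yet. For each row, excluded.value is the
-- amount that row tried to insert.
INSERT INTO counters (id, value)
VALUES (0, 5), (1, 2), (2, 7)
ON CONFLICT (id) DO UPDATE SET value = counters.value + excluded.value;

-- Overwrite variant: set counter 0 to exactly 10, discarding the old value.
INSERT INTO counters (id, value) VALUES (0, 10)
ON CONFLICT (id) DO UPDATE SET value = excluded.value;
```

One caveat: each id should appear only once per statement. If the same id shows up twice in a single INSERT, Postgres raises an error saying that ON CONFLICT DO UPDATE cannot affect a row a second time.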
What makes UPSERT so interesting to me is that it works even in concurrent situations. UPSERT still works even if other INSERT and UPDATE queries are all running simultaneously! Prior to the UPSERT feature there was a fairly complex method to emulate UPSERT. That method involved using PL/pgSQL to alternate between running INSERT and UPDATE statements until one of them succeeded. The statements need to be run in a loop because another INSERT could create the row just before our INSERT runs, and the row could then be deleted again before our UPDATE runs. The UPSERT feature takes care of all of this for you, while at the same time providing a single command for the common pattern of inserting data if it does not already exist and otherwise modifying the old data!
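For the curious, here is roughly what that older emulation looks like for the counters table. This is only a sketch modeled on the well-known retry-loop pattern. The function name increment_counter and its parameter are made up for this example, and it assumes the counters table defined earlier:

```sql
-- Hypothetical helper: emulates the counter UPSERT without ON CONFLICT.
CREATE FUNCTION increment_counter(counter_id bigint) RETURNS void AS $$
BEGIN
    LOOP
        -- First, try to increment an existing counter.
        UPDATE counters SET value = value + 1 WHERE id = counter_id;
        IF FOUND THEN
            RETURN;
        END IF;

        -- No row was updated, so try to create the counter.
        -- If another session inserts the same id first, we get a
        -- unique_violation and go around the loop to retry the UPDATE.
        BEGIN
            INSERT INTO counters (id, value) VALUES (counter_id, 1);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- do nothing; loop and try the UPDATE again
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- Usage: create or increment counter 0.
SELECT increment_counter(0);
```

Compared with the one-line ON CONFLICT version, the loop is longer and easier to get wrong, which is a big part of why the built-in feature is so welcome.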