Being able to write a Postgres UDF (user-defined function) is a simple skill that goes a long way. SQL UDFs let you give a name to part or all of a SQL query and use that name to refer to that SQL code. It works just like any user-defined function in your favorite programming language.
As a simple example, in the last post we came up with a query for incrementing a counter in a table of counters:
    INSERT INTO counters
    SELECT <id> AS id, <amount> AS value
    ON CONFLICT (id) DO UPDATE
    SET value = counters.value + excluded.value;
Every time we wanted to use this query, we had to copy and paste it. To avoid this, we could define a UDF that runs the query and then increment counters only through the UDF. Most of the time when you define a SQL UDF, you’ll use code like the following:
    CREATE OR REPLACE FUNCTION <function name>(<arguments>)
    RETURNS <return type> AS $$
      <queries to run>
    $$ LANGUAGE SQL;
This will define a UDF with the given name that runs the queries in the body whenever it is called. Inside the queries, you can refer by name to any of the arguments passed to the function. If we convert the query we had for incrementing a counter into a UDF, we wind up with the following UDF definition:
    CREATE OR REPLACE FUNCTION increment_counter(counter_id bigint, amount bigint)
    -- Use void as the return type because this function
    -- returns no value.
    RETURNS void AS $$
      INSERT INTO counters
      SELECT counter_id AS id, amount AS value
      ON CONFLICT (id) DO UPDATE
      SET value = counters.value + excluded.value;
    $$ LANGUAGE SQL;
We can now call the UDF instead of writing out the original query:
    > SELECT * FROM counters;
     id | value
    ----+-------
    (0 rows)

    > SELECT increment_counter(1, 10);
    > SELECT * FROM counters;
     id | value
    ----+-------
      1 |    10
    (1 row)

    > SELECT increment_counter(1, 5);
    > SELECT * FROM counters;
     id | value
    ----+-------
      1 |    15
    (1 row)

    > SELECT increment_counter(2, 5);
    > SELECT * FROM counters;
     id | value
    ----+-------
      1 |    15
      2 |     5
    (2 rows)

    > SELECT increment_counter(3, 20);
    > SELECT * FROM counters;
     id | value
    ----+-------
      1 |    15
      2 |     5
      3 |    20
    (3 rows)
This is much better than what we had before: each increment is a single short function call, and the insert-or-update logic lives in one place.
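The function above returns void, but a SQL UDF can also return a value. As a minimal sketch (the name increment_counter_returning is hypothetical, and the CREATE TABLE is only my assumption about what the counters table from the last post looks like), here is a variant that hands back the counter's new total by ending with a RETURNING clause:

```sql
-- Assumed schema: ON CONFLICT (id) needs a unique or primary key on id.
CREATE TABLE IF NOT EXISTS counters (
    id    bigint PRIMARY KEY,
    value bigint NOT NULL
);

-- Hypothetical variant of increment_counter that returns the updated total.
CREATE OR REPLACE FUNCTION increment_counter_returning(counter_id bigint, amount bigint)
RETURNS bigint AS $$
    INSERT INTO counters (id, value)
    VALUES (counter_id, amount)
    ON CONFLICT (id) DO UPDATE
        SET value = counters.value + excluded.value
    -- The result of the last statement becomes the function's return value.
    RETURNING counters.value;
$$ LANGUAGE SQL;

-- Example calls, using an id (42) that is assumed not to exist yet.
SELECT increment_counter_returning(42, 10);  -- 10
SELECT increment_counter_returning(42, 5);   -- 15
```

Whether you want this or the void version depends on the caller: if nothing reads the new total, the void version keeps the interface simpler.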
One of the more interesting classes of UDFs is the kind that returns a set of rows instead of a single result. To define such a UDF, you specify TABLE (<columns>) as the return type. RETURNS TABLE already implies a set of rows, so no SETOF keyword is needed. For example, if we wanted a UDF that returned the top N counters, we could define one as such:
    CREATE OR REPLACE FUNCTION top_counters(n bigint)
    RETURNS TABLE (id bigint, value bigint) AS $$
      SELECT * FROM counters
      ORDER BY value DESC
      LIMIT n;
    $$ LANGUAGE SQL;
Then we can use it like:
    > SELECT * FROM top_counters(2);
     id | value
    ----+-------
      3 |    20
      1 |    15
    (2 rows)
You can then use the function as part of a larger SQL query. For example, if you wanted to find the sum of the values of the top 10 counters, you could do that with the following straightforward SQL query:
SELECT sum(value) FROM top_counters(10);
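Because a set-returning UDF sits in the FROM clause like a table, you can also filter and sort its output. A small sketch, where the threshold of 10 is an arbitrary value picked for illustration:

```sql
-- Take the top 3 counters, then keep only those above a made-up threshold.
SELECT id, value
FROM top_counters(3)
WHERE value > 10
ORDER BY value DESC;
```

The outer ORDER BY is deliberate: it makes the ordering explicit in the final query rather than relying on the order in which the function happens to produce its rows. With the three counters from the session above, this should return counters 3 and 1.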
To recap, UDFs are a great way to simplify SQL queries. I find them to be especially useful when I am reusing the same subquery in a bunch of different places.