Postgres Hash Joins

This is part two of a three part series examining the Postgres join algorithms.

Of the join algorithms Postgres has available, the hash join is usually the fastest. The main downside is hash joins only work where the join condition is an equality condition. Let’s say we have the tables people (with fields id, name, and age) and pets (with fields id and owner_id), and the query:

SELECT *
FROM people, pets
WHERE people.id = pets.owner_id
  AND people.age > 30;

If Postgres were to use a hash join to execute this query, one way execution would proceed is as follows. First fetch all rows from the people table with an age over 30. Then construct a hash table containing all of those rows keyed by people.id. Then iterate through all of the pets and use the check if the owner of the pet is in the hash table. If the owner is in the hash table, Postgres knows the owner of the pet is over 30 and will add a row to the output of the join for that (person, pet) pair.

A hash join is only O(M+N) in the size of the inputs to the join. Although a hash join is only linear in the size of the inputs, there are some cases where it won’t be as fast as the other join algorithms. If it turns out that M is really small in comparison to N, an index join will likely wind up faster since an index join is O(M*log(N)).

The other major case when a hash join won’t be the preferred join algorithm is when Postgres thinks the hash table needed for the hash join won’t fit in memory1. If the hash join runs out of memory and spills over to disk, it will become much slower. There just isn’t a good way to perform a hash join on disk. The issue is that the time it takes to access disk is so much greater than the time it takes to access memory that it usually winds up being better to use an asymptotically worse algorithm that is designed to be performed on disk. When Postgres thinks there isn’t enough memory to perform the hash join, Postgres will usually use the last join algorithm instead, the merge join.

  1. In other words, when Postgres thinks the size of the hash table will exceed work_mem.

Leave a Reply

Your email address will not be published. Required fields are marked *