Postgres Bitmap Scans

This is the final part of a three part series exploring the different ways Postgres can retrieve rows from a table.

One major restriction with regular index scans is that they can only use a single index. Sometimes a query will have multiple filters on different columns which are indexed by different indexes. For example, if we may have a people table with age and weights fields both with indexes on them, and the following query:

FROM people
WHERE age > 30
  AND weight > 200;

Using an index scan, Postgres can either use an index to fetch all rows for people over 30 and post-filter for people with a weight over 200 or vice/versa. This means Postgres would have to fetch either all rows for people over 30 or all rows for people over 200 pounds from the table, even though the number of rows that satisfy both constraints may be relatively small. What we really want here is a way to locate all rows that satisfy both constraints and then fetch those rows from the table. That is exactly why a bitmap scan is useful.

A bitmap index scan is a way of combining multiple indexes. A bitmap index scan works by using the first index to locate all of the rows that satisfy the first filter, then using the second index to locate all indexes that satisfy the second filter, then intersecting the results to get the locations of all rows in the table that satisfy both filters. From there Postgres will fetch the rows from the physical table in the order the rows are in the physical table.

Fetching the rows from the table in the physical order they are in the table has several advantages. If it turns out a large portion of the table is returned, a bitmap scan becomes very similar to a sequential scan1. Under certain circumstances, this can even lead to a bitmap index scan being better than an orderinary index scan even when only a single index is used.

One downside to fetching rows in order of their physical location in the table is that Postgres loses any information about the sort order. This prevents Postgres from using a bitmap index scan in order to handle the ORDER BY part of a query even when the field being sorted by is indexed.

  1. It also means Postgres only has to read each database page from disk only once, whereas an index scan may have to read a page multiple times. This can reduce the total amount of I/O that needs to be performed.