Published: Sun 05 August 2018 tags: postgresql

How expensive is a join?

It depends! It depends on what the join criteria are, what indexes are present, how big the tables are, whether the relations are cached, what hardware is being used, which configuration parameters are set, whether statistics are up to date, and what other activity is happening on the system, to name a few things.

But we can still try to get a feel for what a couple of simple scenarios look like, and see what happens when:

- The number of tables being joined increases
- The number of rows in those tables increases
- Indexes are present / not present

My inspiration is a question that comes up over and over again when designing the schema for an upcoming feature. For example, say we have an existing table of products and we want to add the notion of a "status" for each product: "Active", "Discontinued", "Recalled", etc. Do we:

1. Add a status_id column to the product table that references a new status table
2. Add a status_id column to the product table and let the app define the mapping of what each status_id means
3. Add a status column of type text to the product table

I usually argue for the first option. Arguments for options two and three usually revolve around two points: concerns over the performance of the join, and developer ergonomics. Ergonomics is a matter of taste, but we can certainly measure the join performance.
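For concreteness, here's what option one looks like. The table and column names are hypothetical, and I'm using Python's built-in sqlite3 only so the sketch is self-contained and runnable; the DDL is essentially the same in PostgreSQL:

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE status (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        status_id INTEGER NOT NULL REFERENCES status (id)
    );
    INSERT INTO status (id, name)
        VALUES (1, 'Active'), (2, 'Discontinued'), (3, 'Recalled');
    INSERT INTO product (name, status_id)
        VALUES ('Widget', 1), ('Gadget', 2);
""")

# The join a lookup table implies: a single equi-join against a small table.
rows = conn.execute("""
    SELECT product.name, status.name
    FROM product
    JOIN status ON status.id = product.status_id
    ORDER BY product.id
""").fetchall()
print(rows)  # [('Widget', 'Active'), ('Gadget', 'Discontinued')]
```

That one equi-join on a tiny lookup table is the cost in question, so that's the shape of join we'll benchmark.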

We'll go nuts and really put PostgreSQL to the test. For some straightforward equi-joins - the kind a lookup table would be doing - let's see what happens to performance when the number of joins goes through the roof. How many is too many?

Here's how we'll be creating our tables for testing. Scroll down to the charts if you just want to see the results.

```sql
DROP FUNCTION IF EXISTS create_tables(integer, integer, boolean);

CREATE FUNCTION create_tables(num_tables integer, num_rows integer, create_indexes boolean)
RETURNS void AS
$function_text$
BEGIN

-- There's no table before the first one, so this one's a little different.
-- Create it here instead of in our loop.
DROP TABLE IF EXISTS table_1 CASCADE;

CREATE TABLE table_1 (
    id serial primary key
);

-- Populate the first table
INSERT INTO table_1 (id)
SELECT
    nextval('table_1_id_seq')
FROM
    generate_series(1, num_rows);

-- Create and populate all the other tables
FOR i IN 2..num_tables LOOP
    EXECUTE 'DROP TABLE IF EXISTS table_' || i || ' CASCADE;';

    EXECUTE format($$
        CREATE TABLE table_%1$s (
            id serial primary key,
            table_%2$s_id integer references table_%2$s (id)
        );

        INSERT INTO table_%1$s (table_%2$s_id)
        SELECT
            id
        FROM
            table_%2$s
        ORDER BY
            random();
    $$, i, i - 1);

    IF create_indexes THEN
        EXECUTE 'CREATE INDEX ON table_' || i || ' (table_' || i - 1 || '_id);';
    END IF;
END LOOP;

END;
$function_text$ LANGUAGE plpgsql;

-- We'll want to make sure PostgreSQL has an idea of what's in these tables
DROP FUNCTION IF EXISTS analyze_tables(integer);

CREATE FUNCTION analyze_tables(num_tables integer)
RETURNS void AS
$function_text$
BEGIN

FOR i IN 1..num_tables LOOP
    EXECUTE 'ANALYZE table_' || i || ';';
END LOOP;

END;
$function_text$ LANGUAGE plpgsql;
```

Here's create_tables() in action...

```
SELECT create_tables(10, 10000, False);

SELECT * from table_1 limit 10;

 id
----
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
(10 rows)

SELECT * from table_2 limit 10;

 id | table_1_id
----+------------
  1 |        824
  2 |        973
  3 |        859
  4 |        789
  5 |        901
  6 |        112
  7 |        162
  8 |        212
  9 |        333
 10 |        577
(10 rows)
```

Cool, now we can create as many tables as we want.

We also need a way to query them to test our join performance. These are going to be some pretty long queries, and we don't want to write them by hand, so we'll make another function to write them for us. We'll tell it how many tables to join together and the maximum id of the first table to filter on in the WHERE clause.

```sql
DROP FUNCTION IF EXISTS get_query(integer, integer);

CREATE FUNCTION get_query(num_tables integer, max_id integer)
RETURNS text AS
$function_text$
DECLARE
    first_part text;
    second_part text;
    third_part text;
    where_clause text;
BEGIN

first_part := $query$
SELECT
    count(*)
FROM
    table_1 AS t1 INNER JOIN$query$;

second_part := '';

FOR i IN 2..num_tables - 1 LOOP
    second_part := second_part || format($query$
    table_%1$s AS t%1$s ON
        t%2$s.id = t%1$s.table_%2$s_id INNER JOIN$query$, i, i - 1);
END LOOP;

third_part := format($query$
    table_%1$s AS t%1$s ON
        t%2$s.id = t%1$s.table_%2$s_id
WHERE
    t1.id <= %3$s$query$, num_tables, num_tables - 1, max_id);

RETURN first_part || second_part || third_part || ';';

END;
$function_text$ LANGUAGE plpgsql;
```

Here's an example of the query it produces.

```
SELECT get_query(5, 10);

                  get_query
--------------------------------------------
 SELECT                                     +
     count(*)                               +
 FROM                                       +
     table_1 AS t1 INNER JOIN               +
     table_2 AS t2 ON                       +
         t1.id = t2.table_1_id INNER JOIN   +
     table_3 AS t3 ON                       +
         t2.id = t3.table_2_id INNER JOIN   +
     table_4 AS t4 ON                       +
         t3.id = t4.table_3_id INNER JOIN   +
     table_5 AS t5 ON                       +
         t4.id = t5.table_4_id              +
 WHERE                                      +
     t1.id <= 10;
(1 row)

Time: 1.404 ms
```

Great. Let's take a minute to consider what we're asking PostgreSQL to do when we run this query: how many rows in table_5 have table_4_id values that are in table_4, which have table_3_id values that are in table_3, which have table_2_id values that are in table_2, which have table_1_id values that are in table_1 and are <= 10?
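That chain of lookups can be modeled outside the database. Here's a toy Python simulation of the same five-table chain (100 rows instead of 10,000, and dicts instead of tables), just to make the semantics concrete:

```python
import random

random.seed(0)  # arbitrary; the count below doesn't depend on the shuffle
num_rows = 100

# table_1 is just a set of ids; every other "table" maps its own serial id
# to an id in the previous table, mimicking create_tables() on a small scale.
table_1 = set(range(1, num_rows + 1))

def make_table(prev_ids):
    """One row per previous-table id, inserted in shuffled order."""
    shuffled = random.sample(sorted(prev_ids), len(prev_ids))
    return {new_id: prev_id for new_id, prev_id in enumerate(shuffled, start=1)}

table_2 = make_table(table_1)
table_3 = make_table(table_2)
table_4 = make_table(table_3)
table_5 = make_table(table_4)

# Follow each table_5 row's chain of foreign keys back to table_1,
# counting the ones that land on an id <= 10 -- exactly what the join asks.
count = 0
for t4_id in table_5.values():
    t1_id = table_2[table_3[table_4[t4_id]]]
    count += t1_id <= 10
print(count)  # 10: every chain is one-to-one, so exactly 10 chains survive
```

Because each generated table has exactly one row per row of the previous table, the whole chain is one-to-one, which is why the count always comes out to 10.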

Let's go ahead and run it...

```
SELECT
    count(*)
FROM
    table_1 AS t1 INNER JOIN
    table_2 AS t2 ON
        t1.id = t2.table_1_id INNER JOIN
    table_3 AS t3 ON
        t2.id = t3.table_2_id INNER JOIN
    table_4 AS t4 ON
        t3.id = t4.table_3_id INNER JOIN
    table_5 AS t5 ON
        t4.id = t5.table_4_id
WHERE
    t1.id <= 10;

 count
-------
    10
(1 row)

Time: 40.494 ms
```

We can throw an EXPLAIN ANALYZE on it and see what it's doing.

```
EXPLAIN ANALYZE
SELECT
    count(*)
FROM
    table_1 AS t1 INNER JOIN
    table_2 AS t2 ON
        t1.id = t2.table_1_id INNER JOIN
    table_3 AS t3 ON
        t2.id = t3.table_2_id INNER JOIN
    table_4 AS t4 ON
        t3.id = t4.table_3_id INNER JOIN
    table_5 AS t5 ON
        t4.id = t5.table_4_id
WHERE
    t1.id <= 10;

                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=827.93..827.94 rows=1 width=8) (actual time=43.392..43.392 rows=1 loops=1)
   ->  Hash Join  (cost=645.31..827.90 rows=9 width=0) (actual time=35.221..43.353 rows=10 loops=1)
         Hash Cond: (t5.table_4_id = t4.id)
         ->  Seq Scan on table_5 t5  (cost=0.00..145.00 rows=10000 width=4) (actual time=0.024..3.984 rows=10000 loops=1)
         ->  Hash  (cost=645.20..645.20 rows=9 width=4) (actual time=34.421..34.421 rows=10 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 9kB
               ->  Hash Join  (cost=462.61..645.20 rows=9 width=4) (actual time=25.281..34.357 rows=10 loops=1)
                     Hash Cond: (t4.table_3_id = t3.id)
                     ->  Seq Scan on table_4 t4  (cost=0.00..145.00 rows=10000 width=8) (actual time=0.022..4.828 rows=10000 loops=1)
                     ->  Hash  (cost=462.50..462.50 rows=9 width=4) (actual time=23.519..23.519 rows=10 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Hash Join  (cost=279.91..462.50 rows=9 width=4) (actual time=12.617..23.453 rows=10 loops=1)
                                 Hash Cond: (t3.table_2_id = t2.id)
                                 ->  Seq Scan on table_3 t3  (cost=0.00..145.00 rows=10000 width=8) (actual time=0.017..5.065 rows=10000 loops=1)
                                 ->  Hash  (cost=279.80..279.80 rows=9 width=4) (actual time=12.221..12.221 rows=10 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                       ->  Hash Join  (cost=8.55..279.80 rows=9 width=4) (actual time=0.293..12.177 rows=10 loops=1)
                                             Hash Cond: (t2.table_1_id = t1.id)
                                             ->  Seq Scan on table_2 t2  (cost=0.00..145.00 rows=10000 width=8) (actual time=0.017..5.407 rows=10000 loops=1)
                                             ->  Hash  (cost=8.44..8.44 rows=9 width=4) (actual time=0.054..0.054 rows=10 loops=1)
                                                   Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                   ->  Index Only Scan using table_1_pkey on table_1 t1  (cost=0.29..8.44 rows=9 width=4) (actual time=0.024..0.035 rows=10 loops=1)
                                                         Index Cond: (id <= 10)
                                                         Heap Fetches: 10
 Planning time: 1.659 ms
 Execution time: 43.585 ms
(26 rows)
```

A bunch of sequential scans except for the primary key index on table_1. And what else could it do? We didn't opt to create any indexes.

If we redo this experiment and tell create_tables() to create indexes...

```sql
SELECT create_tables(10, 10000, True);
```

And re-run our query, we get a different plan:

```
EXPLAIN ANALYZE
SELECT
    count(*)
FROM
    table_1 AS t1 INNER JOIN
    table_2 AS t2 ON
        t1.id = t2.table_1_id INNER JOIN
    table_3 AS t3 ON
        t2.id = t3.table_2_id INNER JOIN
    table_4 AS t4 ON
        t3.id = t4.table_3_id INNER JOIN
    table_5 AS t5 ON
        t4.id = t5.table_4_id
WHERE
    t1.id <= 10;

                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=88.52..88.53 rows=1 width=8) (actual time=0.411..0.411 rows=1 loops=1)
   ->  Nested Loop  (cost=1.43..88.50 rows=9 width=0) (actual time=0.067..0.399 rows=10 loops=1)
         ->  Nested Loop  (cost=1.14..85.42 rows=9 width=4) (actual time=0.054..0.304 rows=10 loops=1)
               ->  Nested Loop  (cost=0.86..82.34 rows=9 width=4) (actual time=0.043..0.214 rows=10 loops=1)
                     ->  Nested Loop  (cost=0.57..79.25 rows=9 width=4) (actual time=0.032..0.113 rows=10 loops=1)
                           ->  Index Only Scan using table_1_pkey on table_1 t1  (cost=0.29..8.44 rows=9 width=4) (actual time=0.015..0.023 rows=10 loops=1)
                                 Index Cond: (id <= 10)
                                 Heap Fetches: 10
                           ->  Index Scan using table_2_table_1_id_idx on table_2 t2  (cost=0.29..7.86 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=10)
                                 Index Cond: (table_1_id = t1.id)
                     ->  Index Scan using table_3_table_2_id_idx on table_3 t3  (cost=0.29..0.33 rows=1 width=8) (actual time=0.008..0.008 rows=1 loops=10)
                           Index Cond: (table_2_id = t2.id)
               ->  Index Scan using table_4_table_3_id_idx on table_4 t4  (cost=0.29..0.33 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=10)
                     Index Cond: (table_3_id = t3.id)
         ->  Index Only Scan using table_5_table_4_id_idx on table_5 t5  (cost=0.29..0.33 rows=1 width=4) (actual time=0.007..0.008 rows=1 loops=10)
               Index Cond: (table_4_id = t4.id)
               Heap Fetches: 10
 Planning time: 2.287 ms
 Execution time: 0.546 ms
(19 rows)
```

Sure enough, indexes are being used and it's faster. Nice to see that things are working as expected. We're now ready to string a bunch of these together and see what happens when the number of tables and rows grows.

These tests were performed on an AWS RDS db.m4.large instance. This is the cheapest instance that doesn't have "burstable" performance, which could get in the way of our benchmarks. We perform each query 10 times and take the average.
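The timing loop itself is nothing exotic; here's the shape of it in Python, where `run_query` is a hypothetical stand-in for however you execute the SQL against the server:

```python
import time

def average_runtime(run_query, repetitions=10):
    """Run a query `repetitions` times and return the mean wall-clock seconds."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()  # stand-in for executing the SQL against the server
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)

# Example with a trivial stand-in "query":
avg = average_runtime(lambda: sum(range(1000)))
print(f"average runtime: {avg:.6f} s")
```

A fairer benchmark might also discard the first (cold-cache) run, but for these charts a plain average over ten runs is enough.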

First up is a run that tests queries with the number of tables joined ranging from 2 to 200, three different settings for the number of rows, and no indexes.

Pretty good performance. More importantly, we can now calculate the cost of each additional join! For tables with 1000 rows, here's how much each additional table increases the execution time:

| Number of tables | Average increase per additional table (ms) |
|------------------|--------------------------------------------|
| 2 to 50          | (0.012738 - 0.000327) / (50 - 2) = 0.259   |
| 50 to 100        | (0.0353395 - 0.012738) / (100 - 50) = 0.452 |
| 100 to 150       | (0.0762056 - 0.0353395) / (150 - 100) = 0.817 |
| 150 to 200       | (0.1211591 - 0.0762056) / (200 - 150) = 0.899 |

(The measured averages are in seconds; the per-table slope is converted to milliseconds.)
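Those "average increase" figures are just the slope between two measured points, converted from seconds to milliseconds. A quick sketch of the arithmetic, using the measurements from the 1,000-row, no-index run:

```python
def per_join_increase_ms(t_low_s, t_high_s, n_low, n_high):
    """Slope between two measured points: extra milliseconds per additional table."""
    return (t_high_s - t_low_s) / (n_high - n_low) * 1000

# Measured average runtimes in seconds, from the table above.
print(round(per_join_increase_ms(0.000327, 0.012738, 2, 50), 3))     # 0.259
print(round(per_join_increase_ms(0.012738, 0.0353395, 50, 100), 3))  # 0.452
```

The same two-point slope is used for all the per-join cost tables below.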

Even as the number of tables being joined approaches 200, each additional table increases the execution time by less than a millisecond.

When talking about a table for storing product statuses, 1,000 rows is probably overkill. So even without giving a thought to indexing, adding a reference table probably isn't going to kill performance.

But since we've got this benchmark machinery handy, what about larger tables? Here's the same test, but with tables containing up to a million rows each. They only go up to 50 tables being joined this time, since it takes a while to run and I have a limited RDS budget / patience.

Pretty clear separation here. Looking at the largest case of 1,000,000-row tables, it's a ~93 ms increase in time for every additional table joined, whether we're adding the 5th or the 50th table.

| Number of tables | Average increase per additional table (ms) |
|------------------|--------------------------------------------|
| 2 to 10          | (0.8428495 - 0.0924571) / (10 - 2) = 93.799 |
| 10 to 20         | (1.781959 - 0.8428495) / (20 - 10) = 93.911 |
| 20 to 30         | (2.708342 - 1.781959) / (30 - 20) = 92.638  |
| 30 to 40         | (3.649164 - 2.708342) / (40 - 30) = 94.082  |
| 40 to 50         | (4.565644 - 3.649164) / (50 - 40) = 91.648  |

This has all been sequential scans so far, so let's add some indexes and see how that changes things.

A pretty huge impact. Interestingly, when indexes are used it almost doesn't matter how many rows are in the tables: the times are all clustered more or less together. The run with 100,000-row tables is usually the slowest, but not always.

| Number of tables | Average increase per additional table (ms) |
|------------------|--------------------------------------------|
| 2 to 50          | (0.0119917 - 0.000265) / (50 - 2) = 0.244   |
| 50 to 100        | (0.035345 - 0.0119917) / (100 - 50) = 0.467 |
| 100 to 150       | (0.0759236 - 0.035345) / (150 - 100) = 0.811 |
| 150 to 200       | (0.1378461 - 0.0759236) / (200 - 150) = 1.238 |

Even with a query that already joins 150 tables of 100,000 rows each, adding another table only adds about 1.2 ms. Cool!

In this last run, I couldn't sit through the time it takes to generate 200 1M row tables and their indexes, but I still wanted to see how tables that big would perform. So back down to 50 tables we go...

| Number of tables | Average increase per additional table (ms) |
|------------------|--------------------------------------------|
| 2 to 10          | (0.0016811 - 0.000276) / (10 - 2) = 0.176   |
| 10 to 20         | (0.003771 - 0.0016811) / (20 - 10) = 0.209  |
| 20 to 30         | (0.0062328 - 0.003771) / (30 - 20) = 0.246  |
| 30 to 40         | (0.0088621 - 0.0062328) / (40 - 30) = 0.263 |
| 40 to 50         | (0.0120818 - 0.0088621) / (50 - 40) = 0.322 |

Based on the improvement we saw with indexes previously, it's not too much of a surprise that we get great performance with our million row tables as well. But still, joining 50 tables with 1M rows each, in just 12ms. Wow!

This is great, and maybe that extra join operation is cheaper than we thought. But one thing we also need to consider is that while each additional join operation may be cheap in itself, more tables mean more possible plans for the query planner to consider, making it harder to find the best one. For example, when the number of joins goes beyond geqo_threshold, which is 12 by default, PostgreSQL stops considering all possible query plans and switches to a genetic algorithm instead. This can change the query plan and negatively affect performance.
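For a rough sense of why exhaustive planning becomes impractical: just the left-deep join orders of n tables number n!. This is a back-of-the-envelope count, not the planner's actual search-space formula, but the growth rate is the point:

```python
import math

# Left-deep join orders alone grow factorially with the number of tables
# (PostgreSQL's real search space is shaped differently, but grows similarly).
for n in (2, 5, 12, 13):
    print(n, math.factorial(n))  # at n = 12 we're already near half a billion
```

By the default geqo_threshold of 12 there are hundreds of millions of orderings to weigh, so trading plan optimality for planning time starts to make sense.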

So your mileage may vary, and it's important to test your queries on your data. But we can see that an additional join certainly can be quite cheap, and isn't necessarily a good reason to avoid normalizing your data.