I was just reviewing some old code written for pre-8.4 PostgreSQL, and I saw something really nifty. I remember having a custom function do some of this back in the day, but I forgot what pre- array_agg() looked like. For review, modern aggregation is written like this.

SELECT array_agg(x ORDER BY x DESC) FROM foobar;

However, once upon a time, it was written like this,

SELECT ARRAY(SELECT x FROM foobar ORDER BY x DESC);

So, I tried it with some test data..

CREATE TEMP TABLE foobar AS SELECT * FROM generate_series(1,1e7) AS t(x);

The results were surprising.. The #OldSchoolCool way was massively faster: a 25% speedup. Moreover, simplifying it without the ORDER, showed the same slowness.

# EXPLAIN ANALYZE SELECT ARRAY(SELECT x FROM foobar); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- Result (cost=104425.28..104425.29 rows=1 width=0) (actual time=1665.948..1665.949 rows=1 loops=1) InitPlan 1 (returns $0) -> Seq Scan on foobar (cost=0.00..104425.28 rows=6017728 width=32) (actual time=0.032..716.793 rows=10000000 loops=1) Planning time: 0.068 ms Execution time: 1671.482 ms (5 rows) test=# EXPLAIN ANALYZE SELECT array_agg(x) FROM foobar; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=119469.60..119469.61 rows=1 width=32) (actual time=2155.154..2155.154 rows=1 loops=1) -> Seq Scan on foobar (cost=0.00..104425.28 rows=6017728 width=32) (actual time=0.031..717.831 rows=10000000 loops=1) Planning time: 0.054 ms Execution time: 2174.753 ms (4 rows)

So, what's going on here. Why is array_agg, an internal function so much slower than the planner's SQL voodoo?

Using "PostgreSQL 9.5.5 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005, 64-bit"