Let's start with getting the same results as you did. Instead of snapchat data, I've just generated test data using generate_series (see Appendix below). I'm running on 9.3.2 and was able to get the same executionplans as you posted with the scripts mentioned in the Appendix. Now, lets look at your questions: Why does PG do a Bitmap Heap Scan (not Index Only Scan?) First of all, you seem to expect to get the Bitmap Index Scan operation without Bitmap Heap Scan . Although that sounds logical if you'd expect an index only scan, it's not logical if you keep in mind what these Bitmap operations try to accomplish. The Bitmap Index+Heap Scan operations are an optimization of the regular Index Scan: Instead of accessing the Heap right after fetching a row from the index, the Bitmap Index Scan completes the index lookup first, keeping track of all rows that might be interesting (in a, you guess it, bitmap). The Bitmap Heap Scan than accesses all the interesting heap pages sequentially. For that, it first sorts the bitmap. The aim is to reduce the random IO by using a little memory and CPU for the bitmap. The Bitmap Index+Heap Scan's make only sense if you access many rows. Further the Bitmap itself makes only sense if you intend to fetch data from the heap. Consequently, the Bitmap operations are not the right tool to implement index-only scans. Only the Index Only Scan operation actually implements index-only scans. So, why doesn't it do an Index Only Scan? Simple: Because you didn't VACUUM ANALYZE the table: vacuum analyze records; Right after that, the execution plan looks like this: Sort (cost=360661.70..360662.81 rows=445 width=11) (actual time=3869.051..3869.160 rows=445 loops=1) Sort Key: ((SubPlan 1)) Sort Method: quicksort Memory: 34kB -> Seq Scan on areacodes (cost=0.00..360642.12 rows=445 width=11) (actual time=8.336..3868.565 rows=445 loops=1) SubPlan 1 -> Aggregate (cost=810.40..810.41 rows=1 width=0) (actual time=8.690..8.691 rows=1 loops=445) -> Index Only Scan using records_on_phone on records (cost=0.44..753.30 rows=22843 width=0) (actual time=0.141..7.064 rows=6470 loops=445) Index Cond: ((phone > ((areacodes.areacode)::text || '0000000'::text)) AND (phone < ((areacodes.areacode)::text || '9999999'::text))) Heap Fetches: 0 Total runtime: 3869.311 ms The reason this works is two-fold: * Analyze - Analyzing the table after filling it provides correct statistics to the optimizer to make it's estimates. * Vacuum - Here it comes: PostgreSQL's Index-Only Scan only works on vacuumed tables. If you don't vacuum, even the Index Only Scan operation must visit the heap to check if the row is visible. This is just a limitation how PostgreSQL implements Indexes (without visibility information) and how they made Index Only Scans working nevertheless—if vacuumed ;) Now let's continue with questions you didn't ask: But what about the original query? Considering that you have an index on phone wouldn't it be possible to use this index to avoid the sort operation? Unfortunately not, because you are not sorting on phone but on substring(phone, 0,4) . Although both expressions yield the same order, the optimizer doesn't know this and can therefore not use the index to avoid the sort operation. But that means nothing more than we need to create an index that has the order the group by needs — an expression index: create index phone_on_ac on records (substring(phone, 0,4)); Now, the index order is in line with the order required by the sort operation and the execution plan looks like this: -- unchanged, still doing sort -- Hmm, let's analyze the table again, just because that should be done when adding expression indexes. Now the execution plan looks like this: Sort (cost=111548.20..111550.43 rows=894 width=47) (actual time=6904.681..6904.902 rows=900 loops=1) Sort Key: (count(*)) Sort Method: quicksort Memory: 52kB -> Hash Left Join (cost=111471.97..111504.38 rows=894 width=47) (actual time=6902.616..6903.989 rows=900 loops=1) Hash Cond: (("substring"((records.phone)::text, 0, 4)) = (ac.areacode)::text) -> HashAggregate (cost=111458.96..111470.13 rows=894 width=8) (actual time=6902.041..6902.670 rows=900 loops=1) -> Seq Scan on records (cost=0.00..88615.82 rows=4568626 width=8) (actual time=0.024..3825.362 rows=4568628 loops=1) -> Hash (cost=7.45..7.45 rows=445 width=11) (actual time=0.548..0.548 rows=445 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 14kB -> Seq Scan on areacodes ac (cost=0.00..7.45 rows=445 width=11) (actual time=0.012..0.224 rows=445 loops=1) Total runtime: 6905.132 ms Although the new index isn't used, the sort operation has gone. Instead it uses a HashAggregation (was previously GroupAggregate). Here is the fun part: the pure presence of the index (and new stats!) has influenced the execution plan and execution speed, even though the index wasn't used. But as a matter of fact, the statistics about the index were used during optimization. Note the following difference between the execution plans: GroupAggregate (cost=781375.40..872747.92 *rows=4568626* width=8) (actual ... rows=900) HashAggregate (cost=111458.96..111470.13 *rows=894* width=8) (actual ... rows=900) Without the index, and it's statistics, the optimize estimated that the group by operation will yield 4.5 million rows (essentially not grouping anything). However, as explain analyze shows in the actual part, it yields only 900 rows. However, after creating the index (and stats!) the optimizer expects 894 rows from the group by. In this case, the HashAggregate operation is more favorable. HashAggregate doesn't need to sort at all. Wanna see the statistic that makes all the difference? select * from pg_statistic join pg_class on ( pg_statistic.starelid = pg_class.oid and pg_statistic.staattnum=1 -- should come from pg_attribute, lazy me ) where relname='phone_on_ac'; Note how the result of this query doesn't change by just creating the index, but by analyze after creating the index. This problems might be more nicely solvable (just adding stats without the useless index) by a feature like Cross-Column Statistics. However, this feature doesn't exist yet, so it is hard to predict if it would be able to solve this problem too. But why the heck does it still not use the index? Here comes a very unfortunate answer: PostgreSQL, as of 9.3, can't use expression indexes for index only scans (ref). That means, the query planner can vary only these options * SeqScan vs. Index Only Scan on the full phone index * HashAggragate vs. Sort+GroupAggregate. Obviously, HashAggregate is the better option. Further, a SeqScan does sequential IO while a Index Only Scan does scattered IO - hence the SeqScan wins (especially because it cannot benefit from the index order!). But how fast would it be, if the expression index could be used with GroupAggregate? To answer the question how fast it could possibly get, I've added the areacode to the records table itself. Now we can create a straight index on areacode and try to use it with GroupAggregate without sort + Index Only Scan. create index records_ac on records(areacode); select ac.name, ac_cnt.areacode, ac_cnt.count from (select areacode as areacode, count(*) as count from records group by areacode) as ac_cnt left join areacodes as ac on ac_cnt.areacode = ac.areacode order by ac_cnt.count desc; And it still does a SeqScan: Sort (cost=100124.88..100127.13 rows=899 width=19) (actual time=4742.925..4743.157 rows=900 loops=1) Sort Key: (count(*)) Sort Method: quicksort Memory: 52kB -> Hash Left Join (cost=100050.43..100080.77 rows=899 width=19) (actual time=4740.851..4742.264 rows=900 loops=1) Hash Cond: ((records.areacode)::text = (ac.areacode)::text) -> HashAggregate (cost=100037.42..100046.41 rows=899 width=4) (actual time=4740.279..4740.889 rows=900 loops=1) -> Seq Scan on records (cost=0.00..77194.28 rows=4568628 width=4) (actual time=0.043..1441.143 rows=4568628 loops=1) -> Hash (cost=7.45..7.45 rows=445 width=11) (actual time=0.547..0.547 rows=445 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 14kB -> Seq Scan on areacodes ac (cost=0.00..7.45 rows=445 width=11) (actual time=0.012..0.222 rows=445 loops=1) Total runtime: 4743.384 ms Obviously for the same reason as before (sequential IO vs. random IO). So let's "disable" SeqScan just to see the IOS: set enable_seqscan = off; Sort (cost=10000131572.45..10000131574.70 rows=899 width=19) (actual time=3508.713..3508.932 rows=900 loops=1) Sort Key: (count(*)) Sort Method: quicksort Memory: 52kB -> Hash Left Join (cost=10000000013.45..10000131528.35 rows=899 width=19) (actual time=5.158..3507.918 rows=900 loops=1) Hash Cond: ((records.areacode)::text = (ac.areacode)::text) -> GroupAggregate (cost=0.43..131493.98 rows=899 width=4) (actual time=4.620..3506.353 rows=900 loops=1) -> Index Only Scan using records_ac on records (cost=0.43..108641.85 rows=4568628 width=4) (actual time=0.050..1841.752 rows=4568628 loops=1) Heap Fetches: 0 -> Hash (cost=10000000007.45..10000000007.45 rows=445 width=11) (actual time=0.531..0.531 rows=445 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 14kB -> Seq Scan on areacodes ac (cost=10000000000.00..10000000007.45 rows=445 width=11) (actual time=0.012..0.231 rows=445 loops=1) Total runtime: 3509.155 ms So, we have finally got what we expected: IOS + GroupAggregate (not needing any memory, because of the pre-sorted data = pipelined execution). The execution time is the best we've seen so far. Note that disabling the SeqScan is just done for demo purposes. In Production you would probably need to fiddle with the costing parameters. Please also not that a part of this speedup is also because the substring expression is not evaluated during execution anymore. However, this is the performance you could get if PG would be able to use an expression index for IOS. Conclusion Although PG can currently not use an expression index for IOS, the statistics for the expression index are still useful for the optimizer to chose the HashAggregate over Sort+GroupAggregate. Once taken tha HashAgg performance is way better (7s). The proper IOS on an expression index would cut this execution time down to the half again. Your alternative query using the sub-select in the select list might still be an option but doesn't consider record s without areacode mapping. If it is faster, than only because you miss many areacode mappings. Appendix Creating the test data create table records ( phone varchar(255) not null, areacode varchar(255) not null, -- for later demo username varchar(255) not null ); insert into records select gen, substring(gen::text, 0,4), 'user ' || gen from generate_series('1234567', '78901234', 17) gen; create table areacodes ( areacode varchar(255) not null, name varchar(255) not null ); insert into areacodes select gen, 'ac ' || gen from generate_series(123, 567) gen; create index records_on_phone on records (phone); Execution plan with original query Sort (cost=1908248.88..1919670.45 rows=4568626 width=47) (actual time=51699.907..51700.126 rows=900 loops=1) Sort Key: (count(*)) Sort Method: quicksort Memory: 52kB -> Hash Left Join (cost=781388.42..981265.80 rows=4568626 width=47) (actual time=45865.108..51699.041 rows=900 loops=1) Hash Cond: (("substring"((records.phone)::text, 0, 4)) = (ac.areacode)::text) -> GroupAggregate (cost=781375.40..872747.92 rows=4568626 width=8) (actual time=45864.545..51697.405 rows=900 loops=1) -> Sort (cost=781375.40..792796.97 rows=4568626 width=8) (actual time=45858.621..49677.266 rows=4568628 loops=1) Sort Key: ("substring"((records.phone)::text, 0, 4)) Sort Method: external merge Disk: 62496kB -> Seq Scan on records (cost=0.00..88615.82 rows=4568626 width=8) (actual time=0.513..3902.085 rows=4568628 loops=1) -> Hash (cost=7.45..7.45 rows=445 width=11) (actual time=0.542..0.542 rows=445 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 14kB -> Seq Scan on areacodes ac (cost=0.00..7.45 rows=445 width=11) (actual time=0.013..0.230 rows=445 loops=1) Total runtime: 51712.268 ms Execution plan with sub-query Sort (cost=13981009.24..13981010.35 rows=445 width=11) (actual time=4352.651..4352.759 rows=445 loops=1) Sort Key: ((SubPlan 1)) Sort Method: quicksort Memory: 34kB -> Seq Scan on areacodes (cost=0.00..13980989.67 rows=445 width=11) (actual time=9.274..4351.934 rows=445 loops=1) SubPlan 1 -> Aggregate (cost=31417.93..31417.94 rows=1 width=0) (actual time=9.775..9.776 rows=1 loops=445) -> Bitmap Heap Scan on records (cost=530.58..31360.82 rows=22843 width=0) (actual time=5.561..8.145 rows=6470 loops=445) Recheck Cond: (((phone)::text > ((areacodes.areacode)::text || '0000000'::text)) AND ((phone)::text < ((areacodes.areacode)::text || '9999999'::text))) -> Bitmap Index Scan on records_on_phone (cost=0.00..524.87 rows=22843 width=0) (actual time=5.543..5.543 rows=6470 loops=445) Index Cond: (((phone)::text > ((areacodes.areacode)::text || '0000000'::text)) AND ((phone)::text < ((areacodes.areacode)::text || '9999999'::text))) Total runtime: 4352.912 ms