SQL Recipes Calculations Per Group Calculating Top N items and Aggregating (sum) the remainder into "All other"

Calculating Top N items and Aggregating (sum) the remainder into "All Other"

The top / bottom N problem is often seen when ranking items in your database. We've previously written about the ranking problem in Calculating Top N items per Group.

Another requirement that co-exists with this question is to rollup all the other rows into a row called "All other". The reason for doing this is to provide more context when presenting data.

Example data

Consider a table with two columns: a salesperson's name and the amount of business they've closed:

Name Sales Alice 8405837 Bob 3884307 Chris 2718782 Dan 2195914 Eve 1553165 Fae 1513367 George 1409019 Hadi 1355896 Ida 1257676 Justine 998537 ... ...

1. Calculating simple top N items

Our first question is to list the top 10 salespeople by business closed, which can be accomplished with the following query. For more sophistication with window functions, see Calculating Top N items per Group:

select name, sales from deals order by 3 desc limit 10

name sales Alice 8405837 Bob 3884307 Chris 2718782 Dan 2195914 Eve 1553165 Fae 1513367 George 1409019 Hadi 1355896 Ida 1257676 Justine 998537

2. Aggregating into "All Other"

If we were to visualize the top salespeople as a pie chart, it'd look something like this:

The problem with the above chart is that we are missing data on how much business the other salespeople closed to put the visualization in context. We'll attempt to solve this question now.

What we are trying to accomplish is a final row appended to the results above that is the aggregation of the rest of the data, something like this:

name sales Alice 8405837 Bob 3884307 Chris 2718782 ... ... Justine 998537 All Other 7833544

The following query accomplishes that:

with top10 as (select name, sales from deals order by 2 desc limit 10) select * from top10 union all select 'All other' as name, sum(sales) as sales from sales where name not in (select name from top10)

How it works

We've organized the query into two queries and combined them with the union keyword. The first query is as before that finds the top 10 salespeople. The second builds upon this by excluding names that were previously found with the not in keyword. We've additionally "refactored" the core query as a CTE using the with keyword.